How to produce popular stories in multiple languages -#BBCNewsHack
Pia, a 19-year-old from Nuuk in Greenland, has had five abortions in the last two years. Her story is proving a captivating case study for journalists (above), as well as for officials from the country of 17,000 people who are looking for ways to curb unwanted pregnancies.
Greenland currently records more abortions than births each year: 800 to 700 respectively. The Netherlands, by contrast, has one of the lowest abortion rates in the world.
That this story could pique interest around the world may not be surprising in a networked society where “Oh-My-God” reactions mean a story is never far from going viral. It was featured online on the BBC World Service in Turkish, Hausa and Arabic, to name a few languages, and according to data from Telescope (analytical software used within the BBC), the story travelled.
It was widely read: 419,401 page views in Turkish, 81,346 in Hausa and 29,370 in Arabic. The figures raise interesting questions. They also suggest that, whilst English is still the most widespread language in the world, local languages rule online in markets where English is secondary or not spoken at all.
There’s an economic and soft power imperative in appealing to consumers who appreciate being catered for. However, just because a story has the potential to be popular, does that mean it needs to be translated or even adapted to its new territory, particularly if it runs the risk of being interpreted as heresy?
To access a story like Pia’s in English, a non-English speaker would need translation software, such as Google Translate, which may not convert the story accurately. Lexical and syntactic structures differ between languages, and nuances such as metaphors may be lost, so the result may not capture the essence of what was originally reported.
Issues of Translating
In a highly competitive news market, if the CNN meme in the 1990s for story production was “kill what you can eat”, whereby an item was produced for multiple production outlets (a practice that is now commonplace), today the motto is “recycle what you want to eat”.
But there are snags. Adapting stories into different languages can be expensive and time-consuming, and not many media outlets can afford it. Also, as a publisher, how do you know in the first place that a story may be popular in other languages? And could designing an ergonomically suitable user interface, one that lets consumers access multiple languages in one fell swoop, bring tangible benefits?
These and a range of other issues were among the problems addressed by six teams of five people, made up of designers, journalists and coders, who came together at the BBC Hackathon entitled ‘Tools for multilingual newsrooms’.
The event was opened by the BBC’s Director of Digital Development, James Montgomery, who set the tone by asking the following questions, which the teams would answer in different ways:
- How do you identify, in time, stories which are about to trend?
- How do you identify World Service languages which have not reported on the story?
- How do you alert an editor of those services with a summary?
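The second question can be pictured as a simple set difference: take the languages a newsroom publishes in and subtract those that already carry the story. The snippet below is only a minimal sketch of that idea, not any team's solution. The language codes and the `missing_languages` helper are assumptions made for illustration, and the real World Service list is much longer.

```python
# Illustrative subset of language codes (an assumption, not the full World Service list).
WORLD_SERVICE_LANGUAGES = {"ar", "en", "ha", "sw", "tr", "ur"}


def missing_languages(published, all_languages=WORLD_SERVICE_LANGUAGES):
    """Return the sorted language codes that have no version of the story yet."""
    return sorted(set(all_languages) - set(published))


# Pia's story was available in English, Turkish, Hausa and Arabic.
published = {"en", "tr", "ha", "ar"}
print(missing_languages(published))  # ['sw', 'ur']
```

An editor's alert could then be built from that list, one summary per missing language service.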
Team Connected Stories Approach
To tell Pia’s story for a multilingual audience at present requires a classical approach. An editor identifies a story based on editorial judgement and instinctive nous. Their hunch might be that the story has legs in different languages, so the editor draws the attention of journalists from the relevant language departments. Then, using, say, a template, the story is manually translated from one language to another.
On the other hand, editorial teams could be working independently, so there’s no central repository to indicate that the story has been translated into multiple languages. Tags and metadata might yield results that correlate multiple sources.
Team Connected Stories, with the aid of the BBC’s Digi Hub editor Hernando Alvarez, found various versions of Pia’s story over two days. But that’s the cheat. The team were told it was popular, so they just hunted for the different languages, a task that at best required speaking each language in the first place.
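One way to picture how tags might correlate versions of a story is to compare the tag sets of articles in different languages and keep those that overlap strongly. The sketch below uses Jaccard similarity (shared tags divided by all tags). It assumes, hypothetically, that every language service tags from a shared, language-neutral vocabulary. The article records, ids and the 0.5 threshold are invented for illustration.

```python
def jaccard(a, b):
    """Overlap between two tag collections: 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_versions(story, candidates, threshold=0.5):
    """Return ids of candidates in other languages whose tags overlap enough."""
    return [
        c["id"]
        for c in candidates
        if c["language"] != story["language"]
        and jaccard(story["tags"], c["tags"]) >= threshold
    ]


# Hypothetical records; real metadata would come from each service's CMS.
story = {"id": "en-1", "language": "en",
         "tags": {"greenland", "abortion", "nuuk", "health"}}
candidates = [
    {"id": "tr-1", "language": "tr", "tags": {"greenland", "abortion", "nuuk"}},
    {"id": "ha-1", "language": "ha", "tags": {"greenland", "abortion", "nuuk", "health"}},
    {"id": "ar-1", "language": "ar", "tags": {"football", "transfer"}},
]
print(find_versions(story, candidates))  # ['tr-1', 'ha-1']
```

If tags are written in each service's own language, they would need mapping to common identifiers before a comparison like this could work.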
Could there be another way, starting from first principles? Could it be possible to sample any story and assess whether it could be popular? And if it were the case, could you build a public facing interface that showed the various languages?
After direction from the BBC Connected Team, and following an agile prototype workflow, Kristine, who became de facto project manager, came up with this after brainstorming with colleagues.
Sean, a visual journalist from the BBC, converted this into the diagram below, which illustrates how a story in different languages could be recognised using, say, a public interface (globe) and metadata or keyword extraction. In this situation journalists could have been working independently of one another, but once a story was recognised as doing well it would be pulled into a story template.
Sean’s mock-up of the interface for the story as a prototype was this, with a drop-down arrow for more languages.
A further mock up using a different story illustrated how multiple languages could be used in video.
The central question in determining popularity is the secret sauce. From consulting with Andy from the BBC News Lab, a solution may not be far off in the future; it’s a matter of plumbing.
Using named entities, concept themes and the BBC archive, a combination of machine learning and a neural network could determine whether a story would be popular or not.
This would relieve the burden on editors. Furthermore, these stories could be passed through enhanced translation software. One of the Hackathon teams, from Edinburgh, prototyped what robust language software of this kind might look like.
The diagram for predicting a story’s popularity looks something like the one below. The higher the score (between 0 and 1), the greater the chance of popularity. As an app, this could sit on a phone. Imagine that: the opportunity to identify a story’s virality and then publish it in multiple languages at the press of a button.
Of note, even with enhanced translation software, a human would still need to check the copy before publishing, which perhaps strengthens the counter-argument to the claim that machines will replace humans.
All the teams produced varying designs for the challenge, drawing praise from editors and participants. A summary of the solution approaches is below.
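To make the idea of a score between 0 and 1 concrete, here is a minimal sketch in which a few numeric signals are weighted, summed and squashed through a logistic function. This is a single logistic unit, the simplest building block of a neural network, standing in for whatever model might actually be used; it is not the BBC News Labs method. The feature names, weights and bias are invented for illustration, and a real system would learn them from archive data.

```python
import math

# Hypothetical weights; a real model would learn these from past stories.
WEIGHTS = {"entity_match": 1.2, "theme_match": 0.8, "archive_hits": 0.5}
BIAS = -1.5


def popularity_score(features):
    """Map feature values to a score strictly between 0 and 1."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


weak = popularity_score({"entity_match": 0.1})
strong = popularity_score({"entity_match": 1.0, "theme_match": 1.0, "archive_hits": 2.0})
print(weak < strong)  # True: stronger signals give a higher score
```

An editor-facing app could then flag stories whose score passes a chosen cut-off, with the cut-off itself left as an editorial decision.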
Previous Hack can be read here
Dr David Dunkley Gyimah was part of Team Connected Stories. He’s a Knight Batten winner for innovation in journalism, the first Brit to win this award. He’s been a journalist for thirty years, including at the BBC, Channel 4 News and dotcoms, and has taught for seventeen, more recently setting up the journalism LAB. He’s a senior lecturer at Cardiff Jomec. He publishes Viewmagazine.tv. Follow him @viewmagazine. More on David here