Journalists vs robots - writing with AI

Think:Act Magazine Artificial Intelligence

March 29, 2018
Article

by Wolfgang Zehrt
Illustrations by Mungang Kim

If you think disruption of the media business is old news, think again. It's only just getting started, and the largest threat – or opportunity, depending on how you look at it – is looming within sight: robot writers.

Top journalists still in demand

Ask anybody to describe a journalist to you and you'll get as many answers as people you question. The stereotype fits a number of images and descriptions: There's the US cliché of a trench coat-wearing scribbler in a trilby hat with a press ticket in the band; then there's the hard-playing, hard-drinking hack of London's Fleet Street; and what about the fearless seekers of truth all over the world who are prepared to die – and sometimes do – to get their stories out? However tenacious they are, though, we all think of journalists as human beings, and that the role they play is a vital component in a functioning democracy: speaking truth to power and promoting freedom of thought as well as speech.

In recent years, however, there seems to be an ever-growing set of conditions that threaten to damage the fourth estate, the way journalism works and those sweating over their keyboards. The disruption of the business model with the advent of the internet and digital publishing was one such threat. The specter and accusation of "fake news" striking at the integrity and trustworthiness of the media was another. But added to these is a new and possibly much more mortal threat. It isn't so much about business or about how the news is delivered – online, print, mobile – but more about how the news is written. How it is produced. How it is generated.

Pens and data-flows: disruption is far from old news - for journalists, it's just begun.

According to a recent BBC news report, by 2022 some 90% of all news content will be written by "robots." Digitization and the rapidly increasing amount of data it has made readily available are enabling large parts of today's reporting to be created by a computer: the weather, football and stock markets have been the first areas in which "natural language generation" programs have been able to deliver good, readable stories.

Today, a computer at the Norwegian NTB news agency even writes a large proportion of its election reporting. That said, editor-in-chief Mads Yngve Storvik stresses that he can't see a robot being able to conduct an interview any time soon. At the US news agency The Associated Press, a computer already produces 10,000 economic and baseball reports every month. And under new owner Jeff Bezos, billionaire founder of Amazon, The Washington Post is rapidly developing a new content management system (CMS) that has put automated content generation at its heart right from the get-go.

Robot journalism will likely lead to thousands of media job losses around the world. However, it might not mean that all is lost for the journalists who can find the right way through. Investigative stories such as Panama Papers-style reports, or outstanding portraits and profiles – the kind of content that differentiates a publication from its competitors – could thrive in this new media age. No matter how good, a stock market report will never win any journalism prizes, whether it was written by a robot or a human. Leaving aside such specialist reporting, though, can the news media become fully automated? The dream might be for day-to-day, high-frequency and personalized news business to be handled by computers that never need to stop for breaks, but will it always need that special human touch? What about the questions of judgment and tone?

While machine learning already works well when it comes to replicating the style and voice of very specific media (tabloid, serious, B2B), artificial intelligence still finds it difficult to summarize the most relevant messages from documents and data if a human has not previously provided examples to explain what the key findings might be. Software has so far not had the world knowledge to realize, for example, that a rate drop of more than 3% in a day would normally be unusual for a stock market heavyweight such as a major auto manufacturer, but that it may well occur in conjunction with new revelations such as a diesel scandal. However, software can now look for a suitable quote from analysts on precisely this rate trend and can incorporate it perfectly, both in terms of content and language. This is something that would have been unthinkable just a year ago.
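The point about missing world knowledge can be made concrete with a small sketch. It assumes a human editor supplies a threshold per stock category, which is the kind of prior example the software cannot yet work out for itself. The category names, the thresholds and the function `flag_unusual_moves` are hypothetical and only illustrate the idea.

```python
# Hypothetical, human-supplied thresholds (daily change in percent) per
# stock category. The 3% figure for heavyweights follows the example in
# the text; the small-cap value is an invented placeholder.
THRESHOLDS = {"heavyweight": 3.0, "small_cap": 8.0}


def flag_unusual_moves(quotes, thresholds=THRESHOLDS):
    """Return the names of stocks whose daily move exceeds the
    threshold an editor has set for their category."""
    flagged = []
    for quote in quotes:
        limit = thresholds[quote["category"]]
        if abs(quote["change_pct"]) > limit:
            flagged.append(quote["name"])
    return flagged


# Invented sample data for illustration only.
quotes = [
    {"name": "AutoMaker", "category": "heavyweight", "change_pct": -3.4},
    {"name": "BigBank", "category": "heavyweight", "change_pct": -1.1},
    {"name": "TinyTech", "category": "small_cap", "change_pct": 5.0},
]
unusual = flag_unusual_moves(quotes)
```

A flagged move would then be the trigger for the step the text describes next: searching for a matching analyst quote to explain the trend.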

The robot journalist is getting ready for the next step in its evolution, but its human colleagues still have to tell it which subjects are worth writing about and what data should be used. That could soon change. It is already employing relatively simple-to-use algorithms to automatically work out topics that take into account the frequency of keywords in internet searches while cross-checking the potential theme against the intensity of discussion of the event on social media. Current topics that are shaping opinions can thus be identified with a great degree of reliability: The robot simply has to track down images from databases using the right keywords. Fully automated video content could also be created in this way.
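A minimal sketch of such topic detection might look like the following. It assumes two inputs that the text only describes in general terms: a count of search queries per keyword and a count of social media mentions per keyword. The function name `rank_topics`, the scoring formula and the sample figures are illustrative assumptions, not a description of any newsroom's actual system.

```python
def rank_topics(search_counts, social_mentions, min_mentions=1):
    """Rank candidate topics by combining search frequency with the
    intensity of social media discussion.

    Each signal is scaled by its maximum and the two are multiplied,
    so a topic only scores well if it is strong on both.
    """
    max_search = max(search_counts.values(), default=0) or 1
    max_social = max(social_mentions.values(), default=0) or 1
    scores = {}
    for topic, searches in search_counts.items():
        mentions = social_mentions.get(topic, 0)
        if mentions < min_mentions:
            # Cross-check failed: searched for, but not discussed.
            continue
        scores[topic] = (searches / max_search) * (mentions / max_social)
    return sorted(scores, key=scores.get, reverse=True)


# Invented sample counts for illustration only.
search_counts = {"election": 900, "weather": 1000, "recipe": 500}
social_mentions = {"election": 800, "weather": 100}
ranking = rank_topics(search_counts, social_mentions)
```

Multiplying the two signals is one simple design choice among many; a weighted sum or a trend-over-time measure would be equally plausible.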

Unexpected: Young adults unsettled in parts of economically strong Bavaria (robot-generated-article)

While youth unemployment throughout Bavaria is on an historically low level, it remains unchanged at a high 16% (August 2017) in Lower Bavaria (Munich: 2,9%). This is particularly noteworthy because Lower Bavaria is on a good second place regarding to employment rates in the state of Bavaria in general.

Although the start of the new training year in September will decrease the rate, the Lower Bavarians will still be in last place in youth unemployment. This, obviously, also affects the consumption of young adults.

While the research institute GfK expects a further increase in consumption expenditures for all of Germany, there are other signals from the young adults in Lower Bavaria. For example, cars defined as typical "beginner cars" remain an average of 9 weeks longer than the previous year at the used car dealer. The explanation cannot be that younger adults buy more new cars, because in the starter price class up to 16,000 Euros, the approval numbers in August have fallen by almost 20%.

A similar picture emerges in the demand for low-priced single-apartments: while comparable apartments have been offered on the market for an average of only 7 weeks in the last 5 years, now it is already 11 weeks. According to the latest official survey for 2014, the Lower Bavarians came to an annual income of just under 21,000 euros. In neighboring Oberbayern, household net income is significantly higher, at around EUR 25,000.

This part of Bavaria is particularly dependent on the automobile industry. At the Dingolfing-based BMW factory, some 18,000 employees produce up to 1,400 vehicles per day. With around 6,500 employees, the automotive supplier ZF Passau produces systems for the automotive industry, while the automotive supplier Dräxelmeier has 6,000 employees. At these factories, for example, job openings for job applicants have fallen by almost 30% in the past 12 months.

Lower Bavaria is losing an above-average number of young residents from some parts of the state, even though the population will remain stable overall. In the 19-25 age group, which is particularly important for employers, there is a clear warning sign: from today's 89.1 the proportion of these entrants will reduce to 72.7 out of 1,000 persons within next 10 years (Statistisches Landesamt). Only a few bigger towns in this part of Bavaria will remain unaffected by this development.

THIS ARTICLE WAS WRITTEN ENTIRELY BY AI AUTOMATION, without any human intervention or correction during or after the writing process. Human input involved finding data sources and instructing the machine how to analyze them. An analytical piece like this can now be written without supervision – additional data sources can be added simply to expand the scope of the article.

Content automation comprises powerful analysis and language diversification

There is another significant reason for developing fast new media content that is rapidly updated. Thomas Scialom, a researcher at the French natural language generating startup Recital, put his finger on it: "Mobile media use means that there are ever fewer visual aids to help readers understand content, and the amount of information is also limited by the size of the screen. At the same time, the time spent reading content on a mobile is also falling." Shorter information, written specifically in the tone of the target audience and with personalized, targeted content – that's not something a human writer can produce, but an AI journalist can. For example, why should someone read a report about all of the Nasdaq rates if they only need a report on trends for Apple shares? Could a graphic provide that information? Automated text generation can do far more: The report on Apple shares can not only summarize historic trends, but also provide rankings – is it only Apple shares that are under pressure today, or is the trend also affecting China's Tencent? Is Amazon on the up and Alibaba plummeting? Does it have something to do with the start of the vacation in the US or China? Has disappointing economic data been released? The more sources that are used, the better an article will be than a graphic.

Anglo-German business news agency dpa-AFX was one of the first to develop a template solution: It was simply a case of filling in the gaps in pre-written sentences with new data. The sentences provided today are much more varied and sophisticated, but this basic principle is only slowly being replaced. Hamburg-based computer linguist Patrick McCrae explains, "The perfect solution for content automation must consequently not only have extensive opportunities for language diversification, but also have the ability to incorporate powerful analysis. Text generation will not progress far beyond the dynamic filling-in of gaps in the texts unless artificial intelligence is involved. Truly interesting texts with diverse content will be created if surprising, non-trivial findings can be extracted from the relevant data sources. That's exactly why we need artificial intelligence." Here's a powerful example: One German digital publisher can generate a view of the monthly employment and training markets at the touch of a button for 411 regions across Germany, focusing on different professional groups or levels of education, if desired. The software makes discoveries in the mountain of data with each analytical pass that a human editor would have only spotted by chance, if at all.
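The template principle described above, filling gaps in pre-written sentences with new data, can be sketched in a few lines. This is an illustration only and not the dpa-AFX system: the sentence templates, the field names (`name`, `change_pct`, `price`) and the function `render_report` are all hypothetical. Offering several phrasings per situation gives a very basic form of language diversification.

```python
import random

# Hypothetical pre-written sentences with gaps, grouped by price direction.
TEMPLATES = {
    "up": [
        "{name} shares rose {change:.1f}% to close at {price:.2f} euros.",
        "{name} gained {change:.1f}% and finished the day at {price:.2f} euros.",
    ],
    "down": [
        "{name} shares fell {change:.1f}% to close at {price:.2f} euros.",
        "{name} lost {change:.1f}% and finished the day at {price:.2f} euros.",
    ],
    "flat": [
        "{name} shares were little changed at {price:.2f} euros.",
    ],
}


def render_report(record, rng):
    """Pick a template that matches the price direction and fill its gaps."""
    change = record["change_pct"]
    if change > 0.05:
        direction = "up"
    elif change < -0.05:
        direction = "down"
    else:
        direction = "flat"
    template = rng.choice(TEMPLATES[direction])
    # Unused keyword arguments are ignored by str.format, so the
    # "flat" template can simply omit the change figure.
    return template.format(
        name=record["name"], change=abs(change), price=record["price"]
    )


# Invented sample record; a seeded generator keeps the choice repeatable.
sentence = render_report(
    {"name": "Example AG", "change_pct": -3.2, "price": 41.5},
    random.Random(0),
)
```

What this sketch cannot do is the part McCrae calls essential: deciding which findings in the data are surprising enough to write about. The templates only phrase what a prior analysis step has already selected.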

Future scenario of texts in ongoing update processes

But the biggest challenge in this relatively new world of automated writing, and one that is causing even the likes of Google and IBM's Watson to thrash their disks, is free text creation written from flowing text sources, and not just from data stored or handled in a particular way. McCrae frames the problem clearly: "Automated text comprehension without any limitations in terms of the subject is a problem that computer science has yet to solve."

Yet if the problem were to be solved, it would deliver a great prize: a kind of perpetual media motion where new texts could be created with a seemingly endless output of flowing text. It could be the salvation for news agencies, which could start with one single text created by software and then turn it into 120 different versions for 120 newspaper customers. But will software be capable of understanding flowing text, including irony, sarcasm or annotations? Scialom sums up the problem quite simply: "The biggest challenge is currently getting machines to understand unstructured data." Once this challenge has been met, only a small band of journalists will be relevant, likely writing highly specialized and unique contributions for which readers around the world will be willing to pay a decent price. We will just have to wait and see which publications will have survived for them to write for.
