Data-Driven News Generation for Automated Journalism

Leo Leppänen, Myriam Munezero, Mark Granroth-Wilding, Hannu Toivonen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


Despite increasing amounts of data and ever-improving natural language generation techniques, work on automated journalism is still relatively scarce. In this paper, we explore the field and the challenges associated with building a journalistic natural language generation system. We present a set of requirements that should guide system design, including transparency, accuracy, modifiability and transferability. Guided by these requirements, we present a data-driven architecture for automated journalism that is largely domain- and language-independent. We illustrate its practical application by producing news articles on user request, in three languages, about the 2017 Finnish municipal elections, demonstrating the success of the design's data-driven, modular approach. We then draw some lessons for future automated journalism.
Original language: English
Title of host publication: The 10th International Natural Language Generation conference, Proceedings of the Conference
Number of pages: 10
Place of Publication: Stroudsburg
Publisher: The Association for Computational Linguistics
Publication date: 4 Sept 2017
ISBN (Print): 978-1-945626-52-4
Publication status: Published - 4 Sept 2017
MoE publication type: A4 Article in conference proceedings
Event: International Conference on Natural Language Generation - Tilburg, Netherlands
Duration: 5 Nov 2018 - 8 Nov 2018
Conference number: 11

Fields of Science

  • 113 Computer and information sciences
  • natural language generation
  • natural language processing
  • news automation
  • computational creativity
