
Addressing Challenges of Modern News Agencies via Predictive Modeling, Deep Learning, and Transfer Learning

Today's news agencies are moving from traditional journalism, where publishing just a few news articles per day was sufficient, to modern content generation mechanisms, which produce thousands of news pieces every day.

With the growth of these modern news agencies comes the arduous task of properly handling the massive amount of data generated for each news article.

Therefore, news agencies are constantly seeking solutions to facilitate and automate tasks that were previously done by humans.

In this dissertation, we provide solutions for two broad problems, which help a news agency not only gain a wider view of reader behaviour around an article but also give editors automated tools that ease the job of summarizing news articles.

These two disjoint problems aim to improve the users' reading experience by helping content generators monitor and focus on poorly performing content while allowing them to promote well-performing content.

We first focus on the task of popularity prediction of news articles via a combination of regression, classification, and clustering models.

We next focus on the problem of generating automated text summaries for long news articles using deep learning models.

The first problem aims to help content developers understand how a news article performs over the long run, while the second provides automated tools for content developers to generate summaries for each news article.

Doctor of Philosophy

Nowadays, each person is exposed to an immense amount of information from social media, blog posts, and online news portals. Among these sources, news agencies are one of the main content providers for people around the world. Contemporary news agencies are moving from traditional journalism to modern techniques on several fronts. This is achieved either by building smart tools that track readers' reactions to a specific news article or by providing automated tools that help editors deliver higher-quality content to readers. These systems should not only scale well with the growth in readership but also be able to process ad hoc requests, precisely because most of the policies and decisions in these agencies are based on the results of these analytical tools. As part of this movement towards adopting new technologies for smart journalism, we have worked with The Washington Post on various problems, building tools for predicting the popularity of a news article and a model for automated text summarization. We develop a model that monitors each news article after its publication and predicts the number of views the article will receive within the next 24 hours. This model helps content creators not only promote potentially viral articles on the main page of the web portal or on social media, but also gives editors insight into potentially poorly performing articles so that they can edit the content of those articles for better exposure.

On the other hand, current news agencies generate more than a thousand news articles per day, and writing three to four summary sentences for each of these pieces will not only become infeasible in the near future but is also very expensive and time-consuming. Therefore, we also develop a separate model for automated text summarization, which generates summary sentences for a news article. Our model generates summaries by selecting the most salient sentences in the news article and paraphrasing them into shorter sentences that can serve as a summary of the entire document.

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/91910
Date: 22 July 2019
Creators: Keneshloo, Yaser
Contributors: Computer Science, Ramakrishnan, Naren, Reddy, Chandan K., Yao, Danfeng (Daphne), Prakash, B. Aditya, Han, Eui-Hong
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Detected Language: English
Type: Dissertation
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
