News Coverage


Identifying and Investigating Global News Coverage of Critical Events Such as Disasters and Terrorist Attacks

Cai, Erica, Chen, Xi, Keeney, Reagan Grey, Zuckerman, Ethan, O'Connor, Brendan, Grabowicz, Przemyslaw A.

arXiv.org Artificial Intelligence

Comparative studies of news coverage are challenging to conduct because methods to identify news articles about the same event in different languages require expertise that is difficult to scale. We introduce an AI-powered method for identifying news articles based on an event "fingerprint," which is a minimal set of metadata required to identify critical events. Our event coverage identification method, Fingerprint to Article Matching for Events (FAME), efficiently identifies news articles about critical world events, specifically terrorist attacks and several types of natural disasters. FAME does not require training data and is able to automatically and efficiently identify news articles that discuss an event given its fingerprint: time, location, and class (such as storm or flood). The method achieves state-of-the-art performance and scales to massive databases of tens of millions of news articles and hundreds of events happening globally. We use FAME to identify 27,441 articles that cover 470 natural disaster and terrorist attack events that happened in 2020. To this end, we use a massive database of news articles in three languages from MediaCloud, and three widely used, expert-curated databases of critical events: EM-DAT, USGS, and GTD. Our case study reveals patterns consistent with prior literature: coverage of disasters and terrorist attacks correlates with death counts, with the GDP of the country where the event occurs, and with trade volume between the reporting country and the country where the event occurred. We share our NLP annotations and cross-country media attention data to support the efforts of researchers and media monitoring organizations.
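The fingerprint idea above can be illustrated with a minimal sketch: given an event's time, location, and class, filter a corpus for articles published shortly after the event that mention both the place and an event-class keyword. The field names, the date window, and the keyword lists are assumptions for illustration; the actual FAME method is multilingual and considerably more sophisticated.

```python
from datetime import date

def matches_fingerprint(article, fingerprint, window_days=7):
    """Toy check of whether an article plausibly covers the event
    described by a fingerprint (time, location, class)."""
    # Time: the article must appear within a short window after the event.
    delta = (article["published"] - fingerprint["date"]).days
    if not (0 <= delta <= window_days):
        return False
    text = article["text"].lower()
    # Location: the event's place must be mentioned in the article.
    if fingerprint["location"].lower() not in text:
        return False
    # Class: at least one keyword for the event type must appear.
    return any(kw in text for kw in fingerprint["class_keywords"])

# Hypothetical fingerprint and articles (toy data).
event = {
    "date": date(2020, 1, 12),
    "location": "Taal",
    "class_keywords": ["volcano", "eruption", "ashfall"],
}
articles = [
    {"published": date(2020, 1, 13),
     "text": "The Taal volcano eruption forced thousands to evacuate."},
    {"published": date(2020, 3, 2),
     "text": "Officials revisited Taal months after the eruption."},
]
hits = [a for a in articles if matches_fingerprint(a, event)]
```

Here only the first article matches: the second mentions the location and class but falls outside the time window, which is what makes the three-part fingerprint more discriminative than keyword search alone.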


The Hype Index: an NLP-driven Measure of Market News Attention

Cao, Zheng, Wunkaew, Wanchaloem, Geman, Helyette

arXiv.org Artificial Intelligence

Natural Language Processing (NLP) has become an increasingly powerful tool in finance, transforming how researchers and practitioners extract predictive signals from unstructured text. With the rise of real-time news feeds and scalable NLP models, media content now plays a central role in market forecasting, risk management, and behavioral analysis. This paper contributes to that growing body of literature by introducing a novel framework for measuring media-driven attention in equities: the Hype Index. Our approach begins with the construction of a News Count-Based Hype Index, which quantifies the relative media exposure of each stock or sector by calculating its share of daily financial news coverage within the S&P 100 universe. This measure captures how disproportionately a given asset appears in financial media, independent of its economic footprint. To address size-related bias and better isolate disproportionate attention, we introduce the Capitalization Adjusted Hype Index. Defined as the ratio of a stock's or sector's news count weight to its market capitalization weight within its peer cluster, this adjusted index reflects deviations from a benchmark of proportionality. In doing so, it highlights assets that receive media attention in excess of what would be expected based on their economic size.
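The two indices described in the abstract reduce to simple ratios, sketched below on toy data. The function name, the flat (unsmoothed) normalization, and the example figures are assumptions; the paper's exact weighting and clustering scheme may differ.

```python
def hype_indices(news_counts, market_caps):
    """Compute both hype indices for a peer cluster of assets.

    news_counts: daily article counts per ticker.
    market_caps: market capitalizations per ticker (same units for all).
    """
    total_news = sum(news_counts.values())
    total_cap = sum(market_caps.values())
    raw, adjusted = {}, {}
    for ticker in news_counts:
        # News Count-Based Hype Index: share of the cluster's coverage.
        news_share = news_counts[ticker] / total_news
        # Capitalization Adjusted Hype Index: news weight / cap weight.
        cap_share = market_caps[ticker] / total_cap
        raw[ticker] = news_share
        adjusted[ticker] = news_share / cap_share
    return raw, adjusted

counts = {"AAA": 60, "BBB": 40}        # toy daily article counts
caps = {"AAA": 3000.0, "BBB": 1000.0}  # toy market caps
raw, adj = hype_indices(counts, caps)
# AAA: news share 0.6 vs cap weight 0.75 -> adjusted 0.8 (under-covered)
# BBB: news share 0.4 vs cap weight 0.25 -> adjusted 1.6 (over-hyped)
```

An adjusted value above 1 flags an asset receiving media attention in excess of its economic size, which is exactly the deviation from proportionality the abstract describes.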


DAVID MARCUS: Public broadcasting's purpose has passed. It's time to pull the plug

FOX News

Rep. Brandon Gill, R-Texas, got into a heated exchange with CNN host Pamela Brown over the Trump administration's crackdown on government spending, specifically for public broadcasting at PBS and NPR. By 1970, both PBS and NPR had sprung forth from the CPB, and Americans were treated to the "News Hour," "Sesame Street," British comedies and science programming at a time when there were only three networks, cable TV was strictly for the boondocks, and VCRs were science fiction. A big part of the reason that programming was limited was that production costs for broadcasting were incredibly high. In David Grzybowski's book, 'The Big Story,' he cites Philadelphia news anchor Larry Kane talking about how hard it was during the 1979 Three Mile Island nuclear scare to just get a live TV shot from Harrisburg to Philly: "I know we had a live microwave, but the microwaves didn't go that far. I think we sought some satellite time. The satellite times in those days were $5,000 a minute."


A Multilingual Similarity Dataset for News Article Frame

Chen, Xi, Samory, Mattia, Hale, Scott, Jurgens, David, Grabowicz, Przemyslaw A.

arXiv.org Artificial Intelligence

Understanding the writing frame of news articles is vital for addressing social issues, and has thus attracted notable attention in the field of communication studies. Yet assessing news article frames remains a challenge due to the absence of a concrete and unified standard dataset that captures the comprehensive nuances within news content. To address this gap, we introduce an extended version of a large labeled news article dataset with 16,687 new labeled pairs. By leveraging pairwise comparison of news articles, our method removes the need for manual identification of frame classes required in traditional news frame analysis studies. Overall, we introduce the most extensive cross-lingual news article similarity dataset available to date, with 26,555 labeled news article pairs across 10 languages. Each data point has been meticulously annotated according to a codebook detailing eight critical aspects of news content, under a human-in-the-loop framework. Application examples demonstrate its potential in unearthing country communities within global news coverage, exposing media bias among news outlets, and quantifying the factors related to news creation. We envision that this news similarity dataset will broaden our understanding of the media ecosystem in terms of news coverage of events and perspectives across countries, locations, languages, and other social constructs. By doing so, it can catalyze advancements in social science research and applied methodologies, thereby exerting a profound impact on our society.


A Diverse Multilingual News Headlines Dataset from Around the World

Leeb, Felix, Schölkopf, Bernhard

arXiv.org Artificial Intelligence

Babel Briefings is a novel dataset featuring 4.7 million news headlines from August 2020 to November 2021, across 30 languages and 54 locations worldwide, with English translations of all articles included. Designed for natural language processing and media studies, it serves as a high-quality dataset for training or evaluating language models, as well as a simple, accessible collection of articles for, e.g., analyzing global news coverage and cultural narratives. As a simple demonstration of the analyses this dataset facilitates, we use a TF-IDF-weighted similarity metric to group articles into clusters about the same event. We then visualize the "event signatures" of each event, showing which languages' articles appear over time, revealing intuitive features based on the proximity and unexpectedness of the event. The dataset is available on Kaggle (https://www.kaggle.com/datasets/felixludos/babel-briefings) and HuggingFace (https://huggingface.co/datasets/felixludos/babel-briefings), with accompanying code on GitHub (https://github.com/felixludos/babel-briefings).
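The TF-IDF grouping procedure mentioned above can be sketched in a few dozen lines: vectorize each headline with TF-IDF weights, compare vectors by cosine similarity, and greedily merge headlines that exceed a threshold. This is a toy re-implementation under assumed details (whitespace tokenization, unsmoothed idf, greedy single-link grouping); the dataset's actual procedure may differ in weighting and clustering choices.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF vectors (as sparse dicts) for tokenized documents."""
    n = len(docs)
    df = Counter()                      # document frequency per token
    for doc in docs:
        df.update(set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def cluster(headlines, threshold=0.15):
    """Greedy grouping: attach each headline to the first cluster whose
    representative (first member) is similar enough, else start anew."""
    docs = [h.lower().split() for h in headlines]
    vecs = tfidf_vectors(docs)
    clusters = []                       # list of (rep vector, member idxs)
    for i, v in enumerate(vecs):
        for rep, members in clusters:
            if cosine(rep, v) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((v, [i]))
    return [members for _, members in clusters]

groups = cluster([
    "taal volcano eruption forces evacuation in philippines",
    "philippines taal volcano eruption ash cloud",
    "new trade deal signed",
])
```

The first two headlines share enough distinctive tokens to land in one cluster, while the unrelated third starts its own; the resulting per-cluster language-over-time counts are what an "event signature" visualizes.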


Using artificial intelligence and archival news articles, this teen found that Black homicide victims were less humanized in news coverage

#artificialintelligence

Using artificial intelligence and archival news articles, a teenager in Northern Virginia created a program to measure media biases – and in researching older news articles, she found that Black homicide victims were less likely to be humanized in news coverage. Emily Ocasio, an 18-year-old from Falls Church, Virginia, created an AI program that analyzed FBI homicide records between 1976 and 1984 and their corresponding coverage published in The Boston Globe to determine whether victims were presented in a humanizing or impersonal way. After analyzing 5,042 entries, the results showed that Black men under the age of 18 were 30% less likely to receive humanizing coverage than their White counterparts, Ocasio told CNN. Black women were 23% less likely to be humanized in news stories, Ocasio added. A news article was considered humanizing when it mentioned additional information about the victim and presented them "as a person, not just a statistic," Ocasio said in her project presentation.


NELA-GT-2022: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles

Gruppi, Maurício, Horne, Benjamin D., Adalı, Sibel

arXiv.org Artificial Intelligence

In this paper, we present the fifth installment of the NELA-GT datasets, NELA-GT-2022. The dataset contains 1,778,361 articles from 361 outlets between January 1st, 2022 and December 31st, 2022. Just as in past releases of the dataset, NELA-GT-2022 includes outlet-level veracity labels from Media Bias/Fact Check and tweets embedded in collected news articles. The NELA-GT-2022 dataset can be found at: https://doi.org/10.7910/DVN/AMCV2H


Characterizing Financial Market Coverage using Artificial Intelligence

Tshimula, Jean Marie, Nkashama, D'Jeff K., Owusu, Patrick, Frappier, Marc, Tardif, Pierre-Martin, Kabanza, Froduald, Brun, Armelle, Patenaude, Jean-Marc, Wang, Shengrui, Chikhaoui, Belkacem

arXiv.org Artificial Intelligence

This paper scrutinizes a database of over 4,900 YouTube videos to characterize financial market coverage. Financial market coverage generates a large number of videos, so watching them all to derive actionable insights would be challenging and time-consuming. In this paper, we leverage Whisper, a speech-to-text model from OpenAI, to generate a text corpus of market coverage videos from Bloomberg and Yahoo Finance. We employ natural language processing to extract insights regarding language use from the market coverage. Moreover, we examine the prominent presence of trending topics and their evolution over time, and the impacts that some individuals and organizations have on the financial market. Our characterization highlights the dynamics of financial market coverage and provides valuable insights reflecting broad discussions regarding recent financial events and the world economy.


AI can reveal hidden bias in news media - Futurity

#artificialintelligence

You are free to share this article under the Attribution 4.0 International license. Artificial intelligence can help identify biases in news reporting that we wouldn't otherwise see, researchers report. For a new study, researchers got a computer program to generate news coverage of COVID-19 using headlines from Canadian Broadcasting Corporation (CBC) articles as prompts. They then compared the simulated news coverage to the actual reporting at the time. The findings show that CBC coverage was less focused on the medical emergency and more positively focused on personalities and geopolitics.


What AI-generated COVID news tells us that journalists don't

#artificialintelligence

AI can help identify biases in news reporting that we wouldn't otherwise see. Researchers from McGill University recently directed a computer program to generate news coverage of COVID-19 using headlines from CBC articles as prompts. They then compared the simulated news coverage to the actual reporting at the time and found that CBC coverage was less focused on the medical emergency and more positively focused on personalities and geopolitics. "Reporting on real-world events requires complex choices, including decisions about which events and players take center stage. By comparing what was reported with what could have been reported, our study provides perspective on the editorial choices made by news agencies," says Professor Andrew Piper of the Department of Languages, Literatures, and Cultures at McGill University.