 news site


Publishers fear AI search summaries and chatbots mean 'end of traffic era'

The Guardian

Search traffic to news sites has already plunged by a third in one year, according to the Reuters Institute for the Study of Journalism. Media companies expect web traffic to their sites from online searches to plummet over the next three years, as AI summaries and chatbots change the way consumers use the internet. An overwhelming majority are also planning to encourage their journalists to behave more like YouTube and TikTok content creators this year, as short-form video and audio content continues to boom. The findings are drawn from a new report from the Reuters Institute for the Study of Journalism, which included the views of 280 media leaders from 51 countries.


Why Are Chatbots Parroting Russian Propaganda?

TIME - Tech

Welcome back to TIME's new twice-weekly newsletter about AI. Starting today, we'll be publishing these editions both as stories on Time.com and as emails. If you're reading this in your browser, why not subscribe to have the next one delivered straight to your inbox? What to Know: Why are chatbots parroting Russian disinformation? Over the last year, as chatbots have gained the ability to search the internet before providing an answer, the likelihood that they will share false information about specific topics in the news has gone up, according to new research by NewsGuard Technologies.


Context is Key(NMF): Modelling Topical Information Dynamics in Chinese Diaspora Media

arXiv.org Artificial Intelligence

Does the People's Republic of China (PRC) interfere with European elections through ethnic Chinese diaspora media? This question forms the basis of an ongoing research project exploring how PRC narratives about European elections are represented in Chinese diaspora media, and thus the objectives of PRC news media manipulation. In order to study diaspora media efficiently and at scale, it is necessary to use techniques derived from quantitative text analysis, such as topic modelling. In this paper, we present a pipeline for studying information dynamics in Chinese media. Firstly, we present KeyNMF, a new approach to static and dynamic topic modelling using transformer-based contextual embedding models. We provide benchmark evaluations to demonstrate that our approach is competitive on a number of Chinese datasets and metrics. Secondly, we integrate KeyNMF with existing methods for describing information dynamics in complex systems. We apply this pipeline to data from five news sites, focusing on the period of time leading up to the 2024 European parliamentary elections. Our methods and results demonstrate the effectiveness of KeyNMF for studying information dynamics in Chinese media and lay groundwork for further work addressing the broader research questions.


Iran hackers target US officials to influence election, Microsoft says

The Guardian

Microsoft researchers said on Friday that Iran government-tied hackers tried breaking into the account of a "high-ranking official" on the US presidential campaign in June, weeks after breaching the account of a county-level US official. The breaches were part of Iranian groups' increasing attempts to influence the US presidential election in November, the researchers said in a report that did not provide any further detail on the apparent official in question. The report follows recent statements by senior US intelligence officials that they had seen Iran ramp up use of clandestine social media accounts with the aim to use them to try to sow political discord in the US. The report also reveals how Russia and China are exploiting US political polarization to advance their own divisive messaging in a consequential election year. Iran's mission to the UN in New York told Reuters in a statement that its cyber capabilities were "defensive and proportionate to the threats it faces" and that it had no plans to launch cyber-attacks.


FNDaaS: Content-agnostic Detection of Fake News sites

arXiv.org Artificial Intelligence

Automatic fake news detection is a challenging problem in misinformation spreading, and it has tremendous real-world political and social impacts. Past studies have proposed machine learning-based methods for detecting such fake news, focusing on different properties of the published news articles, such as linguistic characteristics of the actual content, which however have limitations due to the apparent language barriers. Departing from such efforts, we propose FNDaaS, the first automatic, content-agnostic fake news detection method, that considers new and unstudied features such as network and structural characteristics per news website. This method can be enforced as-a-Service, either at the ISP-side for easier scalability and maintenance, or user-side for better end-user privacy. We demonstrate the efficacy of our method using data crawled from existing lists of 637 fake and 1183 real news websites, and by building and testing a proof of concept system that materializes our proposal. Our analysis of data collected from these websites shows that the vast majority of fake news domains are very young and appear to have lower time periods of an IP associated with their domain than real news ones. By conducting various experiments with machine learning classifiers, we demonstrate that FNDaaS can achieve an AUC score of up to 0.967 on past sites, and up to 77-92% accuracy on newly-flagged ones.
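The AUC figure quoted above can be read as the probability that a randomly chosen fake site receives a higher classifier score than a randomly chosen real site. The sketch below computes it by direct pairwise comparison. The labels and scores are invented toy values for illustration only, and the function name `auc` is hypothetical; nothing here is taken from the FNDaaS paper or its data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for fake sites, 0 for real sites (toy convention, an assumption).
    scores: classifier scores, higher meaning "more likely fake".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one fake and one real example")
    # A fake/real pair counts 1 if the fake site scores higher, 0.5 on a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data: three fake and three real sites with made-up scores.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 1.0 would mean every fake site outranks every real one, while 0.5 corresponds to chance.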


GREENER: Graph Neural Networks for News Media Profiling

arXiv.org Artificial Intelligence

We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias. This is an important but under-studied problem related to disinformation and "fake news" detection, but it addresses the issue at a coarser granularity compared to looking at an individual article or an individual claim. This is useful as it allows entire media outlets to be profiled in advance. Unlike previous work, which has focused primarily on text (e.g., on the text of the articles published by the target website, or on the textual description in their social media profiles or in Wikipedia), here our main focus is on modeling the similarity between media outlets based on the overlap of their audience. This is motivated by homophily considerations, i.e., the tendency of people to have connections to people with similar interests, which we extend to media, hypothesizing that similar types of media would be read by similar kinds of users. In particular, we propose GREENER (GRaph nEural nEtwork for News mEdia pRofiling), a model that builds a graph of inter-media connections based on their audience overlap, and then uses graph neural networks to represent each medium. We find that such representations are quite useful for predicting the factuality and the bias of news media outlets, yielding improvements over state-of-the-art results reported on two datasets. When augmented with conventionally used representations obtained from news articles, Twitter, YouTube, Facebook, and Wikipedia, prediction accuracy is found to improve by 2.5-27 macro-F1 points for the two tasks.
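The graph-building step described above can be sketched roughly as follows. This is a minimal illustration, not the GREENER implementation: the abstract does not say how audience overlap is measured, so Jaccard similarity, the `threshold` value, the outlet names, and the user IDs are all assumptions, and the graph neural network stage is omitted.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Share of the combined audience that two outlets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_overlap_graph(audiences: dict, threshold: float = 0.2) -> dict:
    """Return weighted edges between outlets whose audiences overlap enough.

    audiences maps an outlet name to the set of user IDs that read it.
    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    edges = {}
    for m1, m2 in combinations(sorted(audiences), 2):
        weight = jaccard(audiences[m1], audiences[m2])
        if weight >= threshold:
            edges[(m1, m2)] = weight
    return edges


# Toy audiences: outlet_a and outlet_b share three of five distinct readers.
audiences = {
    "outlet_a": {"u1", "u2", "u3", "u4"},
    "outlet_b": {"u2", "u3", "u4", "u5"},
    "outlet_c": {"u6", "u7"},
}
graph = build_overlap_graph(audiences)
```

In this toy setup only `outlet_a` and `outlet_b` end up connected; a graph neural network would then learn a representation for each outlet from such edges.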


An AI Twitter bot that only tweets good news, with Python and GPT2

#artificialintelligence

Running AI these days is increasingly simple due to the hard work of open source contributors producing top-notch libraries, and research groups opening up their work so others can build on it. One key library doing that is HuggingFace's Transformers library. HuggingFace are a startup building, amongst other NLP-related products, a library and model ecosystem that allows almost anyone to quickly and easily set up AI-powered chat bots that can consume or produce natural language. In this post, I'll demonstrate how I used this library to produce a Twitter bot that only tweets made-up (and slightly quirky) good news. This blog post isn't meant to explain any theory, but for those who aren't familiar, the easiest way to explain this kind of AI is that these models are sophisticated pattern recognition systems. If you feed one enough data, it can build up an ability to recognize the patterns in the English language, to the extent that if you ask it to repeat the pattern, not only will it generate mostly correct English grammar, it might also from time to time generate a coherent sentence!


Identifying Sponsored Content in News Sites With Machine Learning

#artificialintelligence

Researchers from the Netherlands have developed a new machine learning method that's capable of distinguishing sponsored or otherwise paid content within news platforms, to an accuracy of more than 90%, in response to growing interest from advertisers in 'native' advertising formats that are difficult to distinguish from 'real' journalistic output. The new paper, titled Distinguishing Commercial from Editorial Content in News, comes from researchers at Leiden University. The authors observe that though more serious publications, which can more easily dictate terms to advertisers, will make a reasonable effort to distinguish 'partner content' from the general run of news and analysis, the standards are slowly but inexorably shifting to increased integration between editorial and commercial teams on an outlet, which they consider an alarming and negative trend. 'The ability to disguise content, willingly or unwillingly, and the probability that advertorials are not recognized as such even if properly labelled is significant. Marketers call it native [advertising] for a reason.'


News_headlines_web_scrapper

#artificialintelligence

Hey guys! This blog is about a mini project for scraping content from websites. I hope you enjoy this article. We are using Beautiful Soup to scrape the text data. Beautiful Soup is a Python package for parsing HTML and XML documents. It creates a parse tree for parsed pages that can be used to extract data from HTML, which is useful for web scraping.
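As a rough idea of what that looks like in practice, here is a minimal sketch that pulls headline text out of a page with Beautiful Soup. It assumes the `bs4` package is installed and parses an inline HTML string instead of fetching a live site; the markup, the `headline` class name, and the `extract_headlines` helper are all made up for illustration and are not from the original project.

```python
from bs4 import BeautifulSoup

# Made-up sample page; a real scraper would download this HTML first.
SAMPLE_HTML = """
<html><body>
  <h2 class="headline"><a href="/a">First headline</a></h2>
  <h2 class="headline"><a href="/b">  Second headline </a></h2>
  <p>Not a headline</p>
</body></html>
"""


def extract_headlines(html: str) -> list:
    """Return the text of every <h2 class="headline"> element."""
    # "html.parser" is the parser bundled with Python's standard library.
    soup = BeautifulSoup(html, "html.parser")
    tags = soup.find_all("h2", class_="headline")
    # get_text(strip=True) drops the surrounding whitespace.
    return [tag.get_text(strip=True) for tag in tags]


headlines = extract_headlines(SAMPLE_HTML)
print(headlines)
```

On a real news site the tag and class names will differ, so inspect the page source first and adjust the `find_all` arguments to match.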


Microsoft's robot editor confuses mixed-race Little Mix singers

The Guardian

Microsoft's decision to replace human journalists with robots has backfired, after the tech company's artificial intelligence software illustrated a news story about racism with a photo of the wrong mixed-race member of the band Little Mix. A week after the Guardian revealed plans to fire the human editors who run MSN.com and replace them with Microsoft's artificial intelligence code, an early rollout of the software resulted in a story about the singer Jade Thirlwall's personal reflections on racism being illustrated with a picture of her fellow band member Leigh-Anne Pinnock. Thirlwall, who attended a recent Black Lives Matter protest in London, criticised MSN on Friday, saying she was sick of "ignorant" media making such mistakes. She posted on Instagram: "@MSN If you're going to copy and paste articles from other accurate media outlets, you might want to make sure you're using an image of the correct mixed race member of the group." "This shit happens to @leighannepinnock and I ALL THE TIME that it's become a running joke," she said.