AITopics

2508.11133

Country:

Europe > Monaco (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Government > Regional Government (1.00)
Media (0.93)
Leisure & Entertainment (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Goren, Yaniv, Cohen, Yuval, Apartsin, Alexander, Aperstein, Yehudit

Beyond Words: Interjection Classification for Improved Human-Computer Interaction

arXiv.org Artificial Intelligence Sep-4-2025

In the realm of human-computer interaction, fostering a natural dialogue between humans and machines is paramount. A key, often overlooked, component of this dialogue is the use of interjections such as "mmm" and "hmm". Despite their frequent use to express agreement, hesitation, or requests for information, these interjections are typically dismissed as "non-words" by Automatic Speech Recognition (ASR) engines. Addressing this gap, we introduce a novel task dedicated to interjection classification, which is, to our knowledge, the first of its kind. This task is challenging due to the short duration of interjection signals and significant inter- and intra-speaker variability. In this work, we present and publish a dataset of interjection signals collected specifically for interjection classification. We employ this dataset to train and evaluate a baseline deep learning model. To enhance performance, we augment the training dataset using techniques such as tempo and pitch transformation, which significantly improve classification accuracy and make models more robust. The interjection dataset, a Python library for the augmentation pipeline, the baseline model, and the evaluation scripts are available to the research community.

artificial intelligence, machine learning, natural language, (16 more...)

2509.03181

Country:

Asia > Middle East > Israel (0.14)
Europe > Spain (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Media (0.46)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(3 more...)

da Fonseca, Henrique Correia, Fernandes, António, Song, Zhao, Cimpeanu, Theodor, Balabanova, Nataliya, Bashir, Adeela, Bova, Paolo, Buscemi, Alessio, Di Stefano, Alessandro, Duong, Manh Hong, Domingos, Elias Fernandez, Ogbo, Ndidi Bianca, Powers, Simon T., Proverbio, Daniele, Shamszaman, Zia Ush, Santos, Fernando P., Han, The Anh, Krellner, Marcus

Can Media Act as a Soft Regulator of Safe AI Development? A Game Theoretical Analysis

arXiv.org Artificial Intelligence Sep-4-2025

When developers of artificial intelligence (AI) products need to decide between profit and safety for the users, they are likely to choose profit. Untrustworthy AI technology must come packaged with tangible negative consequences. Here, we envisage those consequences as the loss of reputation caused by media coverage of their misdeeds, disseminated to the public. We explore whether media coverage has the potential to push AI creators into the production of safe products, enabling widespread adoption of AI technology. We created artificial populations of self-interested creators and users and studied them through the lens of evolutionary game theory. Our results reveal that media is indeed able to foster cooperation between creators and users, but not always. Cooperation does not evolve if the quality of the information provided by the media is not reliable enough, or if the costs of either accessing media or ensuring safety are too high. By shaping public perception and holding developers accountable, media emerges as a powerful soft regulator, guiding AI safety even in the absence of formal government oversight.

large language model, machine learning, natural language, (21 more...)

2509.0265

Country:

Europe (1.00)
North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Media > News (1.00)
Government (0.94)
Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.68)
(2 more...)

FOX News Sep-3-2025, 10:00:40 GMT

AI video tech fast-tracks humanoid robot training

One of the biggest hurdles in developing humanoid robots is the sheer amount of training data required. Teaching machines to act like humans demands massive video datasets. Collecting that data is expensive, time-consuming and difficult to scale.

artificial intelligence, robot, vidar, (10 more...)

FOX News

Country: Asia > China (0.05)

Industry:

Media > News (0.54)
Health & Medicine (0.52)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.68)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.43)

Slate Sep-3-2025, 01:30:00 GMT

It's Always Been Our Meanest Sci-Fi Franchise--and Our Most Honest

Alien: Earth begins where most Alien stories end: with a crew of blue-collar workers realizing that they are, and have always been, doomed. Deemed expendable by their employers over the monsters in the cargo hold (at least the crew of the USCSS Maginot, unlike the Nostromo, knew the monsters were the mission), they are made mortally aware of their place at the bottom of several food chains at once. With the FX show's fifth episode, cheekily titled "In Space, No One …," creator Noah Hawley takes us back to the Maginot's corridors to give viewers a rendition of Alien in miniature, retrofitting the sturdy bones of Ridley Scott's seminal film to his own ends. This may sound like a cynical enterprise, but it's par for the course for Alien. As Slate's own Sam Adams has noted, the series is Hollywood's greatest non-franchise, a collection of films (and comic books and video games) constantly remixing a few primary colors into compelling new shades.

alien, artificial intelligence, meanest sci-fi franchise, (14 more...)

Slate

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.49)
Media > Television (0.40)

Technology: Information Technology > Artificial Intelligence (0.50)

Al Jazeera Sep-3-2025, 00:29:11 GMT

Russia-Ukraine war: List of key events, day 1,287

Russian drone attacks and shelling killed three people and injured five others in Ukraine's Dnipropetrovsk region, Governor Serhiy Lysak wrote on Telegram. Two people were killed in Russian attacks on the Polohivskyi district, as Russian forces launched 578 attacks on 18 settlements in Ukraine's Zaporizhia region, Governor Ivan Fedorov said. Separate Russian attacks also killed one person in Kherson, one person in the Kyiv region and one person in Donetsk, local officials reported, according to the Kyiv Independent news outlet. A Ukrainian drone injured three people in the village of Proletarsky, in Russia's Belgorod region, Governor Vyacheslav Gladkov said. Russian forces seized the Ukrainian settlement of Fedorivka in the Donetsk region, Russian state news agency TASS reported, citing the Russian Ministry of Defence.

artificial intelligence, ministry, russia, (14 more...)

Al Jazeera

Country:

Asia > Russia (1.00)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.52)
Europe > Ukraine > Donetsk Oblast > Donetsk (0.52)
(3 more...)

Industry:

Media > News (0.99)
Government > Regional Government > Europe Government > Russia Government (0.84)
Government > Regional Government > Asia Government > Russia Government (0.84)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)

Truong, Quang-Trung, Wong, Yuk-Kwan, Dang, Vo Hoang Kim Tuyen, Gotama, Rinaldi, Nguyen, Duc Thanh, Yeung, Sai-Kit

MSC: A Marine Wildlife Video Dataset with Grounded Segmentation and Clip-Level Captioning

Marine videos present significant challenges for video understanding due to the dynamics of marine objects and the surrounding environment, camera motion, and the complexity of underwater scenes. Existing video captioning datasets, typically focused on generic or human-centric domains, often fail to generalize to the complexities of the marine environment and to yield insights into marine life. To address these limitations, we propose a two-stage marine object-oriented video captioning pipeline. We introduce a comprehensive video understanding benchmark that leverages the triplets of video, text, and segmentation masks to facilitate visual grounding and captioning, leading to improved marine video understanding and analysis, and marine video generation. Additionally, we highlight the effectiveness of video splitting in order to detect salient object transitions in scene changes, which significantly enriches the semantics of captioning content. Our dataset and code have been released at https://msc.hkustvgd.com.

large language model, machine learning, natural language, (14 more...)

doi: 10.1145/3746027.3758198

2508.04549

Country:

Asia (0.69)
North America > United States (0.28)

Genre: Research Report (0.40)

Industry: Media > Photography (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Deep Binding of Language Model Virtual Personas: a Study on Approximating Political Partisan Misperceptions

Kang, Minwoo, Moon, Suhong, Lee, Seung Hyeong, Raj, Ayush, Suh, Joseph, Chan, David M., Canny, John

Large language models (LLMs) are increasingly capable of simulating human behavior, offering cost-effective ways to estimate user responses to various surveys and polls. However, the questions in these surveys usually reflect socially understood attitudes: the patterns of attitudes of old/young, liberal/conservative, as understood by both members and non-members of those groups. It is not clear whether the LLM binding is deep, meaning the LLM answers as a member of a particular in-group would, or shallow, meaning the LLM responds as an out-group member believes an in-group member would. To explore this difference, we use questions that expose known in-group/out-group biases. This level of fidelity is critical for applying LLMs to various political science studies, including timely topics on polarization dynamics, inter-group conflict, and democratic backsliding. To this end, we propose a novel methodology for constructing virtual personas with synthetic user "backstories" generated as extended, multi-turn interview transcripts. This approach is justified by the theory of narrative identity, which argues that personality at the highest level is constructed from self-narratives. Our generated backstories are longer, rich in detail, and consistent in authentically describing a singular individual, compared to previous methods. We show that virtual personas conditioned on our backstories closely replicate human response distributions (up to an 87% improvement as measured by Wasserstein Distance) and produce effect sizes that closely match those observed in the original studies of in-group/out-group biases. Altogether, our work extends the applicability of LLMs beyond estimating socially understood responses, enabling their use in a broader range of human studies.
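The abstract reports its improvement in terms of Wasserstein Distance between human and persona response distributions but does not say how the distance is computed. As a rough illustration only, the sketch below computes the one-dimensional Wasserstein (earth mover's) distance between two answer histograms over the same ordered options. The function name `wasserstein_1d`, the unit spacing between options, and the example counts are all assumptions made for illustration, not details taken from the paper.

```python
from itertools import accumulate


def wasserstein_1d(p, q):
    """1-D Wasserstein distance between two discrete distributions.

    Assumes (hypothetically) that p and q are counts or weights over the
    same ordered answer options, spaced one unit apart.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same answer options")
    # Normalise counts or weights so each side sums to 1.
    p_total, q_total = sum(p), sum(q)
    p_cdf = list(accumulate(x / p_total for x in p))
    q_cdf = list(accumulate(x / q_total for x in q))
    # With unit spacing, the distance is the summed absolute gap between CDFs.
    return sum(abs(a - b) for a, b in zip(p_cdf, q_cdf))


# Made-up answer counts on a five-point scale (illustrative only).
human = [10, 20, 40, 20, 10]
persona = [5, 15, 40, 25, 15]
print(wasserstein_1d(human, persona))
```

A smaller value means the simulated answers sit closer to the human ones; identical histograms give a distance of zero.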

large language model, machine learning, natural language, (15 more...)

2504.11673

Country: North America > United States > California (0.67)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Personal > Interview (0.88)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Consumer Health (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Vîrlan, Mihnea-Alexandru, Smădu, Răzvan-Alexandru, Cercel, Dumitru-Clementin, Pop, Florin, Cercel, Mihaela-Claudia

SaRoHead: Detecting Satire in a Multi-Domain Romanian News Headline Dataset

The primary goal of a news headline is to summarize an event in as few words as possible. Depending on the media outlet, a headline can serve as a means to objectively deliver a summary or improve its visibility. For the latter, specific publications may employ stylistic approaches that incorporate the use of sarcasm, irony, and exaggeration, key elements of a satirical approach. As such, even the headline must reflect the tone of the satirical main content. Current approaches for the Romanian language tend to detect the non-conventional tone (i.e., satire and clickbait) of the news content by combining both the main article and the headline. Because we consider a headline to be merely a brief summary of the main article, we investigate in this paper the presence of satirical tone in headlines alone, testing multiple baselines ranging from standard machine learning algorithms to deep learning models. Our experiments show that Bidirectional Transformer models outperform both standard machine-learning approaches and Large Language Models (LLMs), particularly when the meta-learning Reptile approach is employed.

computational linguistic, large language model, machine learning, (20 more...)

2504.07612

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Minnesota (0.28)

Genre:

Overview (0.66)
Research Report > New Finding (0.47)

Industry: Media > News (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Krishnan, Aravind, Reddy, Siva, Mosbach, Marius

Not All Data Are Unlearned Equally

Machine unlearning is concerned with the task of removing knowledge learned from particular data points from a trained model. In the context of large language models (LLMs), unlearning has recently received increased attention, particularly for removing knowledge about named entities from models for privacy purposes. While various approaches have been proposed to address the unlearning problem, most existing approaches treat all data points to be unlearned equally, i.e., unlearning that Montreal is a city in Canada is treated exactly the same as unlearning the phone number of the first author of this paper. In this work, we show that this "all data is equal" assumption does not hold for LLM unlearning. We study how the success of unlearning depends on the frequency of the knowledge we want to unlearn in the pre-training data of a model and find that frequency strongly affects unlearning, i.e., more frequent knowledge is harder to unlearn. Additionally, we uncover a misalignment between probability-based and generation-based evaluations of unlearning and show that this problem worsens as models become larger. Overall, our experiments highlight the need for better evaluation practices and novel methods for LLM unlearning that take the training data of models into account.

large language model, machine learning, natural language, (20 more...)

2504.05058

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment (1.00)
Media (0.67)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)