Spiral of Silence in Large Language Model Agents

Zhong, Mingze, Fang, Meng, Shi, Zijing, Huang, Yuxuan, Zheng, Shunfeng, Du, Yali, Chen, Ling, Wang, Jun

arXiv.org Artificial Intelligence

The Spiral of Silence (SoS) theory holds that individuals with minority views often refrain from speaking out for fear of social isolation, enabling majority positions to dominate public discourse. When the 'agents' are large language models (LLMs), however, the classical psychological explanation is not directly applicable, since SoS was developed for human societies. This raises a central question: can SoS-like dynamics nevertheless emerge from purely statistical language generation in LLM collectives? We propose an evaluation framework for examining SoS in LLM agents. Specifically, we consider four controlled conditions that systematically vary the availability of 'History' and 'Persona' signals. Opinion dynamics are assessed using trend tests such as Mann-Kendall and Spearman's rank, along with concentration measures including kurtosis and interquartile range. Experiments across open-source and closed-source models show that history and persona together produce strong majority dominance and replicate SoS patterns; history signals alone induce strong anchoring; and persona signals alone foster diverse but uncorrelated opinions, indicating that without historical anchoring, SoS dynamics cannot emerge. The work bridges computational sociology and responsible AI design, highlighting the need to monitor and mitigate emergent conformity in LLM-agent systems.
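The trend and concentration measures named above can be sketched as follows on a synthetic opinion series; the 1-to-5 opinion scale and the toy data are assumptions for illustration, not the paper's setup. The Mann-Kendall S statistic is implemented directly from its pairwise-sign definition.

```python
# Sketch of the trend and concentration measures named in the abstract,
# applied to a synthetic opinion series (the 1-5 scale is an assumption).
import numpy as np
from scipy import stats

def mann_kendall_s(x):
    """Mann-Kendall S statistic: sum of signs of all pairwise differences.
    S >> 0 suggests an increasing trend, S << 0 a decreasing one."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return sum(np.sign(x[j] - x[i]) for i in range(n) for j in range(i + 1, n))

def opinion_stats(series):
    t = np.arange(len(series))
    rho, p = stats.spearmanr(t, series)       # monotonic trend over time
    return {
        "mann_kendall_S": mann_kendall_s(series),
        "spearman_rho": rho,
        "spearman_p": p,
        "kurtosis": stats.kurtosis(series),   # excess kurtosis (concentration)
        "iqr": stats.iqr(series),             # interquartile range (spread)
    }

# A series drifting toward a majority position shows a positive trend:
drifting = [2, 2, 3, 3, 3, 4, 4, 4, 4, 4]
print(opinion_stats(drifting))
```

Under this kind of analysis, majority dominance would show up as a significant monotonic trend combined with shrinking dispersion of expressed opinions.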



Datasets for Navigating Sensitive Topics in Recommendation Systems

Kovacs, Amelia, Chee, Jerry, Kazemian, Kimia, Dean, Sarah

arXiv.org Artificial Intelligence

Personalized AI systems, from recommendation systems to chatbots, are a prevalent method for distributing content to users based on their learned preferences. However, there is growing concern about the adverse effects of these systems, including their potential tendency to expose users to sensitive or harmful material, negatively impacting overall well-being. To address this concern quantitatively, it is necessary to create datasets with relevant sensitivity labels for content, enabling researchers to evaluate personalized systems beyond mere engagement metrics. To this end, we introduce two novel datasets that include a taxonomy of sensitivity labels alongside user-content ratings: one that integrates MovieLens rating data with content warnings from the Does the Dog Die? community ratings website, and another that combines fan-fiction interaction data and user-generated warnings from Archive of Our Own.
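A join of this kind, attaching sensitivity labels to rating events, could look like the following sketch; the column names (`user_id`, `item_id`, `rating`, `warning`) and values are illustrative assumptions, not the actual schema of either dataset.

```python
# Hypothetical sketch of joining user-content ratings with sensitivity
# labels; the schema below is illustrative, not the datasets' actual one.
import pandas as pd

ratings = pd.DataFrame({
    "user_id": [1, 1, 2],
    "item_id": [10, 11, 10],
    "rating":  [4.0, 3.5, 5.0],
})
warnings = pd.DataFrame({
    "item_id": [10, 11],
    "warning": [["violence"], []],   # list of content warnings per item
})

# Attach sensitivity labels to each rating event, so evaluation can go
# beyond engagement (e.g., share of interactions with flagged content).
labeled = ratings.merge(warnings, on="item_id", how="left")
exposure_rate = labeled["warning"].map(bool).mean()
print(labeled)
print(f"share of interactions with flagged content: {exposure_rate:.2f}")
```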


Social Influence Distorts Ratings in Online Interfaces

Kontalexi, Marina, Gelastopoulos, Alexandros, Analytis, Pantelis P.

arXiv.org Artificial Intelligence

Theoretical work on sequential choice and large-scale experiments in online ranking and voting systems have demonstrated that social influence can have a drastic impact on social and technological systems. Yet, the effect of social influence on online rating systems remains understudied, and the few existing contributions suggest that online ratings would self-correct given enough users. Here, we propose a new framework for studying the effect of social influence on online ratings. We start from the assumption that people are influenced linearly by the observed average rating, but postulate that their propensity to be influenced varies. When the weight people assign to the observed average depends only on their own latent rating, the resulting system is linear, but the long-term rating may substantially deviate from the true mean rating. When the weight people put on the observed average depends on both their own latent rating and the observed average rating, the resulting system is non-linear and may support multiple equilibria, suggesting that ratings might be path-dependent and deviations dramatic. Our results highlight potential limitations in crowdsourced information aggregation and can inform the design of more robust online rating systems.
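The linear case described above admits a minimal simulation: each arriving rater reports a blend of their latent rating and the running average they observe, with a conformity weight that depends on their own latent rating. The specific weight function and normal latent distribution below are illustrative assumptions, not the paper's model.

```python
# Minimal simulation of the linear influence case: the conformity weight
# depends only on the rater's own latent rating. Weight function and
# latent distribution are illustrative assumptions.
import random

def influence_weight(latent):
    # Assumed: raters with below-average latent opinions conform more.
    return 0.7 if latent < 3.0 else 0.1

def simulate(n=50000, true_mean=3.0, sd=1.0, seed=1):
    rng = random.Random(seed)
    total, count, avg = 0.0, 0, true_mean
    for _ in range(n):
        latent = rng.gauss(true_mean, sd)
        w = influence_weight(latent)
        total += (1 - w) * latent + w * avg   # reported = blend of the two
        count += 1
        avg = total / count                   # average shown to the next rater
    return avg

# The long-run average drifts above the true mean of 3.0, because
# low-rating users partially suppress their own opinion:
print(simulate())
```

With this asymmetric weighting, the fixed point of the average lies near 3.4 rather than the true mean 3.0, illustrating how a purely linear system can still produce a persistent distortion.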


A Measure of the System Dependence of Automated Metrics

von Däniken, Pius, Deriu, Jan, Cieliebak, Mark

arXiv.org Artificial Intelligence

Automated metrics for Machine Translation have made significant progress, with the goal of replacing expensive and time-consuming human evaluations. These metrics are typically assessed by their correlation with human judgments, which captures the monotonic relationship between human and metric scores. However, we argue that it is equally important to ensure that metrics treat all systems fairly and consistently. In this paper, we introduce a method to evaluate this aspect.
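The gap the abstract points at can be seen in a toy example: a metric can agree perfectly with human judgments in rank order while systematically under-scoring one system. The numbers below are fabricated purely for illustration.

```python
# Toy illustration: perfect rank correlation with human judgments can
# coexist with a large per-system bias. All numbers are fabricated.
from scipy.stats import spearmanr

human  = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
metric = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]  # last three: outputs of "system B"
systems = ["A", "A", "A", "B", "B", "B"]

rho, _ = spearmanr(human, metric)
print(f"overall Spearman rho = {rho:.2f}")   # perfect monotonic agreement

# Per-system bias (metric minus human), invisible to the correlation:
for s in ("A", "B"):
    diffs = [m - h for m, h, sy in zip(metric, human, systems) if sy == s]
    print(f"system {s}: mean bias = {sum(diffs) / len(diffs):+.2f}")
```

Here the overall rank correlation is 1.0, yet system B is under-scored by 0.3 on average, which is exactly the kind of unequal treatment a correlation-only assessment cannot detect.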


End-to-end Training for Recommendation with Language-based User Profiles

Gao, Zhaolin, Zhou, Joyce, Dai, Yijia, Joachims, Thorsten

arXiv.org Artificial Intelligence

Many online platforms maintain user profiles for personalization. Unfortunately, these profiles are typically not interpretable or easily modifiable by the user. To remedy this shortcoming, we explore natural language-based user profiles, as they promise enhanced transparency and scrutability of recommender systems. While existing work has shown that language-based profiles from standard LLMs can be effective, such generalist LLMs are unlikely to be optimal for this task. In this paper, we introduce LangPTune, the first end-to-end learning method for training LLMs to produce language-based user profiles that optimize recommendation effectiveness. Through comprehensive evaluations of LangPTune across various training configurations and benchmarks, we demonstrate that our approach significantly outperforms existing profile-based methods. In addition, it approaches performance levels comparable to state-of-the-art, less transparent recommender systems, providing a robust and interpretable alternative to conventional systems. Finally, we validate the relative interpretability of these language-based user profiles through user studies involving crowdworkers and GPT-4-based evaluations. Implementation of LangPTune can be found at https://github.com/ZhaolinGao/LangPTune.


Performance of Recent Large Language Models for a Low-Resourced Language

Jayakody, Ravindu, Dias, Gihan

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have shown significant advances in the past year. In addition to new versions of GPT and Llama, several other LLMs have been introduced recently. Some of these are open models available for download and modification. Although multilingual large language models have been available for some time, their performance on low-resourced languages such as Sinhala has been poor. We evaluated four recent LLMs on their performance directly in the Sinhala language, and by translation to and from English. We also evaluated their fine-tunability with a small amount of fine-tuning data. Claude and GPT-4o perform well out-of-the-box and do significantly better than previous versions. Llama and Mistral perform poorly but show some promise of improvement with fine-tuning.


Understanding Subjectivity through the Lens of Motivational Context in Model-Generated Image Satisfaction

Dutta, Senjuti, Chen, Sherol, Mak, Sunny, Ahmad, Amnah, Collins, Katherine, Butryna, Alena, Ramachandran, Deepak, Dvijotham, Krishnamurthy, Pavlick, Ellie, Rajakumar, Ravi

arXiv.org Artificial Intelligence

Image generation models are poised to become ubiquitous in a range of applications. These models are often fine-tuned and evaluated using human quality judgments that assume a universal standard, failing to consider the subjectivity of such tasks. To investigate how to quantify subjectivity, and the scale of its impact, we measure how assessments differ among human annotators across different use cases. Simulating the effects of ordinarily latent elements of annotators' subjectivity, we contrive a set of motivations (t-shirt graphics, presentation visuals, and phone background images) to contextualize a set of crowdsourcing tasks. Our results show that human evaluations of images vary within individual contexts and across combinations of contexts. Three key factors affecting this subjectivity are image appearance, image alignment with text, and representation of objects mentioned in the text. Our study highlights the importance of taking individual users and contexts into account, both when building and evaluating generative models.


Performance rating in chess, tennis, and other contexts

Ismail, Mehmet S.

arXiv.org Artificial Intelligence

In this note, I introduce Estimated Performance Rating (PR$^e$), a novel system for evaluating player performance in sports and games. PR$^e$ addresses a key limitation of the Tournament Performance Rating (TPR) system, which is undefined for zero or perfect scores in a series of games. PR$^e$ is defined as the rating that solves an optimization problem related to scoring probability, making it applicable for any performance level. The main theorem establishes that the PR$^e$ of a player is equivalent to the TPR whenever the latter is defined. I then apply this system to historically significant win-streaks in association football, tennis, and chess. Beyond sports, PR$^e$ has broad applicability in domains where Elo ratings are used, from college rankings to the evaluation of large language models.
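The limitation PR$^e$ addresses can be made concrete with a sketch of the classical TPR: the rating at which the Elo-expected total score against the given opponents equals the achieved score, found here by bisection. The standard Elo expected-score formula is used; the PR$^e$ optimization itself is not reproduced here, since the abstract does not specify it.

```python
# Sketch of classical Tournament Performance Rating (TPR) via bisection.
# For a zero or perfect score no finite rating solves the equation,
# which is the gap PR^e is designed to fill.

def elo_expected(r, opp):
    """Standard Elo expected score of a player rated r against opp."""
    return 1.0 / (1.0 + 10 ** ((opp - r) / 400))

def tpr(opponents, score, lo=-4000.0, hi=8000.0, iters=100):
    """Rating at which expected total score equals the achieved score;
    returns None for zero or perfect scores (no finite solution)."""
    if score <= 0 or score >= len(opponents):
        return None
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(elo_expected(mid, o) for o in opponents) < score:
            lo = mid   # expected score too low: rating must be higher
        else:
            hi = mid
    return (lo + hi) / 2

opps = [2700, 2650, 2600]
print(round(tpr(opps, 1.5)))   # 2650: a 50% score against a 2650-average field
print(tpr(opps, 3.0))          # None: perfect score, TPR undefined
```

The `None` branch is exactly the undefined case the abstract describes: as the score approaches zero or perfect, the solving rating diverges to minus or plus infinity.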