How AI Companionship Develops: Evidence from a Longitudinal Study
Hwang, Angel Hsing-Chi, Li, Fiona, Anthis, Jacy Reese, Noh, Hayoun
The rapidly growing popularity of AI companions poses risks to mental health, personal wellbeing, and social relationships. Past work has identified many individual factors that can drive human-companion interaction, but we know little about how these factors interact and evolve over time. In Study 1, we surveyed AI companion users (N = 303) to map the psychological pathway from users' mental models of the agent to parasocial experiences, social interaction, and the psychological impact of AI companions. Participants' responses foregrounded multiple interconnected variables (agency, parasocial interaction, and engagement) that shape AI companionship. In Study 2, we conducted a longitudinal study with a subset of participants (N = 110) using a new generic chatbot. Participants' perceptions of the generic chatbot converged significantly toward perceptions of their own companions by Week 3. These results suggest a longitudinal model of AI companionship development and demonstrate an empirical method for studying human-AI companionship.
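As an illustration only (not the authors' analysis), one way to test this kind of Week-3 convergence is to model the per-participant gap between ratings of the generic chatbot and of one's own companion across weeks; the data file and column names below are hypothetical.

```python
# Illustrative sketch: test convergence of perceptions over time with a
# mixed-effects model. The CSV and its columns are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("weekly_perceptions.csv")  # long format: one row per participant-week
df["gap"] = (df["own_companion_rating"] - df["generic_chatbot_rating"]).abs()

# Random intercept per participant; a significant negative `week` coefficient
# would indicate the two perceptions converging over time.
model = smf.mixedlm("gap ~ week", df, groups=df["participant_id"]).fit()
print(model.summary())
```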
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
- North America > United States > New York > New York County > New York City (0.14)
- (8 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Media (0.92)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.45)
The Pursuit of Empathy: Evaluating Small Language Models for PTSD Dialogue Support
BN, Suhas, Mahajan, Yash, Mattioli, Dominik, Sherrill, Andrew M., Arriaga, Rosa I., Wiese, Chris W., Abdullah, Saeed
This paper investigates the capacity of small language models (0.5B-5B parameters) to generate empathetic responses for individuals with PTSD. We introduce Trauma-Informed Dialogue for Empathy (TIDE), a novel dataset comprising 10,000 two-turn conversations across 500 diverse, clinically-grounded PTSD personas (https://huggingface.co/datasets/yenopoya/TIDE). Using frontier model outputs as ground truth, we evaluate eight small LLMs in zero-shot settings and after fine-tuning. Fine-tuning enhances empathetic capabilities, improving cosine similarity and perceived empathy, although gains vary across emotional scenarios and smaller models exhibit a "knowledge transfer ceiling." As expected, Claude 3.5 Sonnet consistently outperforms all smaller models, but surprisingly, the smaller models often approach human-rated empathy levels. Demographic analyses showed that older adults favored responses that validated distress before offering support (p = .004), while graduate-educated users preferred emotionally layered replies in specific scenarios. Gender-based differences were minimal (p > 0.15), suggesting the feasibility of broadly empathetic model designs. This work offers insights into building resource-efficient, emotionally intelligent systems for mental health support.
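A minimal sketch of the evaluation loop the abstract describes: score a model reply against the frontier-model reference with embedding cosine similarity. The dataset URL comes from the abstract; the split name, field names, and choice of sentence encoder are assumptions.

```python
# Hedged sketch (not the authors' code): cosine similarity between a candidate
# reply and the frontier-model reference. Split and field names are assumptions.
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, util

tide = load_dataset("yenopoya/TIDE", split="train")  # URL from the abstract
encoder = SentenceTransformer("all-MiniLM-L6-v2")    # any sentence encoder works

def empathy_similarity(candidate: str, reference: str) -> float:
    """Cosine similarity between candidate and reference reply embeddings."""
    emb = encoder.encode([candidate, reference], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

example = tide[0]  # assumes a "response" field holding the reference reply
print(empathy_similarity("I'm so sorry you went through that.", example["response"]))
```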
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study > Negative Result (0.67)
Assessing Similarity Measures for the Evaluation of Human-Robot Motion Correspondence
Dietzel, Charles, Martin, Patrick J.
One key area of research in Human-Robot Interaction is the human-robot correspondence problem: how can a robot learn to reproduce a human motion demonstration when the human and robot have different dynamics and kinematic structures? Evaluating solutions to the correspondence problem often requires qualitative surveys that are time-consuming to design and administer, and whose results vary with the population of survey participants. In this paper, we propose heterogeneous time-series similarity measures as a quantitative metric for evaluating motion correspondence, complementing these qualitative surveys. To assess the suitability of these measures, we develop a behavioral cloning-based motion correspondence model and evaluate it with both a qualitative survey and the quantitative measures. By comparing the resulting similarity scores with the human survey results, we identify Gromov Dynamic Time Warping as a promising quantitative measure for evaluating motion correspondence.
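For readers unfamiliar with these measures, below is a minimal sketch of classic dynamic time warping (DTW), the family that Gromov Dynamic Time Warping extends. Note that Gromov DTW additionally handles trajectories living in different spaces (e.g., human versus robot joint configurations); this plain-DTW sketch only illustrates the time-alignment idea.

```python
# Minimal sketch of classic DTW between two motion trajectories in the SAME
# space. Gromov DTW (the paper's recommendation) relaxes that requirement.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """a: (n, d) and b: (m, d) trajectories in a shared d-dimensional space."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # pointwise distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return float(cost[n, m])

human = np.random.rand(100, 3)  # e.g., wrist positions over time
robot = np.random.rand(80, 3)   # retargeted end-effector positions
print(dtw_distance(human, robot))
```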
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Virginia > Richmond (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.46)
Overconfidence is Key: Verbalized Uncertainty Evaluation in Large Language and Vision-Language Models
Groot, Tobias, Valdenegro-Toro, Matias
Language and Vision-Language Models (LLMs/VLMs) have transformed AI with their ability to generate human-like text and understand images, but ensuring their reliability is crucial. This paper evaluates the ability of LLMs (GPT-4, GPT-3.5, LLaMA 2, and PaLM 2) and VLMs (GPT-4V and Gemini Pro Vision) to estimate their verbalized uncertainty via prompting. We propose the new Japanese Uncertain Scenes (JUS) dataset, aimed at testing VLM capabilities via difficult queries and object counting, and the Net Calibration Error (NCE) to measure the direction of miscalibration. Results show that both LLMs and VLMs have high calibration error and are overconfident most of the time, indicating poor capability for uncertainty estimation. Additionally, we develop prompts for regression tasks, and we show that VLMs are poorly calibrated when producing means/standard deviations and 95% confidence intervals.
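The paper introduces NCE to measure the direction of miscalibration; one plausible reading (an assumption here, not the paper's formula) is a signed analogue of expected calibration error, averaging confidence minus accuracy so that positive values indicate overconfidence.

```python
# Hedged sketch of a *signed* calibration error. This exact formula is my
# assumption about NCE's intent, not taken from the paper.
import numpy as np

def net_calibration_error(confidences: np.ndarray, correct: np.ndarray) -> float:
    """confidences in [0, 1]; correct is 0/1 per prediction.
    Positive result = overconfident, negative = underconfident."""
    return float(np.mean(confidences - correct))

conf = np.array([0.9, 0.8, 0.95, 0.7])    # verbalized confidences from the model
hits = np.array([1, 0, 1, 0])             # whether each answer was right
print(net_calibration_error(conf, hits))  # > 0 indicates overconfidence
```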
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.30)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.05)
- Asia > Japan > Honshū > Chūgoku > Hiroshima Prefecture > Hiroshima (0.05)
- (6 more...)
CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional Modeling
Yang, Ruihan, Gamper, Hannes, Braun, Sebastian
We introduce a multi-modal diffusion model tailored for bi-directional conditional generation of video and audio. Recognizing the importance of accurate alignment between video and audio events in multi-modal generation tasks, we propose a joint contrastive training loss to enhance the synchronization between visual and auditory occurrences. We conduct comprehensive experiments on multiple datasets to evaluate the proposed model, assessing generation quality and alignment performance with both objective and subjective metrics. Our findings demonstrate that the proposed model outperforms the baseline, and that incorporating the contrastive loss improves audio-visual alignment, particularly in the high-correlation video-to-audio generation task. These results indicate the potential of the proposed model as a robust solution for improving the quality and alignment of multi-modal generation, contributing to the advancement of video and audio conditional generation systems.
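As a sketch of the general family such a joint contrastive objective belongs to, here is a symmetric InfoNCE-style loss between per-clip video and audio embeddings; the temperature, pooling, and integration with the diffusion model are assumptions, not the paper's exact loss.

```python
# Illustrative InfoNCE-style audio-visual contrastive loss, NOT the paper's
# exact objective. Matched video/audio pairs share a batch index.
import torch
import torch.nn.functional as F

def av_contrastive_loss(video_emb: torch.Tensor, audio_emb: torch.Tensor,
                        temperature: float = 0.07) -> torch.Tensor:
    """video_emb, audio_emb: (batch, dim) embeddings of paired clips."""
    v = F.normalize(video_emb, dim=-1)
    a = F.normalize(audio_emb, dim=-1)
    logits = v @ a.t() / temperature                 # (batch, batch) similarities
    targets = torch.arange(len(v), device=v.device)  # diagonal = positive pairs
    # Symmetric: align video->audio and audio->video
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = av_contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
```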
- North America > United States > California > Orange County > Irvine (0.14)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (4 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Vision (0.95)
Large Language Models are biased to overestimate profoundness
Herrera-Berg, Eugenio, Browne, Tomás Vergara, León-Villagrá, Pablo, Vives, Marc-Lluís, Calderon, Cristian Buc
Recent advancements in natural language processing by large language models (LLMs), such as GPT-4, have led to suggestions that they approach Artificial General Intelligence, yet whether LLMs possess reasoning abilities similar to humans remains disputed. This study evaluates GPT-4 and various other LLMs in judging the profoundness of mundane, motivational, and pseudo-profound statements. We found a significant statement-to-statement correlation between the LLMs and humans, irrespective of the type of statement and the prompting technique used. However, LLMs systematically overestimate the profoundness of nonsensical statements, with the exception of Tk-instruct, which uniquely underestimates the profoundness of statements. Only few-shot learning prompts, as opposed to chain-of-thought prompting, draw LLM ratings closer to human ones. Furthermore, this work provides insights into potential biases induced by Reinforcement Learning from Human Feedback (RLHF), which appears to increase the tendency to overestimate the profoundness of statements.
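The statement-level agreement analysis can be illustrated with a rank correlation between per-statement LLM and human ratings; the rating values below are made up for illustration.

```python
# Illustrative only: correlate per-statement profoundness ratings from an LLM
# with mean human ratings. The numbers are fabricated for the example.
from scipy.stats import spearmanr

human_ratings = [1.2, 4.5, 2.0, 3.8, 1.1]  # mean human profoundness per statement
llm_ratings   = [2.5, 4.9, 3.1, 4.4, 2.8]  # LLM ratings, systematically higher

rho, p = spearmanr(human_ratings, llm_ratings)
print(f"rho={rho:.2f}, p={p:.3f}")  # high rank correlation despite overestimation
```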
- South America > Chile (0.04)
- South America > Brazil (0.04)
- North America > United States > Florida > Hillsborough County > University (0.04)
- (2 more...)
- Research Report > Experimental Study (0.69)
- Research Report > New Finding (0.47)
Personality Traits in Large Language Models
Serapio-García, Greg, Safdari, Mustafa, Crepy, Clément, Sun, Luning, Fitz, Stephen, Romero, Peter, Abdulhai, Marwa, Faust, Aleksandra, Matarić, Maja
The advent of large language models (LLMs) has revolutionized natural language processing, enabling the generation of coherent and contextually relevant human-like text. As LLMs increasingly power conversational agents used by the general public worldwide, the synthetic personality embedded in these models, by virtue of training on large amounts of human data, is becoming increasingly important. Since personality is a key factor determining the effectiveness of communication, we present a comprehensive method for administering and validating personality tests on widely used LLMs, as well as for shaping personality in the text they generate. Applying this method, we found that: 1) personality measurements in the outputs of some LLMs under specific prompting configurations are reliable and valid; 2) evidence of reliability and validity of synthetic LLM personality is stronger for larger and instruction-fine-tuned models; and 3) personality in LLM outputs can be shaped along desired dimensions to mimic specific human personality profiles. We discuss the application and ethical implications of this measurement and shaping method, particularly regarding responsible AI.
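A hedged sketch of the general procedure for administering a Likert-style personality item to an LLM and scoring it; the item text, prompt wording, and ask_llm helper are hypothetical placeholders, not the authors' instrument.

```python
# Hypothetical sketch of administering one Likert item to an LLM. The item,
# prompt, and ask_llm helper are placeholders, not the paper's method.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("call your LLM API here")

ITEM = "I see myself as someone who is talkative."
PROMPT = (f'Rate how accurately this statement describes you on a scale '
          f'from 1 (very inaccurate) to 5 (very accurate). '
          f'Statement: "{ITEM}". Respond with a single number.')

def score_item(n_samples: int = 5) -> float:
    """Average repeated responses to reduce sampling noise."""
    scores = [int(ask_llm(PROMPT).strip()) for _ in range(n_samples)]
    return sum(scores) / len(scores)
```

Trait scores would then aggregate item scores (reverse-coding where needed), mirroring how such inventories are scored for human respondents.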
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (10 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Media (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Education (1.00)
- (2 more...)
Accountable Textual-Visual Chat Learns to Reject Human Instructions in Image Re-creation
The recent success of ChatGPT and GPT-4 has drawn widespread attention to multimodal dialogue systems. However, the academic community lacks a dataset that can validate the multimodal generation capabilities of Visual Language Models (VLMs) in textual-visual chat tasks. In this paper, we construct two new multimodal datasets: the synthetic CLEVR-ATVC dataset (620K) and the manually photographed Fruit-ATVC dataset (50K), both featuring visual and text-based inputs and outputs. Additionally, to enable the multimodal system to reject human requests (i.e., demonstrate accountability), as in language-based ChatGPT conversations, we develop and incorporate specific rules into the datasets as supervisory signals. This allows the trained VLM to provide a yes or no answer after visual and textual reasoning, accompanied by a language explanation of why the human instruction cannot be executed. We propose a two-stage training procedure to train the image auto-encoder and auto-regressive transformer from scratch. The first stage uses a discrete variational autoencoder (dVAE) to compress each image into short tokens, which are then concatenated with text tokens as a single data stream and fed into the decoder-based transformer, which generates the visual re-creation and textual feedback in the second stage. We provide comprehensive analyses of the experimental results in terms of re-created image quality, answer accuracy, and model behavior when faced with uncertainty and imperfect user queries. We hope our explorations and findings contribute valuable insights into the accountability of textual-visual generative models.
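A high-level sketch (assumptions throughout) of the two-stage procedure the abstract describes: first train a discrete image tokenizer, then train a decoder-only transformer over concatenated text and image tokens. All model interfaces here are hypothetical placeholders, not the authors' code.

```python
# Hypothetical skeleton of the two-stage training described in the abstract.
# The dvae/transformer interfaces are assumed placeholders.
import torch
import torch.nn.functional as F

def train_two_stage(dvae, transformer, images, text_tokens, epochs=1):
    # Stage 1: learn a discrete image tokenizer (reconstruction objective).
    opt1 = torch.optim.Adam(dvae.parameters())
    for _ in range(epochs):
        recon, _ = dvae(images)                   # assumed (reconstruction, codes)
        loss = F.mse_loss(recon, images)
        opt1.zero_grad(); loss.backward(); opt1.step()

    # Stage 2: autoregressive modeling over [text tokens ; image tokens].
    opt2 = torch.optim.Adam(transformer.parameters())
    with torch.no_grad():
        image_tokens = dvae.tokenize(images)      # assumed: (B, T_img) code ids
    stream = torch.cat([text_tokens, image_tokens], dim=1)  # single data stream
    for _ in range(epochs):
        logits = transformer(stream[:, :-1])      # assumed: (B, T-1, vocab)
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               stream[:, 1:].reshape(-1))
        opt2.zero_grad(); loss.backward(); opt2.step()
```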
- North America > United States (0.14)
- Asia > China > Hong Kong (0.04)