AITopics | Personal

Personal

Probing the Gaps in ChatGPT Live Video Chat for Real-World Assistance for People who are Blind or Visually Impaired

Chang, Ruei-Che, Natalie, Rosiana, Xu, Wenqian, Yap, Jovan Zheng Feng, Guo, Anhong

arXiv.org Artificial Intelligence | Aug-6-2025

Recent advancements in large multimodal models have provided blind or visually impaired (BVI) individuals with new capabilities to interpret and engage with the real world through interactive systems that utilize live video feeds. However, the potential benefits and challenges of such capabilities to support diverse real-world assistive tasks remain unclear. In this paper, we present findings from an exploratory study with eight BVI participants. Participants used ChatGPT's Advanced Voice with Video, a state-of-the-art live video AI released in late 2024, in various real-world scenarios, from locating objects to recognizing visual landmarks, across unfamiliar indoor and outdoor environments. Our findings indicate that current live video AI effectively provides guidance and answers for static visual scenes but falls short in delivering essential live descriptions required in dynamic situations. Despite inaccuracies in spatial and distance information, participants leveraged the provided visual information to supplement their mobility strategies. Although the system was perceived as human-like due to high-quality voice interactions, assumptions about users' visual abilities, hallucinations, generic responses, and a tendency towards sycophancy led to confusion, distrust, and potential risks for BVI users. Based on the results, we discuss implications for assistive video AI agents, including incorporating additional sensing capabilities for real-world use, determining appropriate intervention timing beyond turn-taking interactions, and addressing ecological and safety concerns.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2508.03651

Country:

Europe (1.00)
North America > United States > California (0.46)
North America > United States > New York > New York County > New York City (0.16)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Personal > Interview (0.93)

Industry:

Health & Medicine > Therapeutic Area (0.94)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AnnoSense: A Framework for Physiological Emotion Data Collection in Everyday Settings for AI

Singh, Pragya, Gupta, Ankush, Kumar, Mohan, Singh, Pushpendra

arXiv.org Artificial Intelligence | Aug-6-2025

Emotional and mental well-being are vital components of quality of life, and with the rise of artificial intelligence (AI) and smart devices such as smartphones and wearables, new opportunities for monitoring emotions in everyday settings have emerged. However, for AI algorithms to be effective, they require high-quality data and accurate annotations. As the focus shifts towards collecting emotion data in real-world environments to capture more authentic emotional experiences, the process of gathering emotion annotations has become increasingly complex. This work explores the challenges of everyday emotion data collection from the perspectives of key stakeholders. We collected 75 survey responses, conducted 32 interviews with the public, and held 3 focus group discussions (FGDs) with 12 mental health professionals. The insights gained from a total of 119 stakeholders informed the development of our framework, AnnoSense, designed to support everyday emotion data collection for AI. This framework was then evaluated by 25 emotion AI experts for its clarity, usefulness, and adaptability. Lastly, we discuss the potential next steps and implications of AnnoSense for future research in emotion AI, highlighting its potential to enhance the collection and analysis of emotion data in real-world contexts.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3749519

2508.0268

Country:

Europe (1.00)
Asia (1.00)
North America > United States > New York (0.29)
North America > United States > Colorado (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
(2 more...)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Mobile (1.00)
(5 more...)

A glimpse into OpenAI's largest ambitions

MIT Technology Review | Aug-5-2025, 09:00:00 GMT

As Will points out, there were two recent wins for OpenAI in its efforts to build AI that outcompetes humans. Its models took second place at a top-level coding competition and--alongside those from Google DeepMind--achieved gold-medal-level results in the 2025 International Math Olympiad. People who believe that AI doesn't pose genuine competition to human-level intelligence might actually take some comfort in that. AI is good at the mathematical and analytical, which are on full display in olympiads and coding competitions. That doesn't mean it's any good at grappling with the messiness of human emotions, making hard decisions, or creating art that resonates with anyone. But that distinction--between machine-like reasoning and the ability to think creatively--is not one OpenAI's heads of research are inclined to make.

intelligence, largest ambition, openai, (1 more...)

MIT Technology Review

Genre: Personal (0.41)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.95)

How Supercomputing Will Evolve, According to Jack Dongarra

WIRED | Aug-5-2025, 09:00:00 GMT

High-performance supercomputing--once the exclusive domain of scientific research--is now a strategic resource for training increasingly complex artificial intelligence models. This convergence of AI and HPC is redefining not only these technologies but also the ways in which knowledge is produced, and it is taking on a strategic position in the global landscape. To discuss how HPC is evolving, in July WIRED caught up with Jack Dongarra, a US computer scientist who has been a key contributor to the development of HPC software over the past four decades--so much so that in 2021 he earned the prestigious Turing Award. The meeting took place at the 74th Nobel Laureate Meeting in Lindau, Germany, which brought together dozens of Nobel laureates as well as more than 600 emerging scientists from around the world. This interview has been edited for length and clarity.

approximation, jack dongarra, traditional technique, (3 more...)

WIRED

Country: Europe > Germany (0.27)

Genre: Personal > Interview (0.96)

Technology: Information Technology > Artificial Intelligence (1.00)

It's High Time: A Survey of Temporal Question Answering

Piryani, Bhawna, Abdallah, Abdelrahman, Mozafari, Jamshid, Anand, Avishek, Jatowt, Adam

arXiv.org Artificial Intelligence | Aug-5-2025

Time plays a critical role in how information is generated, retrieved, and interpreted. In this survey, we provide a comprehensive overview of Temporal Question Answering (TQA), a research area that focuses on answering questions involving temporal constraints or context. As the amount of time-stamped content from sources like news articles, web archives, and knowledge bases increases, systems must address challenges such as detecting temporal intent, normalizing time expressions, ordering events, and reasoning over evolving or ambiguous facts. We focus on recent advances in TQA enabled by neural architectures, especially transformer-based models and Large Language Models (LLMs), highlighting progress in temporal language modeling, retrieval-augmented generation (RAG), and temporal reasoning. We also discuss benchmark datasets and evaluation strategies designed to test temporal robustness, recency awareness, and generalization.

large language model, machine learning, question answering, (17 more...)

arXiv.org Artificial Intelligence

2505.20243

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre:

Research Report (1.00)
Overview (1.00)
Personal > Honors (0.93)

Industry:

Law (1.00)
Government (1.00)
Health & Medicine (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Decomposing Representation Space into Interpretable Subspaces with Unsupervised Learning

Huang, Xinting, Hahn, Michael

arXiv.org Artificial Intelligence | Aug-5-2025

Understanding internal representations of neural models is a core interest of mechanistic interpretability. Due to its large dimensionality, the representation space can encode various aspects about inputs. To what extent are different aspects organized and encoded in separate subspaces? Is it possible to find these "natural" subspaces in a purely unsupervised way? Somewhat surprisingly, we can indeed achieve this and find interpretable subspaces by a seemingly unrelated training objective. Our method, neighbor distance minimization (NDM), learns non-basis-aligned subspaces in an unsupervised manner. Qualitative analysis shows subspaces are interpretable in many cases, and encoded information in obtained subspaces tends to share the same abstract concept across different inputs, making such subspaces similar to "variables" used by the model. We also conduct quantitative experiments using known circuits in GPT-2; results show a strong connection between subspaces and circuit variables. We also provide evidence showing scalability to 2B models by finding separate subspaces mediating context and parametric knowledge routing. Viewed more broadly, our findings offer a new perspective on understanding model internals and building circuits.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.01916

Genre:

Personal > Honors (0.93)
Research Report > New Finding (0.86)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

A comprehensive taxonomy of hallucinations in Large Language Models

Cossio, Manuel

arXiv.org Artificial Intelligence | Aug-5-2025

Large language models (LLMs) have revolutionized natural language processing, yet their propensity for hallucination, generating plausible but factually incorrect or fabricated content, remains a critical challenge. This report provides a comprehensive taxonomy of LLM hallucinations, beginning with a formal definition and a theoretical framework that posits its inherent inevitability in computable LLMs, irrespective of architecture or training. It explores core distinctions, differentiating between intrinsic (contradicting input context) and extrinsic (inconsistent with training data or reality) hallucinations, as well as between factuality (absolute correctness) and faithfulness (adherence to input). The report then details specific manifestations, including factual errors, contextual and logical inconsistencies, temporal disorientation, ethical violations, and task-specific hallucinations across domains like code generation and multimodal applications. It analyzes the underlying causes, categorizing them into data-related issues, model-related factors, and prompt-related influences. Furthermore, the report examines cognitive and human factors influencing hallucination perception, surveys evaluation benchmarks and metrics for detection, and outlines architectural and systemic mitigation strategies. Finally, it introduces web-based resources for monitoring LLM releases and performance. This report underscores the complex, multifaceted nature of LLM hallucinations and emphasizes that, given their theoretical inevitability, future efforts must focus on robust detection, mitigation, and continuous human oversight for responsible and reliable deployment in critical applications.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.01781

Country: North America > United States (0.46)

Genre:

Research Report (1.00)
Overview (1.00)
Personal > Honors (0.67)

Industry:

Law (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.93)
Health & Medicine > Government Relations & Public Policy (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Jim Acosta 'interviews' AI-generated avatar of deceased teenager promoting gun control message

FOX News | Aug-4-2025, 21:29:09 GMT

Jim Acosta and James Carville speculated whether President Trump will try to rig the 2026 midterms in his favor on "The Jim Acosta Show." Liberal journalist Jim Acosta "interviewed" the artificially animated avatar of deceased teenager Joaquin Oliver to promote a gun control message on Monday. Working with the gun control group Change the Ref, founded by Oliver's parents, Acosta had a conversation on his Substack with an avatar created by Oliver's father. Oliver was killed in the Parkland high school shooting in 2018. He would have turned 25 on Monday. "I would like to know what your solution would be for gun violence," Acosta asked.

acosta, artificial intelligence, avatar, (13 more...)

FOX News

Country: North America > United States > Florida > Broward County > Parkland (0.05)

Genre:

Research Report (0.78)
Personal > Interview (0.32)

Industry:

Government (1.00)
Media (1.00)
Education > Health & Safety > School Safety & Security > School Violence (0.96)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.59)

Technology: Information Technology > Artificial Intelligence (1.00)

North Carolina auditor excited for 'real effect' of state-level DOGE: 'Keeping government accountable'

FOX News | Aug-3-2025, 10:10:08 GMT

EXCLUSIVE: North Carolina's state auditor said he is looking forward to making a positive impact on taxpayers by implementing a state version of the Department of Government Efficiency (DOGE). In an exclusive interview with Fox News Digital, North Carolina state auditor Dave Boliek said his office would look into how the state government can be more efficient and utilize the resources it has in the "best possible way" for taxpayers. He plans on doing that through House Bill 125, a state-level DOGE initiative named after him that recently passed the legislature. "It helps to give our office and the state auditor's office more resources to take a look at efficiencies and ways to really drill down on determining a good return on investment of taxpayer dollars across North Carolina," Boliek said. "I really support the effort," he said, in part.

artificial intelligence, boliek, government accountable, (10 more...)

FOX News

Country:

North America > United States > North Carolina (1.00)
North America > United States > Texas (0.05)

Genre: Personal (0.56)

Industry: Government > Regional Government > North America Government > United States Government (0.71)

Technology: Information Technology > Artificial Intelligence (0.31)

Automated Feedback on Student-Generated UML and ER Diagrams Using Large Language Models

Gürtl, Sebastian, Schimetta, Gloria, Kerschbaumer, David, Liut, Michael, Steinmaurer, Alexander

arXiv.org Artificial Intelligence | Aug-1-2025

UML and ER diagrams are foundational in computer science education but come with challenges for learners due to the need for abstract thinking, contextual understanding, and mastery of both syntax and semantics. These complexities are difficult to address through traditional teaching methods, which often struggle to provide scalable, personalized feedback, especially in large classes. We introduce DUET (Diagrammatic UML & ER Tutor), a prototype of an LLM-based tool, which converts a reference diagram and a student-submitted diagram into a textual representation and provides structured feedback based on the differences. It uses a multi-stage LLM pipeline to compare diagrams and generate reflective feedback. Furthermore, the tool enables analytical insights for educators, aiming to foster self-directed learning and inform instructional strategies. We evaluated DUET through semi-structured interviews with six participants, including two educators and four teaching assistants. They identified strengths such as accessibility, scalability, and learning support alongside limitations, including reliability and potential misuse. Participants also suggested potential improvements, such as bulk upload functionality and interactive clarification features. DUET presents a promising direction for integrating LLMs into modeling education and offers a foundation for future classroom integration and empirical evaluation.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2507.2347

Country:

Europe > Austria (0.30)
North America > Canada (0.28)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (0.67)
Personal > Interview (0.34)

Industry:

Education > Curriculum > Subject-Specific Education (0.69)
Education > Educational Setting > Higher Education (0.48)
Education > Educational Technology > Educational Software > Computer Based Training (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)
