AITopics

This report provides practical guidance to teams designing or developing AI-enabled systems on how to promote trustworthiness during the data curation phase of development. In this report, we first define data, the data curation phase, and trustworthiness. We then describe a series of steps that the development team, especially data scientists, can take to build a trustworthy AI-enabled system. We enumerate the sequence of core steps and trace parallel paths where alternatives exist. The descriptions of these steps include strengths, weaknesses, preconditions, outcomes, and relevant open-source software tool implementations. In total, this report is a synthesis of data curation tools and approaches from relevant academic literature, and our goal is to equip readers with a diverse yet coherent set of practices for improving AI trustworthiness.

data quality, international joint conference, machine learning

2508.14741

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.27)
North America > United States > California (0.27)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.46)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)
Transportation (0.92)

Technology:

Information Technology > Data Science > Data Quality > Data Cleaning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Ezell, Carson, Roberts-Gaal, Xavier, Chan, Alan

Incident Analysis for AI Agents

As AI agents become more widely deployed, we are likely to see an increasing number of incidents: events involving AI agent use that directly or indirectly cause harm. For example, agents could be prompt-injected to exfiltrate private information or make unauthorized purchases. Structured information about such incidents (e.g., user prompts) can help us understand their causes and prevent future occurrences. However, existing incident reporting processes are not sufficient for understanding agent incidents. In particular, such processes are largely based on publicly available data, which excludes useful, but potentially sensitive, information such as an agent's chain of thought or browser history. To inform the development of new, emerging incident reporting processes, we propose an incident analysis framework for agents. Drawing on systems safety approaches, our framework proposes three types of factors that can cause incidents: system-related (e.g., CBRN training data), contextual (e.g., prompt injections), and cognitive (e.g., misunderstanding a user request). We also identify specific information that could help clarify which factors are relevant to a given incident: activity logs, system documentation and access, and information about the tools an agent uses. We provide recommendations for 1) what information incident reports should include and 2) what information developers and deployers should retain and make available to incident investigators upon request. As we transition to a world with more agents, understanding agent incidents will become increasingly crucial for managing risks.

information, large language model, machine learning

2508.14231

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Amugongo, Lameck Mbangula, Bidwell, Nicola J, Mwatukange, Joseph

Enriching Moral Perspectives on AI: Concepts of Trust amongst Africans

The trustworthiness of AI is considered essential to the adoption and application of AI systems. However, the meaning of trust varies across industry, research and policy spaces. Studies suggest that professionals who develop and use AI regard an AI system as trustworthy based on their personal experiences and social relations at work. Studies have also examined trust in AI and the constructs that aim to operationalise it (e.g., consistency, reliability, explainability and accountability). However, the majority of existing studies about trust in AI are situated in Western, Educated, Industrialised, Rich and Democratic (WEIRD) societies. The few studies about trust and AI in Africa do not include the views of people who develop, study or use AI in their work. In this study, we surveyed 157 people with professional and/or educational interests in AI from 25 African countries, to explore how they conceptualised trust in AI. Most respondents had links with workshops about trust and AI in Africa in Namibia and Ghana. Respondents' educational background, transnational mobility, and country of origin influenced their concerns about AI systems. These factors also affected their levels of distrust in certain AI applications and their emphasis on specific principles designed to foster trust. Respondents often expressed that their values are guided by the communities in which they grew up and emphasised communal relations over individual freedoms. They described trust in many ways, including applying nuances of Afro-relationalism to constructs in international discourse, such as reliability and reliance. Thus, our exploratory study motivates more empirical research about the ways trust is practically enacted and experienced in African social realities of AI design, use and governance.

ai system, artificial intelligence, respondent

2508.14116

Country:

North America > United States (1.00)
Africa (1.00)
Europe > United Kingdom > England (0.46)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.87)
Research Report > New Finding (0.66)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Ding, Yao, Wu, Yuqing, Ding, Ziyang

An automatic patent literature retrieval system based on LLM-RAG

With the acceleration of technological innovation, efficient retrieval and classification of patent literature have become essential for intellectual property management and enterprise R&D. Traditional keyword and rule-based retrieval methods often fail to address complex query intents or capture semantic associations across technical domains, resulting in incomplete and low-relevance results. This study presents an automated patent retrieval framework integrating Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) technology. The system comprises three components: 1) a preprocessing module for patent data standardization, 2) a high-efficiency vector retrieval engine leveraging LLM-generated embeddings, and 3) a RAG-enhanced query module that combines external document retrieval with context-aware response generation. Evaluations were conducted on the Google Patents dataset (2006-2024), containing millions of global patent records with metadata such as filing date, domain, and status. The proposed gpt-3.5-turbo-0125-RAG configuration achieved 80.5% semantic matching accuracy and 92.1% recall, surpassing baseline LLM methods by 28 percentage points. The framework also demonstrated strong generalization in cross-domain classification and semantic clustering tasks. These results validate the effectiveness of LLM-RAG integration for intelligent patent retrieval, providing a foundation for next-generation AI-driven intellectual property analysis platforms.
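
The retrieval and query steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for LLM-generated embeddings, the `PATENTS` records are invented for illustration, and `retrieve` and `build_prompt` are hypothetical helper names. The sketch only shows the general pattern of ranking documents by cosine similarity and then assembling the top hits into a prompt for a generator.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an LLM embedding: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, patents, k=2):
    """Return the k patent records whose abstracts are most similar to the query."""
    q = embed(query)
    ranked = sorted(
        patents, key=lambda p: cosine(q, embed(p["abstract"])), reverse=True
    )
    return ranked[:k]


def build_prompt(query, hits):
    """Assemble retrieved abstracts into a context block for a generator model."""
    context = "\n".join(f"[{p['id']}] {p['abstract']}" for p in hits)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Invented example records (assumption, not from the Google Patents dataset).
PATENTS = [
    {"id": "P1", "abstract": "Lithium battery electrode with a silicon anode coating"},
    {"id": "P2", "abstract": "Method for wireless charging of electric vehicles"},
    {"id": "P3", "abstract": "Neural network accelerator for image classification"},
]

query = "silicon anode for lithium battery"
hits = retrieve(query, PATENTS, k=2)
prompt = build_prompt(query, hits)
```

In a real system the prompt would be passed to a language model and the embeddings would come from a learned model with an approximate nearest-neighbour index; both are left out here to keep the sketch self-contained.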

large language model, machine learning, natural language

2508.14064

Country: North America > United States (0.70)

Genre: Research Report (1.00)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)

Ashkenazi, Shaul, Skantze, Gabriel, Stuart-Smith, Jane, Foster, Mary Ellen

Into the Wild: When Robots Are Not Welcome

Social robots are increasingly being deployed in public spaces, where they face not only technological difficulties and unexpected user utterances, but also objections from stakeholders who may not be comfortable with introducing a robot into those spaces. We describe our difficulties with deploying a social robot in two different public settings: 1) a student services center; 2) a drop-in service for refugees and asylum seekers. Although this is a failure report, in each use case we eventually managed to earn the trust of the staff and form a relationship with them, allowing us to deploy our robot and conduct our studies. We have developed a multilingual robot system (Figure 1), described in [1], for two different use cases: 1) supporting newly arrived international students in a UK university by answering frequently asked questions; 2) supporting refugees and asylum seekers with navigating bureaucratic processes. Like most current public-space robot deployments, our field studies involved adding a robot to an existing workplace, with stakeholders including management, visitors, and front-line workers, who should all be consulted to develop the details of the system to be deployed.

artificial intelligence, international conference, robot

2508.12075

Country:

North America > United States (0.14)
Europe > Sweden (0.14)
Asia > China (0.14)

Genre: Research Report (0.65)

Industry:

Law (1.00)
Government > Immigration & Customs (0.56)

Technology: Information Technology > Artificial Intelligence > Robots > Robots in the Home (0.59)

Zhou, Quan, Marecek, Jakub, Shorten, Robert

Learning Time-Varying Convexifications of Multiple Fairness Measures

Artificial intelligence has gained widespread popularity and adoption across diverse industries due to its ability to automate decision-making processes. In numerous contexts where artificial intelligence permeates various aspects of our lives, from business operations to societal dynamics and policy formulation, ensuring fairness is of the utmost importance to meeting environmental, social, and governance standards. For nearly any problem in the field of artificial intelligence, however, there can exist multiple measures of individual fairness as well as multiple measures of subgroup fairness. Often, subgroup fairness involves multiple protected attributes (e.g., race, sex), creating numerous combinations of subgroups and corresponding subgroup fairness measures, all of which deserve consideration. Hence, it becomes essential to take into account the trade-offs among optimising for multiple fairness measures.

artificial intelligence, machine learning, vertex

2508.14311

Country: North America > United States (0.93)

Genre: Research Report (0.50)

Industry:

Government (1.00)
Law > Statutes (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)