Trustworthy AI
Trustworthy AI Must Account for Interactions
Trustworthy AI encompasses many aspirational aspects for aligning AI systems with human values, including fairness, privacy, robustness, explainability, and uncertainty quantification. Ultimately the goal of Trustworthy AI research is to achieve all aspects simultaneously. However, efforts to enhance one aspect often introduce unintended trade-offs that negatively impact others. In this position paper, we review notable approaches to these five aspects and systematically consider every pair, detailing the negative interactions that can arise. For example, applying differential privacy to model training can amplify biases, undermining fairness. Drawing on these findings, we take the position that current research practices of improving one or two aspects in isolation are insufficient. Instead, research on Trustworthy AI must account for interactions between aspects and adopt a holistic view across all relevant axes at once. To illustrate our perspective, we provide guidance on how practitioners can work towards integrated trust, examples of how interactions affect the financial industry, and alternative views.
- Information Technology > Security & Privacy (1.00)
- Banking & Finance (1.00)
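The differential-privacy example in the abstract above can be made concrete: Laplace noise of a fixed scale distorts statistics for a small subgroup far more, in relative terms, than for a large one, which is one mechanism by which privacy protections can amplify unfairness. A minimal stdlib sketch, with hypothetical group sizes and privacy budget:

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) via inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def avg_relative_error(true_count, epsilon, trials, rng):
    """Mean relative error of an epsilon-DP noisy count (sensitivity 1)."""
    scale = 1.0 / epsilon
    total = 0.0
    for _ in range(trials):
        noisy = true_count + laplace_noise(scale, rng)
        total += abs(noisy - true_count) / true_count
    return total / trials

rng = random.Random(0)
# Hypothetical subgroup sizes: majority 10,000 records, minority 50.
err_major = avg_relative_error(10_000, epsilon=0.5, trials=2_000, rng=rng)
err_minor = avg_relative_error(50, epsilon=0.5, trials=2_000, rng=rng)
print(f"majority rel. error {err_major:.4f}, minority rel. error {err_minor:.4f}")
```

The same noise that barely perturbs the majority statistic swamps the minority one, so any downstream decision calibrated on the noisy counts degrades disproportionately for the small group.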
Building Trustworthy AI by Addressing its 16+2 Desiderata with Goal-Directed Commonsense Reasoning
Tudor, Alexis R., Zeng, Yankai, Wang, Huaduo, Arias, Joaquin, Gupta, Gopal
Current advances in AI and its applicability have highlighted the need to ensure its trustworthiness for legal, ethical, and even commercial reasons. Sub-symbolic machine learning algorithms, such as LLMs, simulate reasoning but hallucinate, and their decisions can be neither explained nor audited (both crucial for trustworthiness). On the other hand, rule-based reasoners, such as Cyc, are able to provide the chain of reasoning steps but are complex and use a large number of reasoners. We propose a middle ground using s(CASP), a goal-directed constraint-based answer set programming reasoner that employs a small number of mechanisms to emulate reliable and explainable human-style commonsense reasoning. In this paper, we explain how s(CASP) supports the 16 desiderata for trustworthy AI introduced by Doug Lenat and Gary Marcus (2023), plus two additional ones: inconsistency detection and the assumption of alternative worlds. To illustrate the feasibility and synergies of s(CASP), we present a range of diverse applications, including a conversational chatbot and a virtually embodied reasoner.
- Europe > Sweden (0.04)
- North America > United States > Texas > Dallas County > Dallas (0.04)
- Europe > United Kingdom > North Sea > Central North Sea (0.04)
- (4 more...)
- Law (1.00)
- Health & Medicine (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
- (3 more...)
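s(CASP) itself is a full goal-directed answer set programming system with constraints, negation, and dual rules; none of that is reproduced here. As a minimal illustration of the goal-directed, explanation-producing style the abstract describes, here is a toy backward chainer over hypothetical Horn rules that records its justification chain:

```python
# Hypothetical rule base: each goal maps to lists of subgoals that prove it.
RULES = {
    "eligible_for_loan(alice)": [["employed(alice)", "good_credit(alice)"]],
    "good_credit(alice)": [["no_defaults(alice)"]],
}
FACTS = {"employed(alice)", "no_defaults(alice)"}

def prove(goal, trace):
    """Goal-directed backward chaining; records the justification chain."""
    if goal in FACTS:
        trace.append(f"fact: {goal}")
        return True
    for body in RULES.get(goal, []):
        if all(prove(sub, trace) for sub in body):
            trace.append(f"rule: {goal} :- {', '.join(body)}")
            return True
    return False

trace = []
assert prove("eligible_for_loan(alice)", trace)
for step in trace:
    print(step)
```

Because the search starts from the query rather than saturating the whole rule base, the trace that falls out is exactly the auditable chain of reasoning steps the abstract highlights.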
Towards the Formalization of a Trustworthy AI for Mining Interpretable Models explOiting Sophisticated Algorithms
Guidotti, Riccardo, Cinquini, Martina, Manerba, Marta Marchiori, Setzu, Mattia, Spinnato, Francesco
Interpretable-by-design models are crucial for fostering trust, accountability, and safe adoption of automated decision-making models in real-world applications. In this paper we lay the formal groundwork for the MIMOSA (Mining Interpretable Models explOiting Sophisticated Algorithms) framework, a comprehensive methodology for generating predictive models that balance interpretability with performance while embedding key ethical properties. We formally define the supervised learning setting across diverse decision-making tasks and data types, including tabular data, time series, images, text, transactions, and trajectories. We characterize three major families of interpretable models: feature-importance-based, rule-based, and instance-based models. For each family, we analyze their interpretability dimensions, reasoning mechanisms, and complexity. Beyond interpretability, we formalize three critical ethical properties, namely causality, fairness, and privacy, providing formal definitions, evaluation metrics, and verification procedures for each. We then examine the inherent trade-offs between these properties and discuss how privacy requirements, fairness constraints, and causal reasoning can be embedded within interpretable pipelines. By evaluating ethical measures during model generation, this framework establishes the theoretical foundations for developing AI systems that are not only accurate and interpretable but also fair, privacy-preserving, and causally aware, i.e., trustworthy.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Europe > United Kingdom > England > Devon > Exeter (0.04)
- Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.92)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
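Of the three model families the abstract characterizes, rule-based models are the easiest to sketch: an ordered rule list classifies by firing the first matching rule, and the fired rule doubles as the explanation. The rules and feature names below are hypothetical:

```python
# Hypothetical ordered rule list: rules are checked top to bottom; the
# first condition that matches determines the outcome and explains it.
RULES = [
    ("income < 20_000", lambda x: x["income"] < 20_000, "deny"),
    ("defaults > 0",    lambda x: x["defaults"] > 0,    "deny"),
    ("default rule",    lambda x: True,                 "approve"),
]

def predict(x):
    for name, cond, outcome in RULES:
        if cond(x):
            return outcome, name  # prediction plus its own explanation
    raise AssertionError("rule list must end with a catch-all default rule")

print(predict({"income": 45_000, "defaults": 0}))
print(predict({"income": 10_000, "defaults": 0}))
```

The interpretability dimensions the paper analyzes (e.g. rule count and condition complexity) are directly visible in such a model, which is what makes the family interpretable by design.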
Optimizing Ethical Risk Reduction for Medical Intelligent Systems with Constraint Programming
Brayé, Clotilde, Bricout, Aurélien, Gotlieb, Arnaud, Lazaar, Nadjib, Vallet, Quentin
Medical Intelligent Systems (MIS) are increasingly integrated into healthcare workflows, offering significant benefits but also raising critical safety and ethical concerns. According to the European Union AI Act, most MIS will be classified as high-risk systems, requiring a formal risk management process to ensure compliance with the ethical requirements of trustworthy AI. In this context, we focus on risk reduction optimization problems, which aim to reduce risks with ethical considerations by finding the best balanced assignment of risk assessment values according to their coverage of trustworthy AI ethical requirements. We formalize this problem as a constrained optimization task and investigate three resolution paradigms: Mixed Integer Programming (MIP), Satisfiability (SAT), and Constraint Programming (CP). Our contributions include the mathematical formulation of this optimization problem, its modeling in the MiniZinc constraint modeling language, and a comparative experimental study that analyzes the performance, expressiveness, and scalability of each solving approach. From the identified limits of the methodology, we draw perspectives on integrating the MiniZinc model into a complete trustworthy AI ethical risk management process for MIS.
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.34)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Government > Regional Government > Europe Government (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
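The risk-reduction problem the abstract formalizes can be sketched on a toy instance: pick a mitigation level per risk to maximize coverage of ethical requirements under a budget constraint. The numbers below are invented, and exhaustive search stands in for the MIP/SAT/CP solvers the paper actually compares:

```python
from itertools import product

# Hypothetical instance: three risks, each mitigated at level 0, 1, or 2.
# COST[r][level] is the mitigation cost; COVERAGE[r][level] is how many
# ethical requirements that choice covers.
COST     = [[0, 2, 5], [0, 1, 4], [0, 3, 6]]
COVERAGE = [[0, 3, 5], [0, 2, 6], [0, 4, 7]]
BUDGET = 8

best_value, best_assignment = -1, None
for levels in product(range(3), repeat=3):  # exhaustive search over assignments
    cost = sum(COST[r][lvl] for r, lvl in enumerate(levels))
    if cost > BUDGET:
        continue                            # constraint: stay within budget
    value = sum(COVERAGE[r][lvl] for r, lvl in enumerate(levels))
    if value > best_value:
        best_value, best_assignment = value, levels

print(best_assignment, best_value)
```

Brute force is only viable at toy scale; the paper's point is precisely that MIP, SAT, and CP encodings of this same objective-plus-constraint structure scale differently on realistic instances.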
Bridging Ethical Principles and Algorithmic Methods: An Alternative Approach for Assessing Trustworthiness in AI Systems
Papademas, Michael, Ziouvelou, Xenia, Troumpoukis, Antonis, Karkaletsis, Vangelis
Artificial Intelligence (AI) technology epitomizes the complex challenges posed by human-made artifacts, particularly those widely integrated into society and exerting significant influence, highlighting both potential benefits and negative consequences. While other technologies may also pose substantial risks, AI's pervasive reach makes its societal effects especially profound. The complexity of AI systems, coupled with their remarkable capabilities, can lead to a reliance on technologies that operate beyond direct human oversight or understanding. To mitigate the risks that arise, several theoretical tools and guidelines have been developed, alongside efforts to create technological tools aimed at safeguarding Trustworthy AI. The guidelines take a more holistic view of the issue but fail to provide techniques for quantifying trustworthiness. Conversely, while technological tools are better at achieving such quantification, they lack a holistic perspective, focusing instead on specific aspects of Trustworthy AI. This paper aims to introduce an assessment method that combines the ethical components of Trustworthy AI with the algorithmic processes of PageRank and TrustRank. The goal is to establish an assessment framework that minimizes the subjectivity inherent in the self-assessment techniques prevalent in the field by introducing algorithmic criteria. The application of our approach indicates that a holistic assessment of an AI system's trustworthiness can be achieved by providing quantitative insights while considering the theoretical content of relevant guidelines.
- Europe > Middle East > Malta > Northern Region > Western District > Attard (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Greece (0.04)
- (3 more...)
- Research Report (1.00)
- Overview (1.00)
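PageRank, one of the two algorithms the assessment method builds on, reduces to a short power iteration; TrustRank differs mainly in biasing the teleportation term toward trusted seed nodes. The graph below is a hypothetical trust graph, not the paper's construction:

```python
# Hypothetical trust graph: an edge points from an assessed component to the
# components whose trustworthiness it vouches for.
EDGES = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
NODES = sorted(EDGES)
DAMPING = 0.85

rank = {n: 1.0 / len(NODES) for n in NODES}
for _ in range(100):  # power iteration until convergence
    new = {n: (1 - DAMPING) / len(NODES) for n in NODES}
    for src, outs in EDGES.items():
        share = DAMPING * rank[src] / len(outs)  # mass split over out-edges
        for dst in outs:
            new[dst] += share
    rank = new

print({n: round(r, 3) for n, r in rank.items()})
```

Here "C", the most-vouched-for node, accumulates the highest score, which is the quantitative signal the paper's framework layers on top of the qualitative guideline content.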
Uncovering AI Governance Themes in EU Policies using BERTopic and Thematic Analysis
Golpayegani, Delaram, Lasek-Markey, Marta, Younus, Arjumand, Kerr, Aphra, Lewis, Dave
The upsurge of policies and guidelines that aim to ensure Artificial Intelligence (AI) systems are safe and trustworthy has led to a fragmented landscape of AI governance. The European Union (EU) is a key actor in the development of such policies and guidelines. Its High-Level Expert Group (HLEG) issued an influential set of guidelines for trustworthy AI, followed in 2024 by the adoption of the EU AI Act. While the EU policies and guidelines are expected to be aligned, they may differ in their scope, areas of emphasis, degrees of normativity, and priorities in relation to AI. To gain a broad understanding of AI governance from the EU perspective, we leverage qualitative thematic analysis approaches to uncover prevalent themes in key EU documents, including the AI Act and the HLEG Ethics Guidelines. We further employ quantitative topic modelling approaches, specifically through the use of the BERTopic model, to enhance the results and increase the document sample to include EU AI policy documents published post-2018. We present a novel perspective on EU policies, tracking the evolution of its approach to addressing AI governance.
- Asia > China (0.15)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.14)
- North America > United States (0.14)
- (2 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Regional Government > Europe Government (0.91)
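BERTopic itself combines transformer embeddings, clustering, and class-based TF-IDF; as a rough stdlib illustration of the underlying idea of surfacing a theme's characteristic terms, here is a plain term-frequency sketch over invented snippets standing in for EU policy documents:

```python
from collections import Counter

# Hypothetical snippets standing in for EU AI policy documents.
DOCS = {
    "ai_act": "high risk ai systems shall meet transparency and oversight requirements",
    "hleg":   "trustworthy ai requires human oversight transparency and accountability",
}
STOPWORDS = {"and", "shall", "the", "meet", "requires", "ai"}

def top_terms(text, k=3):
    """Most frequent non-stopword terms, a crude stand-in for c-TF-IDF."""
    counts = Counter(w for w in text.split() if w not in STOPWORDS)
    return [w for w, _ in counts.most_common(k)]

for name, text in DOCS.items():
    print(name, top_terms(text))
```

Real topic modelling weights terms by how distinctive they are across the corpus rather than by raw frequency, but the output shape is the same: a ranked term list per theme, which is what the thematic analysis is then compared against.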
AI and Trust
This is a discussion about artificial intelligence (AI), trust, power, and integrity. There are two kinds of trust--interpersonal and social--and we regularly confuse them. What matters here is social trust, which is about reliability and predictability in society. Our confusion will increase with AI, and the corporations controlling AI will use that confusion to take advantage of us. This is a security problem. This is a confidentiality problem. But it is much more an integrity problem. And that integrity is going to be the primary security challenge for AI systems of the future. It's also a regulatory problem, and it is government's role to enable social trust, which means incentivizing trustworthy AI. Okay, so let's break that down. Trust is a complicated concept, and the word is overloaded with many different meanings. When we say we trust a friend, it is less about their specific actions and more about them as a person.
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
Healthy Distrust in AI systems
Paaßen, Benjamin, Alpsancar, Suzana, Matzner, Tobias, Scharlau, Ingrid
Under the slogan of trustworthy AI, much of contemporary AI research is focused on designing AI systems and usage practices that inspire human trust and, thus, enhance adoption of AI systems. However, a person affected by an AI system may not be convinced by AI system design alone -- neither should they, if the AI system is embedded in a social context that gives good reason to believe that it is used in tension with a person's interest. In such cases, distrust in the system may be justified and necessary to build meaningful trust in the first place. We propose the term "healthy distrust" to describe such a justified, careful stance towards certain AI usage practices. We investigate prior notions of trust and distrust in computer science, sociology, history, psychology, and philosophy, outline a remaining gap that healthy distrust might fill and conceptualize healthy distrust as a crucial part for AI usage that respects human autonomy.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Switzerland > Basel-City > Basel (0.04)
- North America > United States > North Carolina > Durham County > Durham (0.04)
- (13 more...)
- Law (1.00)
- Health & Medicine (1.00)
- Energy (0.68)
- Information Technology > Security & Privacy (0.46)
Getting Ready for the EU AI Act in Healthcare. A call for Sustainable AI Development and Deployment
Brodersen, John Brandt, Caggiano, Ilaria Amelia, Kringen, Pedro, Madai, Vince Istvan, Osika, Walter, Sartor, Giovanni, Svensson, Ellen, Westerlund, Magnus, Zicari, Roberto V.
Assessments of trustworthiness have become a cornerstone of responsible AI development. Especially in high-stakes fields like healthcare, aligning technical, evidence-based, and ethical practices with forthcoming legal requirements is increasingly urgent. We argue that developers and deployers of AI systems for the medical domain should be proactive and take steps to progressively ensure that such systems, both those currently in use and those being developed or planned, respect the requirements of the AI Act, which came into force in August 2024. This is necessary if full and effective compliance is to be ensured when the most relevant provisions of the Act become effective (August 2026). The engagement with the AI Act cannot be viewed as a formalistic exercise. Compliance with the AI Act needs to be carried out through a proactive commitment to the ethical principles of trustworthy AI. These principles provide the background for the Act, which mentions them several times and connects them to the protection of public interest. They can be used to interpret and apply the Act's provisions and to identify good practices, increasing the validity and sustainability of AI systems over time.
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
Explainable AI the Latest Advancements and New Trends
Long, Bowen, Liu, Enjie, Qiu, Renxi, Duan, Yanqing
In recent years, Artificial Intelligence technology has excelled in various applications across all domains and fields. However, the various algorithms in neural networks make it difficult to understand the reasons behind decisions. For this reason, trustworthy AI techniques have started gaining popularity. The concept of trustworthiness is cross-disciplinary; it must meet societal standards and principles, and technology is used to fulfill these requirements. In this paper, we first surveyed developments from various countries and regions on the ethical elements that make AI algorithms trustworthy; we then focused our survey on state-of-the-art research into the interpretability of AI. We have conducted an intensive survey of the technologies and techniques used to make AI explainable. Finally, we identified new trends in achieving explainable AI. In particular, we elaborate on the strong link between the explainability of AI and the meta-reasoning of autonomous systems. The concept of meta-reasoning is to 'reason about the reasoning', which coincides with the intention and goal of explainable AI. The integration of the two approaches could pave the way for future interpretable AI systems.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > Canada > Quebec > Montreal (0.04)
- Oceania > New Zealand (0.04)
- (7 more...)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- (2 more...)
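One widely used model-agnostic technique from the explainability literature this survey covers is permutation feature importance: shuffle one feature column and measure how much the model's accuracy drops. A self-contained sketch on synthetic data (the model and dataset are invented for illustration):

```python
import random

# Hypothetical dataset: the label depends only on feature 0.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if x[0] > 0.5 else 0 for x in X]

def model(x):
    # Stand-in "trained" model that thresholds feature 0 and ignores feature 1.
    return 1 if x[0] > 0.5 else 0

def accuracy(X, y):
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(X, y, feature, rng):
    """Accuracy drop after shuffling one feature column across rows."""
    col = [x[feature] for x in X]
    rng.shuffle(col)
    X_perm = [x[:feature] + [v] + x[feature + 1:] for x, v in zip(X, col)]
    return accuracy(X, y) - accuracy(X_perm, y)

imp0 = permutation_importance(X, y, 0, rng)
imp1 = permutation_importance(X, y, 1, rng)
print(f"feature 0 importance: {imp0:.3f}, feature 1 importance: {imp1:.3f}")
```

Shuffling feature 0 destroys the model's only signal, so its accuracy collapses toward chance, while shuffling the ignored feature changes nothing; the per-feature drop is the explanation.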