automation bias


This medical startup uses LLMs to run appointments and make diagnoses

MIT Technology Review

"Our focus is really on what we can do to pull the doctor out of the visit," says Akido's CTO. Imagine this: You've been feeling unwell, so you call up your doctor's office to make an appointment. At the appointment, you aren't rushed through describing your health concerns; instead, you have a full half hour to share your symptoms and worries and the exhaustive details of your health history with someone who listens attentively and asks thoughtful follow-up questions. You leave with a diagnosis, a treatment plan, and the sense that, for once, you've been able to discuss your health with the care that it merits. AI companies have stopped warning you that their chatbots aren't doctors Once cautious, OpenAI, Grok, and others will now dive into giving unverified medical advice with virtually no disclaimers. You might not have spoken to a doctor, or other licensed medical practitioner, at all.


Survey for Categorising Explainable AI Studies Using Data Analysis Task Frameworks

Ziadeh, Hamzah, Knoche, Hendrik

arXiv.org Artificial Intelligence

Research into explainable artificial intelligence (XAI) for data analysis tasks suffers from a large number of contradictions and a lack of concrete design recommendations, stemming from gaps in understanding the tasks that require AI assistance. In this paper, we drew on multiple fields such as visual analytics, cognition, and dashboard design to propose a method for categorising and comparing XAI studies under three dimensions: what, why, and who. We identified the main problems as: inadequate descriptions of tasks, context-free studies, and insufficient testing with target users. We propose that studies should specifically report on their users' domain, AI, and data analysis expertise to illustrate the generalisability of their findings. We also propose study guidelines for designing and reporting XAI tasks to improve the XAI community's ability to parse the rapidly growing field. We hope that our contribution can help researchers and designers better identify which studies are most relevant to their work, what gaps exist in the research, and how to handle contradictory results regarding XAI design.


User Friendly and Adaptable Discriminative AI: Using the Lessons from the Success of LLMs and Image Generation Models

Nguyen, Son The, Tulabandhula, Theja, Watson-Manheim, Mary Beth

arXiv.org Artificial Intelligence

Discriminative methods focus on modeling the conditional probability of outcome(s) given a context (such as a feature vector). In contrast, generative methods focus on modeling the joint distribution of data. Discriminative models have historically found success in classification and regression tasks in various domains (e.g., finance, healthcare, and automotive). On the other hand, newer generative models, such as Large Language Models (LLMs) and diffusion models, have succeeded in open-ended tasks that require versatility and creativity in addition to traditional prediction tasks. We hypothesize that the value of these new generative models is enhanced because they are user-friendly and highly adaptable, making it easier for non-experts to interact with them and produce valuable results with minimal effort. However, this is not the case with current discriminative models. In this work, we explore ways to make discriminative models more user-friendly and adaptable, which we hypothesize will increase their adoption in more applications and bring them on par with the success levels seen with generative AI tools.
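To make the contrast concrete, here is a minimal scikit-learn sketch (ours, not the paper's): a discriminative model fits the conditional probability p(y|x) directly, while a generative one fits class-conditional densities p(x|y) and a prior p(y), recovering p(y|x) via Bayes' rule. The dataset and hyperparameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression  # discriminative: fits p(y|x) directly
from sklearn.naive_bayes import GaussianNB           # generative: fits p(x|y) and p(y)

# Illustrative synthetic data; nothing here comes from the paper.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

disc = LogisticRegression(max_iter=1000).fit(X, y)  # learns the decision boundary
gen = GaussianNB().fit(X, y)                        # learns class-conditional densities

# Both expose the same conditional-probability interface at prediction time...
print(disc.predict_proba(X[:1]))
print(gen.predict_proba(X[:1]))
# ...but only the generative model also carries a model of the inputs
# (gen.theta_ and gen.var_ hold per-class feature means and variances).
```

That extra model of the inputs is the property that lets large generative models go beyond prediction into open-ended synthesis.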


The Rise of the AI Co-Pilot: Lessons for Design from Aviation and Beyond

Sellen, Abigail, Horvitz, Eric

arXiv.org Artificial Intelligence

The fast pace of advances in AI promises to revolutionize various aspects of knowledge work, extending its influence to daily life and professional fields alike. We advocate for a paradigm where AI is seen as a collaborative co-pilot, working under human guidance rather than as a mere tool. Drawing from relevant research and literature in the disciplines of Human-Computer Interaction and Human Factors Engineering, we highlight the criticality of maintaining human oversight in AI interactions. Reflecting on lessons from aviation, we address the dangers of over-relying on automation, such as diminished human vigilance and skill erosion. Our paper proposes a design approach that emphasizes active human engagement, control, and skill enhancement in the AI partnership, aiming to foster a harmonious, effective, and empowering human-AI relationship. We particularly call out the critical need to design AI interaction capabilities and software applications to enable and celebrate the primacy of human agency. This calls for designs for human-AI partnership that cede ultimate control and responsibility to the human user as pilot, with the AI co-pilot acting in a well-defined supporting role.


Dimensionality Reduction for Improving Out-of-Distribution Detection in Medical Image Segmentation

Woodland, McKell, Patel, Nihil, Taie, Mais Al, Yung, Joshua P., Netherton, Tucker J., Patel, Ankit B., Brock, Kristy K.

arXiv.org Artificial Intelligence

Clinically deployed segmentation models are known to fail on data outside of their training distribution. As these models perform well in most cases, it is imperative to detect out-of-distribution (OOD) images at inference time to protect against automation bias. This work applies the Mahalanobis distance post hoc to the bottleneck features of a Swin UNETR model that segments the liver on T1-weighted magnetic resonance imaging. By reducing the dimensions of the bottleneck features with principal component analysis, OOD images were detected with high performance and minimal computational load.
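As a rough sketch of that recipe (assuming the Swin UNETR bottleneck features have already been extracted elsewhere; all names and the component count are illustrative, not the paper's code):

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_ood_scorer(train_feats, n_components=32):
    """Fit PCA and a Gaussian on (N, D) in-distribution bottleneck features."""
    pca = PCA(n_components=n_components).fit(train_feats)
    z = pca.transform(train_feats)
    mean = z.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(z, rowvar=False))
    return pca, mean, cov_inv

def mahalanobis_score(feat, pca, mean, cov_inv):
    """Higher score -> further from the training distribution."""
    z = pca.transform(feat.reshape(1, -1)) - mean
    return float(np.sqrt((z @ cov_inv @ z.T).item()))

# An image would be flagged as OOD when its score exceeds a threshold
# chosen on held-out in-distribution data (e.g., a high percentile).
```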


The Alienness of AI Is a Bigger Problem Than Its Imperfection

#artificialintelligence

Today I bring you a fresh perspective on a topic I've written about a lot on TAB: AI imperfection. But, instead of enumerating the ways in which AI systems fail, as I typically do, I'm going to change my point of view to give you a new--and rather convincing--argument that I haven't seen written anywhere else. Let's start from the beginning. A few days ago, before ChatGPT was a thing, I was scrolling Twitter and saw this picture (try to recognize what you're looking at): It took me a whole minute to realize it's just a little doggy. Then it struck me: humans are nowhere near perfect.


Language models that can search the web hold promise -- but also raise concerns

#artificialintelligence

Language models -- AI systems that can be prompted to write essays and emails, answer questions, and more -- remain flawed in many ways. Because they "learn" to write from examples on the web, including problematic social media posts, they're prone to generating misinformation, conspiracy theories, and racist, sexist, or otherwise toxic language. Another major limitation of many of today's language models is that they're "stuck in time," in a sense. Because they're trained once on a large collection of text from the web, their knowledge of the world -- which they gain from that collection -- can quickly become outdated depending on when they were deployed.


Intelligent Decision Assistance Versus Automated Decision-Making: Enhancing Knowledge Work Through Explainable Artificial Intelligence

Schemmer, Max, Kühl, Niklas, Satzger, Gerhard

arXiv.org Artificial Intelligence

While recent advances in AI-based automated decision-making have shown many benefits for businesses and society, they also come at a cost. It has long been known that a high level of automation of decisions can lead to various drawbacks, such as automation bias and deskilling. In particular, the deskilling of knowledge workers is a major issue, as they are the same people who should also train, challenge, and evolve AI. To address this issue, we conceptualize a new class of decision support systems (DSS), namely Intelligent Decision Assistance (IDA), based on a literature review of two different research streams -- DSS and automation. IDA supports knowledge workers without influencing them through automated decision-making. Specifically, we propose to use techniques of Explainable AI (XAI) while withholding concrete AI recommendations. To test this conceptualization, we develop hypotheses on the impacts of IDA and provide initial evidence for their validity based on empirical studies in the literature.
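As a toy illustration of the IDA idea (our sketch; the paper does not prescribe an implementation), a scikit-learn linear model stands in for the AI: the assistant surfaces the evidence weighing on a case while deliberately withholding the model's own recommendation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in model and dataset; both are illustrative choices.
data = load_breast_cancer()
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(data.data, data.target)

def assist(x):
    """Show the three features weighing most heavily on this case,
    while deliberately NOT returning pipe.predict(x) -- the IDA idea:
    explanation without an automated recommendation."""
    z = pipe[:-1].transform(x.reshape(1, -1))[0]   # standardized inputs
    contrib = pipe[-1].coef_[0] * z                # per-feature contributions
    top = sorted(zip(data.feature_names, contrib), key=lambda t: -abs(t[1]))
    return top[:3]

print(assist(data.data[0]))
```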


Decision-Makers' Processing of AI Algorithmic Advice: Automation Bias versus Selective Adherence

Alon-Barkat, Saar, Busuioc, Madalina

arXiv.org Artificial Intelligence

Artificial intelligence algorithms are increasingly adopted as decisional aides by public organisations, with the promise of overcoming the biases of human decision-makers. At the same time, the use of algorithms may introduce new biases in the human-algorithm interaction. A key concern emerging from psychology studies regards human overreliance on algorithmic advice even in the face of warning signals and contradictory information from other sources (automation bias). A second concern regards decision-makers' inclination to selectively adopt algorithmic advice when it matches their pre-existing beliefs and stereotypes (selective adherence). To date, we lack rigorous empirical evidence about the prevalence of these biases in a public sector context. We assess these via two pre-registered experimental studies (N=1,509), simulating the use of algorithmic advice in decisions pertaining to the employment of school teachers in the Netherlands. In study 1, we test automation bias by exploring participants' adherence to a prediction of a teacher's performance that contradicts additional evidence, while comparing between two types of predictions: algorithmic vs. human-expert. We do not find evidence for automation bias. In study 2, we replicate these findings, and we also test selective adherence by manipulating the teacher's ethnic background. We find a propensity for adherence when the advice predicts low performance for a teacher of a negatively stereotyped ethnic minority, with no significant differences between algorithmic and human advice. Overall, our findings of selective, biased adherence belie the promise of neutrality that has propelled algorithm use in the public sector.


Confidence, uncertainty, and trust in AI affect how humans make decisions

#artificialintelligence

In 2019, as the Department of Defense considered adopting AI ethics principles, the Defense Innovation Unit held a series of meetings across the U.S. to gather opinions from experts and the public. At one such meeting in Silicon Valley, Stanford University professor Herb Lin argued that he was concerned about people trusting AI too easily and said any application of AI should include a confidence score indicating the algorithm's degree of certainty. "AI systems should not only be the best possible. Sometimes they should say, 'I have no idea what I'm doing here, don't trust me.' That's going to be really important," he said. The concern Lin raised is an important one: People can be manipulated by artificial intelligence, with cute robots a classic example of the human tendency to trust machines.
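Lin's suggestion is essentially selective prediction: expose a confidence score and abstain below a threshold. A minimal sketch, assuming a classifier that outputs class probabilities (the threshold value is illustrative):

```python
import numpy as np

def predict_or_abstain(probs, threshold=0.8):
    """probs: (N, C) class probabilities from any classifier.
    Return the predicted class, or -1 ("I have no idea what I'm
    doing here") when top confidence falls below the threshold."""
    conf = probs.max(axis=1)
    preds = probs.argmax(axis=1)
    return np.where(conf >= threshold, preds, -1)

probs = np.array([[0.95, 0.05], [0.55, 0.45]])
print(predict_or_abstain(probs))  # [ 0 -1]: the second case is deferred to a human
```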