Studying the Effects of Robot Intervention on School Shooters in Virtual Reality

arXiv.org Artificial Intelligence

We advance the understanding of robotic intervention in high-risk scenarios by examining its potential to distract and impede a school shooter. To evaluate this concept, we conducted a virtual reality study with 150 university participants role-playing as a school shooter. Within the simulation, an autonomous robot predicted the shooter's movements and positioned itself strategically to interfere and distract. We manipulated two factors: the strategy the robot used to approach the shooter, either moving directly in front of the shooter (aggressive) or maintaining distance (passive), and the distraction method, ranging from no additional cues (low), to siren and lights (medium), to siren, lights, and smoke to impair visibility (high). An aggressive, high-distraction robot reduced the number of victims by 46.6% relative to a no-robot control. This outcome underscores both the potential of robotic intervention to enhance safety and the pressing ethical questions surrounding its use in school environments.


The Surprising Soupability of Documents in State Space Models

arXiv.org Artificial Intelligence

We investigate whether hidden states from Structured State Space Models (SSMs) can be merged post-hoc to support downstream reasoning. Inspired by model souping, we propose a strategy where documents are encoded independently and their representations are pooled -- via simple operations like averaging -- into a single context state. This approach, which we call document souping, enables modular encoding and reuse without reprocessing the full input for each query. We finetune Mamba2 models to produce soupable representations and find that they support multi-hop QA, sparse retrieval, and long-document reasoning with strong accuracy. On HotpotQA, souping ten independently encoded documents nearly matches the performance of a cross-encoder trained on the same inputs.
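The pooling step described above can be illustrated with a minimal sketch. It assumes each document has already been encoded into a fixed-size state vector and shows only the averaging; the function name `soup_states` and the toy vectors are hypothetical, and this is not the authors' implementation or a Mamba2 API.

```python
from typing import List, Sequence


def soup_states(states: Sequence[Sequence[float]]) -> List[float]:
    """Average equally sized per-document state vectors element-wise.

    Hypothetical helper: stands in for pooling independently encoded
    document states into a single context state.
    """
    if not states:
        raise ValueError("need at least one document state")
    dim = len(states[0])
    if any(len(s) != dim for s in states):
        raise ValueError("all document states must have the same size")
    n = len(states)
    return [sum(s[i] for s in states) / n for i in range(dim)]


# Toy stand-ins for two independently encoded documents (made-up numbers).
doc_states = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
souped = soup_states(doc_states)
```

Because averaging does not depend on document order, the same cached per-document states could in principle be recombined for different queries without re-encoding the full input.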


Studying Up Public Sector AI: How Networks of Power Relations Shape Agency Decisions Around AI Design and Use

arXiv.org Artificial Intelligence

As public sector agencies rapidly introduce new AI tools in high-stakes domains like social services, it becomes critical to understand how decisions to adopt these tools are made in practice. We borrow from the anthropological practice to "study up" those in positions of power, and reorient our study of public sector AI around those who have the power and responsibility to make decisions about the role that AI tools will play in their agency. Through semi-structured interviews and design activities with 16 agency decision-makers, we examine how decisions about AI design and adoption are influenced by their interactions with and assumptions about other actors within these agencies (e.g., frontline workers and agency leaders), as well as those above (legal systems and contracted companies), and below (impacted communities). By centering these networks of power relations, our findings shed light on how infrastructural, legal, and social factors create barriers and disincentives to the involvement of a broader range of stakeholders in decisions about AI design and adoption. Agency decision-makers desired more practical support for stakeholder involvement around public sector AI to help overcome the knowledge and power differentials they perceived between them and other stakeholders (e.g., frontline workers and impacted community members). Building on these findings, we discuss implications for future research and policy around actualizing participatory AI approaches in public sector contexts.


Studying the Effects of Sex-related Differences on Brain Age Prediction using brain MR Imaging

arXiv.org Artificial Intelligence

While utilizing machine learning models, one of the most crucial aspects is how bias and fairness affect model outcomes for diverse demographics. This becomes especially relevant in the context of machine learning for medical imaging applications as these models are increasingly being used for diagnosis and treatment planning. In this paper, we study biases related to sex when developing a machine learning model based on brain magnetic resonance images (MRI). We investigate the effects of sex by performing brain age prediction considering different experimental designs: models trained using only female subjects, only male subjects, and a balanced dataset. We also perform evaluation on multiple MRI datasets (Calgary-Campinas (CC359) and CamCAN) to assess the generalization capability of the proposed models. We found disparities in the performance of brain age prediction models when trained on distinct sex subgroups and datasets, in both final predictions and decision making (assessed using interpretability models). Our results demonstrated variations in model generalizability across sex-specific subgroups, suggesting potential biases in models trained on unbalanced datasets. This underlines the critical role of careful experimental design in generating fair and reliable outcomes.


Studying the impacts of pre-training using ChatGPT-generated text on downstream tasks

arXiv.org Artificial Intelligence

In recent times, significant advancements have been witnessed in the field of language models, particularly with the emergence of Large Language Models (LLMs) that are trained on vast amounts of data extracted from internet archives. These LLMs, such as ChatGPT, have become widely accessible, allowing users to generate text for various purposes including articles, essays, jokes, and poetry. Given that LLMs are trained on a diverse range of text sources, encompassing platforms like Reddit and Twitter, it is foreseeable that future training datasets will also incorporate text generated by previous iterations of the models themselves. In light of this development, our research aims to investigate the influence of artificial text in the pre-training phase of language models. Specifically, we conducted a comparative analysis between a language model, RoBERTa, pre-trained using CNN/DailyMail news articles, and ChatGPT, which employed the same articles for its training, and evaluated their performance on three downstream tasks as well as their potential gender bias, using sentiment analysis as a metric. Through a series of experiments, we demonstrate that the utilization of artificial text during pre-training does not have a significant impact on either the performance of the models in downstream tasks or their gender bias. In conclusion, our findings suggest that the inclusion of text generated by LLMs in their own pre-training process does not yield substantial effects on the subsequent performance of the models in downstream tasks or their potential gender bias.


Studying the explanations for the automated prediction of bug and non-bug issues using LIME and SHAP

arXiv.org Artificial Intelligence

Context: The identification of bugs within the reported issues in an issue tracker is crucial for the triage of issues. Machine learning models have shown promising results regarding the performance of automated issue type prediction. However, we have only limited knowledge, beyond our assumptions, of how such models identify bugs. LIME and SHAP are popular techniques to explain the predictions of classifiers. Objective: We want to understand if machine learning models provide explanations for the classification that are reasonable to us as humans and align with our assumptions of what the models should learn. We also want to know if the prediction quality is correlated with the quality of explanations. Method: We conduct a study where we rate LIME and SHAP explanations based on their quality of explaining the outcome of an issue type prediction model. For this, we rate the quality of the explanations themselves, i.e., if they align with our expectations and if they help us to understand the underlying machine learning model.


Studying the Practices of Deploying Machine Learning Projects on Docker

#artificialintelligence

Docker is a containerization service that allows for convenient deployment of websites, databases, applications' APIs, and machine learning (ML) models with a few lines of code. Studies have recently explored the use of Docker for deploying general software projects with no specific focus on how Docker is used to deploy ML-based projects. In this study, we conducted an exploratory study to understand how Docker is being used to deploy ML-based projects. As the initial step, we examined the categories of ML-based projects that use Docker. We then examined why and how these projects use Docker, and the characteristics of the resulting Docker images. Our results indicate that six categories of ML-based projects use Docker for deployment, including ML Applications, MLOps/AIOps, Toolkits, DL Frameworks, Models, and Documentation. We derived a taxonomy of 21 major categories representing the purposes of using Docker, including those specific to models such as model management tasks (e.g., testing, training). We then showed that ML engineers use Docker images mostly to help with platform portability, such as transferring the software across operating systems, runtimes such as GPU, and language constraints. However, we also found that more resources may be required to run the Docker images for building ML-based software projects due to the large number of files contained in the image layers with deeply nested directories. We hope to shed light on the emerging practices of deploying ML software projects using containers and highlight aspects that should be improved.


Studying the brain to build AI that processes language as people do

#artificialintelligence

AI has made impressive strides in recent years, but it's still far from learning language as efficiently as humans. For instance, children learn that "orange" can refer to both a fruit and a color from a few examples, but modern AI systems can't do this nearly as efficiently as people. This has led many researchers to wonder: Can studying the human brain help to build AI systems that can learn and reason like people do? Today, Meta AI is announcing a long-term research initiative to better understand how the human brain processes language. In collaboration with neuroimaging center Neurospin (CEA) and INRIA, we're comparing how AI language models and the brain respond to the same spoken or written sentences.


Mysterious X particle spotted in quark–gluon plasma at CERN – Physics World

#artificialintelligence

A mysterious "X" particle, comprising four quarks and first seen in 2003, has been found in the quark–gluon plasma produced in heavy ion collisions at the Large Hadron Collider (LHC). The observation was made by physicists working on CERN's Compact Muon Solenoid (CMS) experiment and, if confirmed, it could help researchers understand the structure of the exotic particle. Further studies of the particle could help explain how familiar hadrons such as protons and neutrons formed from the quark–gluon plasma believed to have been present in the early universe. The exotic X particle – formally known as X(3872) because of its 3872 MeV mass – was first spotted by the Belle experiment in Japan. It has subsequently been studied by other experiments at electron–positron colliders and hadron colliders, but its nature is not fully understood.


Studying the Big Bang With Artificial Intelligence - Neuroscience News

#artificialintelligence

Summary: A new machine-learning algorithm is helping researchers uncover the secrets of the quark-gluon plasma. It could hardly be more complicated: tiny particles whir around wildly with extremely high energy, countless interactions occur in the tangled mess of quantum particles, and this results in a state of matter known as "quark-gluon plasma". Immediately after the Big Bang, the entire universe was in this state; today it is produced by high-energy atomic nucleus collisions, for example at CERN. Such processes can only be studied using high-performance computers and highly complex computer simulations whose results are difficult to evaluate. Therefore, using artificial intelligence or machine learning for this purpose seems like an obvious idea.