Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection

Li, Jiatao, Wan, Xiaojun

arXiv.org Artificial Intelligence

The rise of Large Language Models (LLMs) necessitates accurate AI-generated text detection. However, current approaches largely overlook the influence of author characteristics. We investigate how sociolinguistic attributes (gender, CEFR proficiency, academic field, and language environment) impact state-of-the-art AI text detectors. Using the ICNALE corpus of human-authored texts and parallel AI-generated texts from diverse LLMs, we conduct a rigorous evaluation employing multi-factor ANOVA and weighted least squares (WLS). Our results reveal significant biases: CEFR proficiency and language environment consistently affected detector accuracy, while gender and academic field showed detector-dependent effects. These findings highlight the crucial need for socially aware AI text detection to avoid unfairly penalizing specific demographic groups. We offer novel empirical evidence, a robust statistical framework, and actionable insights for developing more equitable and reliable detection systems in real-world, out-of-domain contexts. This work paves the way for future research on bias mitigation, inclusive evaluation benchmarks, and socially responsible LLM detectors.
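As a minimal illustration of the kind of test involved (not the authors' code — the study uses multi-factor ANOVA with WLS, and all numbers below are hypothetical), a one-way ANOVA F-statistic over detector accuracies grouped by CEFR level can be computed in pure Python:

```python
from itertools import chain

def one_way_anova_f(groups):
    """F-statistic for a one-way ANOVA over lists of observations."""
    all_obs = list(chain.from_iterable(groups))
    n, k = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n
    # Between-group sum of squares: how far group means sit from the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical detector accuracies grouped by author CEFR level
acc_by_cefr = [
    [0.71, 0.68, 0.74, 0.70],   # A2
    [0.80, 0.83, 0.79, 0.81],   # B1
    [0.90, 0.88, 0.91, 0.89],   # B2
]
f_stat = one_way_anova_f(acc_by_cefr)  # large F => accuracy differs by group
```

A large F relative to the F-distribution's critical value is exactly the signal of a proficiency-dependent bias the abstract describes.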


Causal Inference Tools for a Better Evaluation of Machine Learning

Soumm, Michaël

arXiv.org Artificial Intelligence

We present a comprehensive framework for applying rigorous statistical techniques from econometrics to analyze and improve machine learning systems. We introduce key statistical methods such as Ordinary Least Squares (OLS) regression, Analysis of Variance (ANOVA), and logistic regression, explaining their theoretical foundations and practical applications in machine learning evaluation. The document serves as a guide for researchers and practitioners, detailing how these techniques can provide deeper insights into model behavior, performance, and fairness. We cover the mathematical principles behind each method, discuss their assumptions and limitations, and provide step-by-step instructions for their implementation. The paper also addresses how to interpret results, emphasizing the importance of statistical significance and effect size. Through illustrative examples, we demonstrate how these tools can reveal subtle patterns and interactions in machine learning models that are not apparent from traditional evaluation metrics. By connecting the fields of econometrics and machine learning, this work aims to equip readers with powerful analytical tools for more rigorous and comprehensive evaluation of AI systems. The framework presented here contributes to developing more robust, interpretable, and fair machine learning technologies.
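As a toy illustration of the simplest tool in that kit, OLS with a single regressor has a closed-form solution; the data below are hypothetical (accuracy versus log-scale training-set size):

```python
def ols_fit(x, y):
    """Ordinary least squares for y = a + b*x via the closed-form solution."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Slope: covariance of x and y divided by variance of x
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx  # intercept passes through the means
    return a, b

# Hypothetical: model accuracy vs. log10 of training-set size
x = [2, 3, 4, 5]              # log10(num examples)
y = [0.60, 0.70, 0.80, 0.90]
a, b = ols_fit(x, y)          # b estimates accuracy gain per decade of data
```

The econometric framing the paper advocates is precisely this: the fitted coefficient is an effect-size estimate, not just a curve-fit, and its significance can then be tested.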


Future You: A Conversation with an AI-Generated Future Self Reduces Anxiety, Negative Emotions, and Increases Future Self-Continuity

Pataranutaporn, Pat, Winson, Kavin, Yin, Peggy, Lapapirojn, Auttasak, Ouppaphan, Pichayoot, Lertsutthiwong, Monchai, Maes, Pattie, Hershfield, Hal

arXiv.org Artificial Intelligence

We introduce "Future You," an interactive, brief, single-session, digital chat intervention designed to improve future self-continuity--the degree of connection an individual feels with a temporally distant future self--a characteristic that is positively related to mental health and wellbeing. Our system allows users to chat with a relatable yet AI-powered virtual version of their future selves that is tuned to their future goals and personal qualities. To make the conversation realistic, the system generates a "synthetic memory"--a unique backstory for each user--that creates a throughline between the user's present age (between 18 and 30) and their life at age 60. The "Future You" character also adopts the persona of an age-progressed image of the user's present self. After a brief interaction with the "Future You" character, users reported decreased anxiety and increased future self-continuity. This is the first study successfully demonstrating the use of personalized AI-generated characters to improve users' future self-continuity and wellbeing.


Disentangling Quantum and Classical Contributions in Hybrid Quantum Machine Learning Architectures

Kölle, Michael, Maurer, Jonas, Altmann, Philipp, Sünkel, Leo, Stein, Jonas, Linnhoff-Popien, Claudia

arXiv.org Artificial Intelligence

Quantum computing offers the potential for superior computational capabilities, particularly for data-intensive tasks. However, the current state of quantum hardware puts heavy restrictions on input size. To address this, hybrid transfer learning solutions have been developed, merging pre-trained classical models, capable of handling extensive inputs, with variational quantum circuits. Yet, it remains unclear how much each component -- classical and quantum -- contributes to the model's results. We propose a novel hybrid architecture: instead of utilizing a pre-trained network for compression, we employ an autoencoder to derive a compressed version of the input data. This compressed data is then channeled through the encoder part of the autoencoder to the quantum component. We assess our model's classification capabilities against two state-of-the-art hybrid transfer learning architectures, two purely classical architectures and one quantum architecture. Their accuracy is compared across four datasets: Banknote Authentication, Breast Cancer Wisconsin, MNIST digits, and AudioMNIST. Our research suggests that classical components significantly influence classification in hybrid transfer learning, a contribution often mistakenly ascribed to the quantum element. The performance of our model aligns with that of a variational quantum circuit using amplitude embedding, positioning it as a feasible alternative.


How Do Human Users Teach a Continual Learning Robot in Repeated Interactions?

Ayub, Ali, Mehta, Jainish, De Francesco, Zachary, Holthaus, Patrick, Dautenhahn, Kerstin, Nehaniv, Chrystopher L.

arXiv.org Artificial Intelligence

Continual learning (CL) has emerged as an important avenue of research in recent years, at the intersection of Machine Learning (ML) and Human-Robot Interaction (HRI), to allow robots to continually learn in their environments over long-term interactions with humans. Most research in continual learning, however, has been robot-centered, developing continual learning algorithms that can quickly learn new information on static datasets. In this paper, we take a human-centered approach to continual learning, to understand how humans teach continual learning robots over the long term and whether there are variations in their teaching styles. We conducted an in-person study with 40 participants who interacted with a continual learning robot in 200 sessions. In this between-participant study, we used two different CL models deployed on a Fetch mobile manipulator robot. An extensive qualitative and quantitative analysis of the data collected in the study shows that there is significant variation among the teaching styles of individual users, indicating the need for personalized adaptation to their distinct teaching styles. The results also show that although there is a difference in the teaching styles between expert and non-expert users, the style does not have an effect on the performance of the continual learning robot. Finally, our analysis shows that the constrained experimental setups that have been widely used to test most continual learning techniques are not adequate, as real users interact with and teach continual learning robots in a variety of ways. Our code is available at https://github.com/aliayub7/cl_hri.


A Closer Look at Parameter-Efficient Tuning in Diffusion Models

Xiang, Chendong, Bao, Fan, Li, Chongxuan, Su, Hang, Zhu, Jun

arXiv.org Artificial Intelligence

Large-scale diffusion models like Stable Diffusion are powerful and find various real-world applications, but customizing such models by full fine-tuning is inefficient in both memory and time. Motivated by recent progress in natural language processing, we investigate parameter-efficient tuning in large diffusion models by inserting small learnable modules (termed adapters). In particular, we decompose the design space of adapters into orthogonal factors -- the input position, the output position, and the function form -- and perform Analysis of Variance (ANOVA), a classical statistical approach for analyzing the correlation between discrete variables (design options) and continuous variables (evaluation metrics). Our analysis suggests that the input position of adapters is the critical factor influencing the performance of downstream tasks. We then carefully study the choice of the input position, and find that placing the input position after the cross-attention block leads to the best performance, validated by additional visualization analyses. Finally, we provide a recipe for parameter-efficient tuning in diffusion models, which is comparable if not superior to the fully fine-tuned baseline (e.g., DreamBooth) with only 0.75% extra parameters, across various customized tasks.
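The adapter idea can be sketched in plain Python (an illustrative stand-in with made-up dimensions and random weights, not the paper's implementation — real adapters operate on framework tensors inside the diffusion model, and the paper's finding concerns where the adapter's input is taken from):

```python
import math
import random

def make_adapter(dim, bottleneck, seed=0):
    """A minimal bottleneck adapter: down-project, ReLU, up-project, residual add.
    In practice only these two small matrices are trained; the base model is frozen."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)
    down = [[rng.uniform(-scale, scale) for _ in range(dim)] for _ in range(bottleneck)]
    up = [[rng.uniform(-scale, scale) for _ in range(bottleneck)] for _ in range(dim)]

    def matvec(m, v):
        return [sum(w * x for w, x in zip(row, v)) for row in m]

    def adapter(h):
        z = [max(0.0, u) for u in matvec(down, h)]            # down-projection + ReLU
        return [hi + ui for hi, ui in zip(h, matvec(up, z))]  # up-projection + residual
    return adapter

adapter = make_adapter(dim=8, bottleneck=2)
h = [1.0] * 8          # a hidden state, e.g. the output of a cross-attention block
out = adapter(h)       # same shape as the input, so it slots between existing layers
```

Because the output shape matches the input, the module can be inserted at any of the candidate positions in the design space the abstract describes — which is what makes the position itself a tunable factor for ANOVA.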


Statistics (III) ANOVA in Data Science & Machine Learning

#artificialintelligence

For the last part of the Statistics series, we will cover ANOVA, post-hoc pairwise comparisons, two-way ANOVA, and R-squared. Previously, our study focused on one or two groups of subjects. How can we handle multiple groups with multiple factors? For example, both dose level and gender may impact the effectiveness of a vaccine. How can we determine whether particular combinations of factors produce statistically significant effects?
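Of the topics listed, R-squared is the quickest to demonstrate; a minimal pure-Python version with made-up numbers:

```python
def r_squared(y_true, y_pred):
    """Proportion of the variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.8, 5.1, 7.2, 8.9]   # hypothetical model outputs
r2 = r_squared(y_true, y_pred)  # close to 1.0 here, since predictions track y_true
```

An R-squared near 1 means the model accounts for nearly all the spread in the data; near 0, it does no better than predicting the mean.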


Statistical Tests

#artificialintelligence

Statistics is pretty smooth going until we come across inferential statistics, where a great deal happens at once. The name fits: we use it to make inferences. And with it come the various statistical tests we conduct whenever we formulate a statistical hypothesis. Testing can seem like a boring step in a data science project, but it matters for exactly what the name suggests: it tells you how well a sample lets you understand the whole population. Before jumping directly into the tests, let's cover some of the introductory ideas behind them.


Statistical Tests in Machine Learning

#artificialintelligence

When it comes to statistics in machine learning, a common approach to accepting or rejecting a null hypothesis is to check the p-value and report a result without any real idea of what goes on in the background. Without getting into fancy jargon or mathematical technicalities, this article attempts to sum up the intuition behind statistics using real-life examples, especially for people from a non-statistics background. Why do we need hypothesis testing? Suppose Dunkin' were suddenly at risk of shutting down because Krispy Kreme claims the weight of Dunkin's donuts is less than what Dunkin' advertises. How do we choose sides?
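The donut dispute is a classic two-sample comparison. As a sketch with hypothetical weights (all numbers invented for illustration), Welch's t-statistic — which does not assume equal variances — can be computed by hand; the p-value would then come from the t-distribution:

```python
import math

def welch_t(a, b):
    """Welch's t-statistic for two independent samples with unequal variances."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # sample variance of a
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)  # sample variance of b
    return (ma - mb) / math.sqrt(va / na + vb / nb)

# Hypothetical donut weights in grams from each shop
dunkin = [52.0, 51.5, 52.3, 51.8, 52.1]
krispy = [50.9, 51.2, 50.7, 51.0, 51.3]
t = welch_t(dunkin, krispy)  # a large |t| is evidence the mean weights differ
```

Comparing |t| against the critical value for the chosen significance level is exactly the step where the p-value enters — and understanding this computation is the intuition the p-value otherwise hides.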


What causes the test error? Going beyond bias-variance via ANOVA

#artificialintelligence

Modern machine learning methods are often overparametrized, allowing adaptation to the data at a fine level. This can seem puzzling; in the worst case, such models need not generalize at all. The puzzle has inspired a great deal of work on when overparametrization reduces test error, a phenomenon called "double descent". Recent work has aimed to understand in greater depth why overparametrization helps generalization. This has led to the discovery that variance is a unimodal function of the level of parametrization, and to decomposing the variance into the parts arising from label noise, initialization, and randomness in the training data, in order to understand the sources of the error.
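The decomposition described above rests on the law of total variance: Var = E[Var | group] + Var(E[· | group]). A toy sketch (with invented numbers) treats each training-set draw as a group and each column as an initialization seed:

```python
def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# preds[d][i]: a test prediction for training-set draw d and init seed i (hypothetical)
preds = [
    [0.62, 0.60, 0.64],
    [0.70, 0.73, 0.67],
    [0.55, 0.57, 0.53],
]

# Variance from initialization: average within-row variance
var_init = sum(variance(row) for row in preds) / len(preds)
# Variance from data sampling: variance of the per-row means
row_means = [sum(row) / len(row) for row in preds]
var_data = variance(row_means)
# Total variance over all (data, seed) pairs
total = variance([p for row in preds for p in row])
# Law of total variance: var_init + var_data equals total (up to float error)
```

With equal-sized groups the two components sum exactly to the total, which is what lets the test error's variance be attributed to its separate sources.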