AITopics | South America

Conformal Prediction with Cellwise Outliers: A Detect-then-Impute Approach

Peng, Qian, Bao, Yajie, Ren, Haojie, Wang, Zhaojun, Zou, Changliang

arXiv.org Machine Learning, May-9-2025

Conformal prediction is a powerful tool for constructing prediction intervals for black-box models, providing a finite-sample coverage guarantee for exchangeable data. However, this exchangeability is compromised when some entries of the test feature are contaminated, such as in the case of cellwise outliers. To address this issue, this paper introduces a novel framework called detect-then-impute conformal prediction. This framework first employs an outlier detection procedure on the test feature and then utilizes an imputation method to fill in those cells identified as outliers. To quantify the uncertainty in the processed test feature, we adaptively apply the detection and imputation procedures to the calibration set, thereby constructing exchangeable features for the conformal prediction interval of the test label. We develop two practical algorithms, PDI-CP and JDI-CP, and provide a distribution-free coverage analysis under some commonly used detection and imputation procedures. Notably, JDI-CP achieves a finite-sample $1-2\alpha$ coverage guarantee. Numerical experiments on both synthetic and real datasets demonstrate that our proposed algorithms exhibit robust coverage properties and comparable efficiency to the oracle baseline.
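For readers unfamiliar with the baseline that this abstract builds on, the following is a minimal sketch of standard split conformal prediction with absolute-residual scores. It is not the paper's PDI-CP or JDI-CP algorithm and contains no detection or imputation step; the function name `split_conformal_interval` and its arguments are hypothetical, chosen only for illustration.

```python
import math

import numpy as np


def split_conformal_interval(cal_pred, cal_y, test_pred, alpha=0.1):
    """Standard split conformal interval (illustrative, not the paper's method).

    cal_pred / cal_y: model predictions and true labels on a held-out
    calibration set. test_pred: the model's prediction for one test point.
    """
    # Nonconformity score: absolute residual on each calibration point.
    scores = np.abs(np.asarray(cal_y, dtype=float) - np.asarray(cal_pred, dtype=float))
    n = scores.size
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return -math.inf, math.inf
    q = float(np.sort(scores)[k - 1])
    return test_pred - q, test_pred + q
```

Under exchangeability of calibration and test points, an interval built this way covers the test label with probability at least $1-\alpha$. That exchangeability is the assumption the abstract says cellwise outliers in the test feature break.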

artificial intelligence, data mining, machine learning, (13 more...)

arXiv.org Machine Learning

2505.04986

Country:

Asia > China > Shanghai > Shanghai (0.04)
South America > Brazil (0.04)
North America > United States > California > Orange County > Irvine (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

A Proposal for Evaluating the Operational Risk for ChatBots based on Large Language Models

Pinacho-Davidson, Pedro, Gutierrez, Fernando, Zapata, Pablo, Vergara, Rodolfo, Aqueveque, Pablo

arXiv.org Artificial Intelligence, May-9-2025

The emergence of Generative AI (Gen AI) and Large Language Models (LLMs) has enabled more advanced chatbots capable of human-like interactions. However, these conversational agents introduce a broader set of operational risks that extend beyond traditional cybersecurity considerations. In this work, we propose a novel, instrumented risk-assessment metric that simultaneously evaluates potential threats to three key stakeholders: the service-providing organization, end users, and third parties. Our approach incorporates the technical complexity required to induce erroneous behaviors in the chatbot (ranging from non-induced failures to advanced prompt-injection attacks) as well as contextual factors such as the target industry, user age range, and vulnerability severity. To validate our metric, we leverage Garak, an open-source framework for LLM vulnerability testing. We further enhance Garak to capture a variety of threat vectors (e.g., misinformation, code hallucinations, social engineering, and malicious code generation). Our methodology is demonstrated in a scenario involving chatbots that employ retrieval-augmented generation (RAG), showing how the aggregated risk scores guide both short-term mitigation and longer-term improvements in model design and deployment. The results underscore the importance of multi-dimensional risk assessments in operationalizing secure, reliable AI-driven conversational systems.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.04784

Country:

South America > Chile (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Improving Failure Prediction in Aircraft Fastener Assembly Using Synthetic Data in Imbalanced Datasets

Lahr, Gustavo J. G., Godoy, Ricardo V., Segreto, Thiago H., Savazzi, Jose O., Ajoudani, Arash, Boaventura, Thiago, Caurin, Glauco A. P.

arXiv.org Artificial Intelligence, May-9-2025

Aircraft manufacturing still relies heavily on human labor due to the complexity of the assembly processes and customization requirements. One key challenge is achieving precise positioning, especially for large aircraft structures, where errors can lead to substantial maintenance costs or part rejection. Existing solutions often require costly hardware or lack flexibility. Threaded fasteners (e.g., screws, bolts, and collars) are used in aircraft by the thousands; they are traditionally installed by fixed-base robots, which are usually difficult to deploy in such manufacturing sites. This paper emphasizes the importance of error detection and classification for efficient and safe assembly of threaded fasteners, especially aeronautical collars. Safe assembly of threaded fasteners is paramount, yet acquiring sufficient data for training deep learning models is challenging because failure cases are rare and datasets are imbalanced. The paper addresses this by proposing techniques like class weighting and data augmentation, specifically tailored for time-series data, to improve classification performance. Furthermore, the paper introduces a novel problem-modeling approach, emphasizing metrics relevant to collar assembly rather than solely focusing on accuracy. This tailored approach enhances the models' capability to handle the challenges of threaded fastener assembly effectively.
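The abstract names class weighting and time-series data augmentation without giving details, so the following is only a generic sketch of what such techniques commonly look like. It assumes inverse-frequency ("balanced") class weights, the same heuristic scikit-learn uses for `class_weight="balanced"`, and Gaussian jitter as the augmentation. The function names and the `sigma` default are hypothetical and not taken from the paper.

```python
import numpy as np


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_per_class)."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = counts.sum() / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))


def jitter(series, sigma=0.01, rng=None):
    """Augment a time series by adding zero-mean Gaussian noise to each sample."""
    rng = np.random.default_rng() if rng is None else rng
    series = np.asarray(series, dtype=float)
    return series + rng.normal(0.0, sigma, size=series.shape)
```

With nine successful assemblies and one failure, `balanced_class_weights([0] * 9 + [1])` weights the failure class nine times more heavily than the success class, so rare failures contribute comparably to a weighted loss.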

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2505.03917

Country:

Europe (0.46)
South America > Brazil (0.15)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.68)

Industry:

Aerospace & Defense > Aircraft (1.00)
Transportation > Air (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

2025 AI Index Report

AIHub, May-8-2025, 10:47:42 GMT

AI performance on demanding benchmarks continues to improve. Performance of advanced AI systems on new benchmarks introduced in 2023 has increased sharply. AI systems also made major strides in generating high-quality video. AI is increasingly embedded in everyday life. In 2023, the FDA (in the US) approved 223 AI-enabled medical devices, up from just six in 2015.

artificial intelligence, china, machine learning, (17 more...)

AIHub

Country:

North America > United States (0.93)
Asia > China (0.13)
South America (0.06)
(9 more...)

Genre: Personal > Honors (0.53)

Industry:

Health & Medicine (1.00)
Government > Regional Government (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

YABLoCo: Yet Another Benchmark for Long Context Code Generation

Valeev, Aidar, Garaev, Roman, Lomshakov, Vadim, Piontkovskaya, Irina, Ivanov, Vladimir, Adewuyi, Israel

arXiv.org Artificial Intelligence, May-8-2025

Research Center of the Artificial Intelligence Institute, Innopolis University, Russia (ai.valeev@innopolis.ru)

Large Language Models (LLMs) demonstrate the ability to solve various programming tasks, including code generation. Typically, the performance of LLMs is measured on benchmarks with small or medium-sized context windows of thousands of lines of code (LoC). This paper closes this gap by contributing to the long context code generation benchmark (YABLoCo). The benchmark featured a test set of 215 functions selected from four large repositories with thousands of functions. The dataset contained metadata of functions, contexts of the functions with different levels of dependencies, docstrings, functions' bodies, and call graphs for each repository. This paper presents three key aspects of the contribution. First, the benchmark aims at function body generation in large repositories in C and C++, two languages not covered by previous benchmarks. Second, the benchmark contains large repositories from 200K to 2,000K LoC. Third, we contribute a scalable evaluation pipeline for efficient computing of the target metrics and a tool for visual analysis of generated code. Overall, these three aspects allow for evaluating code generation in large repositories in C/C++. Large Language Models (LLMs) have recently demonstrated abilities to solve a wide set of software engineering tasks in various settings [9], [19].

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.04406

Country:

Europe > Russia (0.25)
Asia > Russia (0.25)
South America > Colombia > Meta Department > Villavicencio (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

In-situ and Non-contact Etch Depth Prediction in Plasma Etching via Machine Learning (ANN & BNN) and Digital Image Colorimetry

Kang, Minji, Kim, Seongho, Go, Eunseo, Paek, Donghyeon, Lim, Geon, Kim, Muyoung, Kim, Soyeun, Jang, Sung Kyu, Choi, Min Sup, Kang, Woo Seok, Kim, Jaehyun, Kim, Jaekwang, Kim, Hyeong-U

arXiv.org Artificial Intelligence, May-8-2025

Precise monitoring of etch depth and the thickness of insulating materials, such as silicon dioxide and silicon nitride, is critical to ensuring device performance and yield in semiconductor manufacturing. While conventional ex-situ analysis methods are accurate, they are constrained by time delays and contamination risks. To address these limitations, this study proposes a non-contact, in-situ etch depth prediction framework based on machine learning (ML) techniques. Two scenarios are explored. In the first scenario, an artificial neural network (ANN) is trained to predict average etch depth from process parameters, achieving a significantly lower mean squared error (MSE) compared to a linear baseline model. The approach is then extended to incorporate variability from repeated measurements using a Bayesian Neural Network (BNN) to capture both aleatoric and epistemic uncertainty. Coverage analysis confirms the BNN's capability to provide reliable uncertainty estimates. In the second scenario, we demonstrate the feasibility of using RGB data from digital image colorimetry (DIC) as input for etch depth prediction, achieving strong performance even in the absence of explicit process parameters. These results suggest that the integration of DIC and ML offers a viable, cost-effective alternative for real-time, in-situ, and non-invasive monitoring in plasma etching processes, contributing to enhanced process stability and manufacturing efficiency.
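The abstract does not say how its coverage analysis was carried out. A common form of such a check, sketched here under the assumption of Gaussian predictive intervals, is to count how often the measured value falls inside the predicted mean plus or minus `z` standard deviations. The function name, the arguments, and the default `z=1.96` (a nominal 95% interval) are illustrative assumptions.

```python
import numpy as np


def empirical_coverage(y_true, pred_mean, pred_std, z=1.96):
    """Fraction of true values inside the interval mean +/- z * std."""
    y_true = np.asarray(y_true, dtype=float)
    pred_mean = np.asarray(pred_mean, dtype=float)
    pred_std = np.asarray(pred_std, dtype=float)
    lower = pred_mean - z * pred_std
    upper = pred_mean + z * pred_std
    return float(np.mean((y_true >= lower) & (y_true <= upper)))
```

If the uncertainty estimates are well calibrated, the returned fraction should be close to the nominal level implied by `z`; a much lower value suggests overconfident predictions.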

artificial intelligence, machine learning, thickness, (18 more...)

arXiv.org Artificial Intelligence

2505.03826

Country:

Asia > South Korea > Daejeon > Daejeon (0.05)
South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > Nebraska > Lancaster County > Lincoln (0.04)
(5 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Semiconductors & Electronics (1.00)
Media > Photography (0.84)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

R^3-VQA: "Read the Room" by Video Social Reasoning

Niu, Lixing, Li, Jiapeng, Yu, Xingping, Wang, Shu, Feng, Ruining, Wu, Bo, Wei, Ping, Wang, Yisen, Fan, Lifeng

arXiv.org Artificial Intelligence, May-8-2025

"Read the room" is a significant social reasoning capability in human daily life. Humans can infer others' mental states from subtle social cues. Previous social reasoning tasks and datasets lack complexity (e.g., simple scenes, basic interactions, incomplete mental state variables, single-step reasoning, etc.) and fall far short of the challenges present in real-life social interactions. In this paper, we contribute a valuable, high-quality, and comprehensive video dataset named R^3-VQA with precise and fine-grained annotations of social events and mental states (i.e., belief, intent, desire, and emotion) as well as corresponding social causal chains in complex social scenarios. Moreover, we include human-annotated and model-generated QAs. Our task R^3-VQA includes three aspects: Social Event Understanding, Mental State Estimation, and Social Causal Reasoning. As a benchmark, we comprehensively evaluate the social reasoning capabilities and consistencies of current state-of-the-art large vision-language models (LVLMs). Comprehensive experiments show that (i) LVLMs are still far from human-level consistent social reasoning in complex social scenarios; (ii) Theory of Mind (ToM) prompting can help LVLMs perform better on social reasoning tasks. We provide some of our dataset and code in supplementary material and will release our full dataset and code upon acceptance.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2505.04147

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
South America > Chile (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Social Events (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

MIHRaGe: A Mixed-Reality Interface for Human-Robot Interaction via Gaze-Oriented Control

Baptista, Rafael R., Gerszberg, Nina R., Godoy, Ricardo V., Lahr, Gustavo J. G.

arXiv.org Artificial Intelligence, May-8-2025

Individuals with upper limb mobility impairments often require assistive technologies to perform activities of daily living. While gaze-tracking has emerged as a promising method for robotic assistance, existing solutions lack sufficient feedback mechanisms, leading to uncertainty in user intent recognition and reduced adaptability. This paper presents the MIHRaGe interface, an integrated system that combines gaze-tracking, robotic assistance, and mixed reality to create an immersive environment for controlling the robot using only eye movements. The system was evaluated through an experimental protocol involving four participants, assessing gaze accuracy, robotic positioning precision, and the overall success of a pick-and-place task. Results showed an average gaze fixation error of 1.46 cm, with individual variations ranging from 1.28 cm to 2.14 cm. The robotic arm demonstrated an average positioning error of ±1.53 cm, with discrepancies attributed to interface resolution and calibration constraints. In a pick-and-place task, the system achieved a success rate of 80%, highlighting its potential for improving accessibility in human-robot interaction with visual feedback to the user.

artificial intelligence, interface, robot, (15 more...)

arXiv.org Artificial Intelligence

2505.03929

Country:

South America > Brazil > São Paulo (0.05)
North America > United States (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.37)

The Influence of Text Variation on User Engagement in Cross-Platform Content Sharing

Hu, Yibo, Jin, Yiqiao, Ye, Meng, Divakaran, Ajay, Kumar, Srijan

arXiv.org Artificial Intelligence, May-8-2025

In today's cross-platform social media landscape, understanding factors that drive engagement for multimodal content, especially text paired with visuals, remains complex. This study investigates how rewriting Reddit post titles adapted from YouTube video titles affects user engagement. First, we build and analyze a large dataset of Reddit posts sharing YouTube videos, revealing that 21% of post titles are minimally modified. Statistical analysis demonstrates that title rewrites measurably improve engagement. Second, we design a controlled, multi-phase experiment to rigorously isolate the effects of textual variations by neutralizing confounding factors like video popularity, timing, and community norms. Comprehensive statistical tests reveal that effective title rewrites tend to feature emotional resonance, lexical richness, and alignment with community-specific norms. Lastly, pairwise ranking prediction experiments using a fine-tuned BERT classifier achieve 74% accuracy, significantly outperforming near-random baselines, including GPT-4o. These results validate that our controlled dataset effectively minimizes confounding effects, allowing advanced models to both learn and demonstrate the impact of textual features on engagement. By bridging quantitative rigor with qualitative insights, this study uncovers engagement dynamics and offers a robust framework for future cross-platform, multimodal content strategies.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.03769

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > California > San Mateo County > Foster City (0.04)
(12 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)

Industry:

Media > News (1.00)
Health & Medicine (1.00)
Information Technology (0.93)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Insulin Resistance Prediction From Wearables and Routine Blood Biomarkers

Metwally, Ahmed A., Heydari, A. Ali, McDuff, Daniel, Solot, Alexandru, Esmaeilpour, Zeinab, Faranesh, Anthony Z, Zhou, Menglian, Savage, David B., Heneghan, Conor, Patel, Shwetak, Speed, Cathy, Prieto, Javier L.

arXiv.org Artificial Intelligence, May-8-2025

Insulin resistance, a precursor to type 2 diabetes, is characterized by impaired insulin action in tissues. Current methods for measuring insulin resistance, while effective, are expensive and not widely available, which hinders opportunities for early intervention. In this study, we remotely recruited participants across the US to build the largest dataset to date for studying insulin resistance (N=1,165 participants, with median BMI=28 kg/m², age=45 years, HbA1c=5.4%), incorporating wearable device time series data and blood biomarkers, including the ground-truth measure of insulin resistance, homeostatic model assessment for insulin resistance (HOMA-IR). We developed deep neural network models to predict insulin resistance based on readily available digital and blood biomarkers. Our results show that our models can predict insulin resistance by combining both wearable data and readily available blood biomarkers better than either of the two data sources separately (R²=0.5, auROC=0.80, sensitivity=76%, and specificity=84%). The model showed 93% sensitivity and 95% adjusted specificity in obese and sedentary participants, a subpopulation most vulnerable to developing type 2 diabetes and who could benefit most from early intervention. Rigorous evaluation of model performance, including interpretability and robustness, facilitates generalizability across larger cohorts, which is demonstrated by reproducing the prediction performance on an independent validation cohort (N=72 participants). Additionally, we demonstrated how the predicted insulin resistance can be integrated into a large language model agent to help understand and contextualize HOMA-IR values, facilitating interpretation and safe personalized recommendations. This work offers the potential for early detection of people at risk of type 2 diabetes and thereby facilitates earlier implementation of preventative strategies.
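The abstract uses HOMA-IR as its ground-truth measure but does not spell out the formula. As background, the conventional definition is fasting glucose (mg/dL) times fasting insulin (µU/mL) divided by 405, or equivalently glucose in mmol/L times insulin divided by 22.5. The sketch below computes that value; the function and parameter names are illustrative, and the abstract does not state whether the paper computes HOMA-IR in exactly this way.

```python
def homa_ir(fasting_glucose_mg_dl, fasting_insulin_uU_ml):
    """Conventional HOMA-IR from fasting glucose (mg/dL) and insulin (uU/mL)."""
    # 405 is the standard constant for glucose expressed in mg/dL;
    # with glucose in mmol/L the constant would be 22.5 instead.
    return fasting_glucose_mg_dl * fasting_insulin_uU_ml / 405.0
```

For example, a fasting glucose of 90 mg/dL with a fasting insulin of 9 µU/mL gives `homa_ir(90, 9) == 2.0`.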

insulin resistance, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.03784

Country:

South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > Maryland > Howard County > Columbia (0.04)
North America > United States > Arizona (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
