AITopics | Overview

Overview

The State of Data Curation at NeurIPS: An Assessment of Dataset Development Practices in the Datasets and Benchmarks Track

Neural Information Processing Systems, May-27-2025, 03:02:52 GMT

Data curation is a field with origins in librarianship and archives, whose scholarship and thinking on data issues go back centuries, if not millennia. The field of machine learning is increasingly recognizing the importance of data curation to the advancement of both applications and fundamental understanding of machine learning models -- evidenced not least by the creation of the Datasets and Benchmarks track itself. This work provides an analysis of recent dataset development practices at NeurIPS through the lens of data curation. We present an evaluation framework for dataset documentation, consisting of a rubric and toolkit developed through a thorough literature review of data curation principles. We use the framework to systematically assess the strengths and weaknesses in current dataset development practices of 60 datasets published in the NeurIPS Datasets and Benchmarks track from 2021 to 2023.

artificial intelligence, data quality, machine learning, (6 more...)

Neural Information Processing Systems

Genre: Overview (0.40)

Technology:

Information Technology > Data Science > Data Quality > Data Cleaning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Online Statistical Inference of Constrained Stochastic Optimization via Random Scaling

Du, Xinchen, Zhu, Wanrong, Wu, Wei Biao, Na, Sen

arXiv.org Machine Learning, May-27-2025

Constrained stochastic nonlinear optimization problems have attracted significant attention for their ability to model complex real-world scenarios in physics, economics, and biology. As datasets continue to grow, online inference methods have become crucial for enabling real-time decision-making without the need to store historical data. In this work, we develop an online inference procedure for constrained stochastic optimization by leveraging a method called Sketched Stochastic Sequential Quadratic Programming (SSQP). As a direct generalization of sketched Newton methods, SSQP approximates the objective with a quadratic model and the constraints with a linear model at each step, then applies a sketching solver to inexactly solve the resulting subproblem. Building on this design, we propose a new online inference procedure called random scaling. In particular, we construct a test statistic based on SSQP iterates whose limiting distribution is free of any unknown parameters. Compared to existing online inference procedures, our approach offers two key advantages: (i) it enables the construction of asymptotically valid confidence intervals; and (ii) it is matrix-free, i.e., the computation involves only primal-dual SSQP iterates $(\boldsymbol{x}_t, \boldsymbol{\lambda}_t)$ without requiring any matrix inversions. We validate our theory through numerical experiments on nonlinearly constrained regression problems and demonstrate the superior performance of our random scaling method over existing inference procedures.
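For readers unfamiliar with sequential quadratic programming, the "quadratic model plus linear model" step mentioned in the abstract can be written out as follows. This is the textbook SQP subproblem for an equality-constrained problem $\min_x f(x)$ subject to $c(x) = 0$, not a formula taken from the paper; the symbols $B_t$ (a Hessian approximation of the Lagrangian), $G_t$ (the constraint Jacobian), and the restriction to equality constraints are assumptions made for illustration.

```latex
% Generic SQP step at iterate (x_t, \lambda_t), with Lagrangian
% \mathcal{L}(x, \lambda) = f(x) + \lambda^\top c(x).
\begin{aligned}
\Delta x_t &= \arg\min_{\Delta x}\;
  \nabla f(x_t)^\top \Delta x + \tfrac{1}{2}\, \Delta x^\top B_t\, \Delta x
  \quad \text{s.t.} \quad c(x_t) + G_t\, \Delta x = 0, \\[4pt]
\begin{pmatrix} B_t & G_t^\top \\ G_t & 0 \end{pmatrix}
\begin{pmatrix} \Delta x_t \\ \Delta \lambda_t \end{pmatrix}
&= -\begin{pmatrix} \nabla_x \mathcal{L}(x_t, \lambda_t) \\ c(x_t) \end{pmatrix}.
\end{aligned}
```

The second line is the KKT system of the first. In a stochastic setting the gradient and Hessian would be replaced by sample-based estimates, and, per the abstract, SSQP solves this linear system only inexactly with a sketching solver rather than by factorizing the matrix.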

artificial intelligence, machine learning, survey article, (18 more...)

arXiv.org Machine Learning

2505.18327

Genre:

Research Report (0.66)
Workflow (0.48)
Overview (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Optimal community detection in dense bipartite graphs

Chhor, Julien, Knight, Parker

arXiv.org Machine Learning, May-27-2025

We consider the problem of detecting a community of densely connected vertices in a high-dimensional bipartite graph of size $n_1 \times n_2$. Under the null hypothesis, the observed graph is drawn from a bipartite Erdős-Rényi distribution with connection probability $p_0$. Under the alternative hypothesis, there exists an unknown bipartite subgraph of size $k_1 \times k_2$ in which edges appear with probability $p_1 = p_0 + \delta$ for some $\delta > 0$, while all other edges outside the subgraph appear with probability $p_0$. Specifically, we provide non-asymptotic upper and lower bounds on the smallest signal strength $\delta^*$ that is both necessary and sufficient to ensure the existence of a test with small enough type one and type two errors. We also derive novel minimax-optimal tests achieving these fundamental limits when the underlying graph is sufficiently dense. Our proposed tests involve a combination of hard-thresholded nonlinear statistics of the adjacency matrix, the analysis of which may be of independent interest. In contrast with previous work, our non-asymptotic upper and lower bounds match for any configuration of $n_1, n_2, k_1, k_2$.
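The null and alternative models described in this abstract are easy to simulate. The sketch below samples both and computes a standardized total edge count as a baseline detection statistic. It is only an illustration of the testing setup: the function names are hypothetical, and the total-degree statistic is a simple stand-in, not the hard-thresholded tests proposed in the paper.

```python
import numpy as np


def sample_bipartite(n1, n2, p0, rng, k1=0, k2=0, delta=0.0):
    """Sample an n1 x n2 biadjacency matrix with edge probability p0.

    If k1 and k2 are positive, a randomly placed k1 x k2 block gets
    edge probability p0 + delta (the planted community).
    """
    probs = np.full((n1, n2), p0)
    if k1 > 0 and k2 > 0:
        rows = rng.choice(n1, size=k1, replace=False)
        cols = rng.choice(n2, size=k2, replace=False)
        probs[np.ix_(rows, cols)] = p0 + delta
    return (rng.random((n1, n2)) < probs).astype(int)


def total_edge_zscore(adj, p0):
    """Standardize the total edge count under the null Binomial(n1*n2, p0)."""
    n1, n2 = adj.shape
    mean = n1 * n2 * p0
    std = np.sqrt(n1 * n2 * p0 * (1.0 - p0))
    return (adj.sum() - mean) / std


rng = np.random.default_rng(0)
null_graph = sample_bipartite(200, 200, 0.1, rng)
alt_graph = sample_bipartite(200, 200, 0.1, rng, k1=100, k2=100, delta=0.3)
z_null = total_edge_zscore(null_graph, 0.1)
z_alt = total_edge_zscore(alt_graph, 0.1)
```

A large positive z-score is evidence against the null. This baseline only works when the planted block is large and dense enough to shift the global edge count, which is one reason more refined statistics are needed for small communities.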

artificial intelligence, log null 1, machine learning, (17 more...)

arXiv.org Machine Learning

2505.18372

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre:

Overview (0.92)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.92)

Explanation User Interfaces: A Systematic Literature Review

Cappuccio, Eleonora, Esposito, Andrea, Greco, Francesco, Desolda, Giuseppe, Lanzilotti, Rosa, Rinzivillo, Salvatore

arXiv.org Artificial Intelligence, May-27-2025

Artificial Intelligence (AI) is one of the major technological advancements of this century, bearing incredible potential for users through AI-powered applications and tools in numerous domains. Because AI models are often black boxes (i.e., their decision-making processes are unintelligible), developers typically resort to eXplainable Artificial Intelligence (XAI) techniques to interpret model behaviour and produce systems that are transparent, fair, reliable, and trustworthy. However, presenting explanations to the user is not trivial and is often left as a secondary aspect of the system's design process, leading to AI systems that are not useful to end-users. This paper presents a Systematic Literature Review on Explanation User Interfaces (XUIs) to gain a deeper understanding of the solutions and design guidelines employed in the academic literature to effectively present explanations to users. To improve the contribution and real-world impact of this survey, we also present a framework for Human-cEnteRed developMent of Explainable user interfaceS (HERMES) to guide practitioners and academics in the design and evaluation of XUIs.

explanation, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.20085

Country:

Asia (1.00)
North America > United States > California (0.67)
Europe > United Kingdom > England (0.67)
North America > United States > Texas (0.46)

Genre:

Overview (1.00)
Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Education (0.92)
Government > Military (0.67)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
(3 more...)

MVP: Multi-source Voice Pathology detection

Koudounas, Alkis, La Quatra, Moreno, Ciravegna, Gabriele, Fantini, Marco, Crosetti, Erika, Succo, Giovanni, Cerquitelli, Tania, Siniscalchi, Sabato Marco, Baralis, Elena

arXiv.org Artificial Intelligence, May-27-2025

Voice disorders significantly impact patient quality of life, yet non-invasive automated diagnosis remains under-explored due to both the scarcity of pathological voice data and the variability in recording sources. This work introduces MVP (Multi-source Voice Pathology detection), a novel approach that leverages transformers operating directly on raw voice signals. We explore three fusion strategies to combine sentence reading and sustained vowel recordings: waveform concatenation, intermediate feature fusion, and decision-level combination. Empirical validation across the German, Portuguese, and Italian languages shows that intermediate feature fusion using transformers best captures the complementary characteristics of both recording types. Our approach achieves up to a 13% AUC improvement over single-source methods.

detection, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.2005

Country: Europe > Italy (0.29)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry:

Health & Medicine > Therapeutic Area (0.95)
Health & Medicine > Diagnostic Medicine (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Multi-Agent Reinforcement Learning in Cybersecurity: From Fundamentals to Applications

Landolt, Christoph R., Würsch, Christoph, Meier, Roland, Mermoud, Alain, Jang-Jaccard, Julian

arXiv.org Artificial Intelligence, May-27-2025

Multi-Agent Reinforcement Learning (MARL) has shown great potential as an adaptive solution for addressing modern cybersecurity challenges. MARL enables decentralized, adaptive, and collaborative defense strategies and provides an automated mechanism to combat dynamic, coordinated, and sophisticated threats. This survey investigates the current state of research in MARL applications for automated cyber defense (ACD), focusing on intruder detection and lateral movement containment. Additionally, it examines the role of Autonomous Intelligent Cyber-defense Agents (AICA) and Cyber Gyms in training and validating MARL agents. Finally, the paper outlines existing challenges, such as scalability and adversarial robustness, and proposes future research directions. The survey also discusses how MARL integrates into AICA to provide adaptive, scalable, and dynamic solutions to counter the increasingly sophisticated landscape of cyber threats. It highlights the transformative potential of MARL in areas like intrusion detection and lateral movement containment, and underscores the value of Cyber Gyms for training and validation of AICA.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2505.19837

Country:

Europe (0.68)
North America > United States > California (0.28)

Genre: Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Large Language Models for Planning: A Comprehensive and Systematic Survey

Cao, Pengfei, Men, Tianyi, Liu, Wencan, Zhang, Jingwen, Li, Xuzhao, Lin, Xixun, Sui, Dianbo, Cao, Yanan, Liu, Kang, Zhao, Jun

arXiv.org Artificial Intelligence, May-27-2025

Planning represents a fundamental capability of intelligent agents, requiring comprehensive environmental understanding, rigorous logical reasoning, and effective sequential decision-making. While Large Language Models (LLMs) have demonstrated remarkable performance on certain planning tasks, their broader application in this domain warrants systematic investigation. This paper presents a comprehensive review of LLM-based planning. Specifically, this survey is structured as follows: First, we establish the theoretical foundations by introducing essential definitions and categories about automated planning. Next, we provide a detailed taxonomy and analysis of contemporary LLM-based planning methodologies, categorizing them into three principal approaches: 1) External Module Augmented Methods that combine LLMs with additional components for planning, 2) Finetuning-based Methods that involve using trajectory data and feedback signals to adjust LLMs in order to improve their planning abilities, and 3) Searching-based Methods that break down complex tasks into simpler components, navigate the planning space, or enhance decoding strategies to find the best solutions. Subsequently, we systematically summarize existing evaluation frameworks, including benchmark datasets, evaluation metrics and performance comparisons between representative planning methods. Finally, we discuss the underlying mechanisms enabling LLM-based planning and outline promising research directions for this rapidly evolving field. We hope this survey will serve as a valuable resource to inspire innovation and drive progress in this field.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2505.19683

Country: Asia > China (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment > Games (0.93)
Education (0.92)
Information Technology (0.67)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI

Sapkota, Ranjan, Roumeliotis, Konstantinos I., Karkee, Manoj

arXiv.org Artificial Intelligence, May-27-2025

This review presents a comprehensive analysis of two emerging paradigms in AI-assisted software development: vibe coding and agentic coding. While both leverage large language models (LLMs), they differ fundamentally in autonomy, architectural design, and the role of the developer. Vibe coding emphasizes intuitive, human-in-the-loop interaction through prompt-based, conversational workflows that support ideation, experimentation, and creative exploration. In contrast, agentic coding enables autonomous software development through goal-driven agents capable of planning, executing, testing, and iterating tasks with minimal human intervention. We propose a detailed taxonomy spanning conceptual foundations, execution models, feedback loops, safety mechanisms, debugging strategies, and real-world tool ecosystems. Through comparative workflow analysis and 20 detailed use cases, we illustrate how vibe systems thrive in early-stage prototyping and education, while agentic systems excel in enterprise-grade automation, codebase refactoring, and CI/CD integration. We further examine emerging trends in hybrid architectures, where natural language interfaces are coupled with autonomous execution pipelines. Finally, we articulate a future roadmap for agentic AI, outlining the infrastructure needed for trustworthy, explainable, and collaborative systems. Our findings suggest that successful AI software engineering will rely not on choosing one paradigm, but on harmonizing their strengths within a unified, human-centered development lifecycle.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.19443

Country: North America > United States (1.00)

Genre:

Workflow (1.00)
Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Education (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Guided by Guardrails: Control Barrier Functions as Safety Instructors for Robotic Learning

Guerrier, Maeva, Soma, Karthik, Fouad, Hassan, Beltrame, Giovanni

arXiv.org Artificial Intelligence, May-27-2025

Safety stands as the primary obstacle preventing the widespread adoption of learning-based robotic systems in our daily lives. While reinforcement learning (RL) shows promise as an effective robot learning paradigm, conventional RL frameworks often model safety by using single scalar negative rewards with immediate episode termination, failing to capture the temporal consequences of unsafe actions (e.g., sustained collision damage). In this work, we introduce a novel approach that simulates these temporal effects by applying continuous negative rewards without episode termination. Our experiments reveal that standard RL methods struggle with this model, as the accumulated negative values in unsafe zones create learning barriers. To address this challenge, we demonstrate how Control Barrier Functions (CBFs), with their proven safety guarantees, effectively help robots avoid catastrophic regions while enhancing learning outcomes. We present three CBF-based approaches, each integrating traditional RL methods with Control Barrier Functions, guiding the agent to learn safe behavior. Our empirical analysis, conducted in both simulated environments and real-world settings using a four-wheel differential drive robot, explores the possibilities of employing these approaches for safe robotic learning.
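To make the idea of a Control Barrier Function concrete, here is a minimal safety-filter sketch for a one-dimensional system. Everything in it is an assumption chosen for illustration: the dynamics $\dot{x} = u$, the barrier $h(x) = x_{\max} - x$, the gain `alpha`, and the function names. It shows the generic CBF condition $\dot{h} + \alpha h \ge 0$ and is not one of the three CBF-based approaches from the paper.

```python
def cbf_filter(x, u_nominal, x_max=1.0, alpha=2.0):
    """Return the action closest to u_nominal that satisfies the CBF condition.

    Assumed dynamics: dx/dt = u. Assumed barrier: h(x) = x_max - x >= 0.
    CBF condition: dh/dt + alpha * h >= 0, i.e. -u + alpha * (x_max - x) >= 0,
    which reduces to the upper bound u <= alpha * (x_max - x).
    """
    u_bound = alpha * (x_max - x)
    return min(u_nominal, u_bound)


def rollout(x0, u_nominal, steps=200, dt=0.01):
    """Euler-integrate the filtered system from x0 under a constant nominal action."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = cbf_filter(x, u_nominal)
        x = x + dt * u
        trajectory.append(x)
    return trajectory


# A nominal policy (standing in for an RL agent) that always pushes toward the unsafe side.
trajectory = rollout(x0=0.0, u_nominal=5.0)
```

With these settings the filtered state approaches `x_max` without crossing it, because the allowed velocity shrinks in proportion to the remaining distance. In a learning setup, a filter like this would sit between the agent's proposed action and the environment.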

machine learning, obstacle, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2505.18858

Country: North America > Canada > Quebec (0.14)

Genre:

Research Report (0.84)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Multi-Party Conversational Agents: A Survey

Sapkota, Sagar, Hasan, Mohammad Saqib, Shah, Mubarak, Karmaker, Santu

arXiv.org Artificial Intelligence, May-27-2025

Multi-party Conversational Agents (MPCAs) are systems designed to engage in dialogue with more than two participants simultaneously. Unlike traditional two-party agents, MPCAs pose additional design challenges due to the need to interpret both utterance semantics and social dynamics. This survey explores recent progress in MPCAs by addressing three key questions: 1) Can agents model each participant's mental states? (State of Mind Modeling); 2) Can they properly understand the dialogue content? (Semantic Understanding); and 3) Can they reason about and predict future conversation flow? (Agent Action Modeling). We review methods ranging from classical machine learning to Large Language Models (LLMs) and multi-modal systems. Our analysis underscores Theory of Mind (ToM) as essential for building intelligent MPCAs and highlights multi-modal understanding as a promising yet underexplored direction. Finally, this survey offers guidance to future researchers on developing more capable MPCAs.

computational linguistic, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2505.18845

Country:

Europe (1.00)
Asia > Middle East > UAE (0.46)
North America > United States > New Mexico (0.28)
North America > United States > Minnesota (0.28)

Genre: Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.45)
Health & Medicine > Consumer Health (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)
