AITopics | ml pipeline

Collaborating Authors

ml pipeline

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Probabilistic Matrix Factorization for Automated Machine Learning

Nicolo Fusi, Rishit Sheth, Melih Elibol

Neural Information Processing SystemsNov-21-2025, 03:56:14 GMT

In order to achieve state-of-the-art performance, modern machine learning techniques require careful data pre-processing and hyperparameter tuning.

artificial intelligence, machine learning, pipeline, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Utilizing Large Language Models for Machine Learning Explainability

Vassiliades, Alexandros, Polatidis, Nikolaos, Samaras, Stamatios, Diplaris, Sotiris, Martin, Ignacio Cabrera, Manolopoulos, Yannis, Vrochidis, Stefanos, Kompatsiaris, Ioannis

arXiv.org Artificial IntelligenceOct-9-2025

This study explores the explainability capabilities of large language models (LLMs), when employed to autonomously generate machine learning (ML) solutions. We examine two classification tasks: (i) a binary classification problem focused on predicting driver alertness states, and (ii) a multilabel classification problem based on the yeast dataset. Three state-of-the-art LLMs (i.e. OpenAI GPT, Anthropic Claude, and DeepSeek) are prompted to design training pipelines for four common classifiers: Random Forest, XGBoost, Multilayer Perceptron, and Long Short-Term Memory networks. The generated models are evaluated in terms of predictive performance (recall, precision, and F1-score) and explainability using SHAP (SHapley Additive exPlanations). Specifically, we measure Average SHAP Fidelity (Mean Squared Error between SHAP approximations and model outputs) and Average SHAP Sparsity (number of features deemed influential). The results reveal that LLMs are capable of producing effective and interpretable models, achieving high fidelity and consistent sparsity, highlighting their potential as automated tools for interpretable ML pipeline generation. The results show that LLMs can produce effective, interpretable pipelines with high fidelity and consistent sparsity, closely matching manually engineered baselines.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.06912

Country: Europe (0.93)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On Aligning Prediction Models with Clinical Experiential Learning: A Prostate Cancer Case Study

Vallon, Jacqueline J., Overman, William, Xu, Wanqiao, Panjwani, Neil, Ling, Xi, Vij, Sushmita, Bagshaw, Hilary P., Leppert, John T., Shah, Sumit, Sonn, Geoffrey, Srinivas, Sandy, Pollom, Erqi, Buyyounouski, Mark K., Bayati, Mohsen

arXiv.org Artificial IntelligenceSep-5-2025

Over the past decade, the use of machine learning (ML) models in healthcare applications has rapidly increased. Despite high performance, modern ML models do not always capture patterns the end user requires. For example, a model may predict a non-monotonically decreasing relationship between cancer stage and survival, keeping all other features fixed. In this paper, we present a reproducible framework for investigating this misalignment between model behavior and clinical experiential learning, focusing on the effects of underspecification of modern ML pipelines. In a prostate cancer outcome prediction case study, we first identify and address these inconsistencies by incorporating clinical knowledge, collected by a survey, via constraints into the ML model, and subsequently analyze the impact on model performance and behavior across degrees of underspecification. The approach shows that aligning the ML model with clinical experiential learning is possible without compromising performance. Motivated by recent literature in generative AI, we further examine the feasibility of a feedback-driven alignment approach in non-generative AI clinical risk prediction models through a randomized experiment with clinicians. Our findings illustrate that, by eliciting clinicians' model preferences using our proposed methodology, the larger the difference in how the constrained and unconstrained models make predictions for a patient, the more apparent the difference is in clinical interpretation.

artificial intelligence, clinician, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2509.04053

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology > Prostate Cancer (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

Add feedback

ExeKGLib: A Platform for Machine Learning Analytics based on Knowledge Graphs

Klironomos, Antonis, Zhou, Baifan, Tan, Zhipeng, Zheng, Zhuoxun, Gad-Elrab, Mohamed H., Paulheim, Heiko, Kharlamov, Evgeny

arXiv.org Artificial IntelligenceAug-4-2025

Nowadays machine learning (ML) practitioners have access to numerous ML libraries available online. Such libraries can be used to create ML pipelines that consist of a series of steps where each step may invoke up to several ML libraries that are used for various data-driven analytical tasks. Development of high-quality ML pipelines is non-trivial; it requires training, ML expertise, and careful development of each step. At the same time, domain experts in science and engineering may not possess such ML expertise and training while they are in pressing need of ML-based analytics. In this paper, we present our ExeKGLib, a Python library enhanced with a graphical interface layer that allows users with minimal ML knowledge to build ML pipelines. This is achieved by relying on knowledge graphs that encode ML knowledge in simple terms accessible to non-ML experts. ExeKGLib also allows improving the transparency and reusability of the built ML workflows and ensures that they are executable. We show the usability and usefulness of ExeKGLib by presenting real use cases.

artificial intelligence, machine learning, pipeline, (17 more...)

arXiv.org Artificial Intelligence

2508.00394

Country:

Europe (0.93)
North America > United States (0.14)

Genre:

Research Report (0.64)
Workflow (0.49)
Questionnaire & Opinion Survey (0.47)

Industry:

Energy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Web > Semantic Web (0.94)

Add feedback

LightAutoDS-Tab: Multi-AutoML Agentic System for Tabular Data

Lapin, Aleksey, Hromov, Igor, Chumakov, Stanislav, Mitrovic, Mile, Simakov, Dmitry, Nikitin, Nikolay O., Savchenko, Andrey V.

arXiv.org Artificial IntelligenceJul-21-2025

AutoML has advanced in handling complex tasks using the integration of LLMs, yet its efficiency remains limited by dependence on specific underlying tools. In this paper, we introduce LightAutoDS-Tab, a multi-AutoML agentic system for tasks with tabular data, which combines an LLM-based code generation with several AutoML tools. Our approach improves the flexibility and robustness of pipeline design, outperforming state-of-the-art open-source solutions on several data science tasks from Kaggle. The code of LightAutoDS-Tab is available in the open repository https://github.com/sb-ai-lab/LADS

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.13413

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)

Add feedback

VirnyFlow: A Design Space for Responsible Model Development

Herasymuk, Denys, Protsiv, Nazar, Stoyanovich, Julia

arXiv.org Artificial IntelligenceJun-3-2025

Developing machine learning (ML) models requires a deep understanding of real-world problems, which are inherently multi-objective. In this paper, we present VirnyFlow, the first design space for responsible model development, designed to assist data scientists in building ML pipelines that are tailored to the specific context of their problem. Unlike conventional AutoML frameworks, VirnyFlow enables users to define customized optimization criteria, perform comprehensive experimentation across pipeline stages, and iteratively refine models in alignment with real-world constraints. Our system integrates evaluation protocol definition, multi-objective Bayesian optimization, cost-aware multi-armed bandits, query optimization, and distributed parallelism into a unified architecture. We show that VirnyFlow significantly outperforms state-of-the-art AutoML systems in both optimization quality and scalability across five real-world benchmarks, offering a flexible, efficient, and responsible alternative to black-box automation in ML development.

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2506.01584

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.88)

Add feedback

Stress-Testing ML Pipelines with Adversarial Data Corruption

Zhu, Jiongli, Xu, Geyang, Lorenzi, Felipe, Glavic, Boris, Salimi, Babak

arXiv.org Artificial IntelligenceJun-3-2025

Structured data-quality issues, such as missing values correlated with demographics, culturally biased labels, or systemic selection biases, routinely degrade the reliability of machine-learning pipelines. Regulators now increasingly demand evidence that high-stakes systems can withstand these realistic, interdependent errors, yet current robustness evaluations typically use random or overly simplistic corruptions, leaving worst-case scenarios unexplored. We introduce SAVAGE, a causally inspired framework that (i) formally models realistic data-quality issues through dependency graphs and flexible corruption templates, and (ii) systematically discovers corruption patterns that maximally degrade a target performance metric. SAVAGE employs a bi-level optimization approach to efficiently identify vulnerable data subpopulations and fine-tune corruption severity, treating the full ML pipeline, including preprocessing and potentially non-differentiable models, as a black box. Extensive experiments across multiple datasets and ML tasks (data cleaning, fairness-aware learning, uncertainty quantification) demonstrate that even a small fraction (around 5 %) of structured corruptions identified by SAVAGE severely impacts model performance, far exceeding random or manually crafted errors, and invalidating core assumptions of existing techniques. Thus, SAVAGE provides a practical tool for rigorous pipeline stress-testing, a benchmark for evaluating robustness methods, and actionable guidance for designing more resilient data workflows.

artificial intelligence, data quality, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.0123

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (0.67)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
(2 more...)

Add feedback

Navigating Pitfalls: Evaluating LLMs in Machine Learning Programming Education

Kumar, Smitha, Lones, Michael A., Maarek, Manuel, Zantout, Hind

arXiv.org Artificial IntelligenceMay-27-2025

The rapid advancement of Large Language Models (LLMs) has opened new avenues in education. This study examines the use of LLMs in supporting learning in machine learning education; in particular, it focuses on the ability of LLMs to identify common errors of practice (pitfalls) in machine learning code, and their ability to provide feedback that can guide learning. Using a portfolio of code samples, we consider four different LLMs: one closed model and three open models. Whilst the most basic pitfalls are readily identified by all models, many common pitfalls are not. They particularly struggle to identify pitfalls in the early stages of the ML pipeline, especially those which can lead to information leaks, a major source of failure within applied ML projects. They also exhibit limited success at identifying pitfalls around model selection, which is a concept that students often struggle with when first transitioning from theory to practice. This questions the use of current LLMs to support machine learning education, and also raises important questions about their use by novice practitioners. Nevertheless, when LLMs successfully identify pitfalls in code, they do provide feedback that includes advice on how to proceed, emphasising their potential role in guiding learners. We also compare the capability of closed and open LLM models, and find that the gap is relatively small given the large difference in model sizes. This presents an opportunity to deploy, and potentially customise, smaller more efficient LLM models within education, avoiding risks around cost and data sharing associated with commercial models.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.1822

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.66)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Classifying Subjective Time Perception in a Multi-robot Control Scenario Using Eye-tracking Information

Aust, Till, Kaduk, Julian, Hamann, Heiko

arXiv.org Artificial IntelligenceApr-10-2025

As automation and mobile robotics reshape work environments, rising expectations for productivity increase cognitive demands on human operators, leading to potential stress and cognitive overload. Accurately assessing an operator's mental state is critical for maintaining performance and well-being. We use subjective time perception, which can be altered by stress and cognitive load, as a sensitive, low-latency indicator of well-being and cognitive strain. Distortions in time perception can affect decision-making, reaction times, and overall task effectiveness, making it a valuable metric for adaptive human-swarm interaction systems. We study how human physiological signals can be used to estimate a person's subjective time perception in a human-swarm interaction scenario as example. A human operator needs to guide and control a swarm of small mobile robots. We obtain eye-tracking data that is classified for subjective time perception based on questionnaire data. Our results show that we successfully estimate a person's time perception from eye-tracking data. The approach can profit from individual-based pretraining using only 30 seconds of data. In future work, we aim for robots that respond to human operator needs by automatically classifying physiological data in a closed control loop.

artificial intelligence, participant, time perception, (16 more...)

arXiv.org Artificial Intelligence

2504.06442

Country: Europe (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.86)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Safeguarding Autonomy: a Focus on Machine Learning Decision Systems

Subías-Beltrán, Paula, Pujol, Oriol, de Lecuona, Itziar

arXiv.org Artificial IntelligenceMar-27-2025

As global discourse on AI regulation gains momentum, this paper focuses on delineating the impact of ML on autonomy and fostering awareness. Respect for autonomy is a basic principle in bioethics that establishes persons as decision-makers. While the concept of autonomy in the context of ML appears in several European normative publications, it remains a theoretical concept that has yet to be widely accepted in ML practice. Our contribution is to bridge the theoretical and practical gap by encouraging the practical application of autonomy in decision-making within ML practice by identifying the conditioning factors that currently prevent it. Consequently, we focus on the different stages of the ML pipeline to identify the potential effects on ML end-users' autonomy. To improve its practical utility, we propose a related question for each detected impact, offering guidance for identifying possible focus points to respect ML end-users autonomy in decision-making.

artificial intelligence, autonomy, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2503.22023

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.48)

Add feedback