AITopics | autogluon

Collaborating Authors

autogluon

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

TabArena: A Living Benchmark for Machine Learning on Tabular Data

Erickson, Nick, Purucker, Lennart, Tschalzev, Andrej, Holzmüller, David, Desai, Prateek Mutalik, Salinas, David, Hutter, Frank

arXiv.org Artificial IntelligenceNov-4-2025

With the growing popularity of deep learning and foundation models for tabular data, the need for standardized and reliable benchmarks is higher than ever. However, current benchmarks are static. Their design is not updated even if flaws are discovered, model versions are updated, or new models are released. To address this, we introduce TabArena, the first continuously maintained living tabular benchmarking system. To launch TabArena, we manually curate a representative collection of datasets and well-implemented models, conduct a large-scale benchmarking study to initialize a public leaderboard, and assemble a team of experienced maintainers. Our results highlight the influence of validation method and ensembling of hyperparameter configurations to benchmark models at their full potential. While gradient-boosted trees are still strong contenders on practical tabular datasets, we observe that deep learning methods have caught up under larger time budgets with ensembling. At the same time, foundation models excel on smaller datasets. Finally, we show that ensembles across models advance the state-of-the-art in tabular machine learning. We observe that some deep learning models are overrepresented in cross-model ensembles due to validation set overfitting, and we encourage model developers to address this issue. We launch TabArena with a public leaderboard, reproducible code, and maintenance protocols to create a living benchmark available at https://tabarena.ai.

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2506.16791

Country: Europe > Germany > Baden-Württemberg (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
Banking & Finance (1.00)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Enhancing Transformer-Based Foundation Models for Time Series Forecasting via Bagging, Boosting and Statistical Ensembles

Modi, Dhruv D., Pan, Rong

arXiv.org Machine LearningAug-26-2025

Time series foundation models (TSFMs) such as Lag-Llama, TimeGPT, Chronos, MOMENT, UniTS, and TimesFM have shown strong generalization and zero-shot capabilities for time series forecasting, anomaly detection, classification, and imputation. Despite these advantages, their predictions still suffer from variance, domain-specific bias, and limited uncertainty quantification when deployed on real operational data. This paper investigates a suite of statistical and ensemble-based enhancement techniques, including bootstrap-based bagging, regression-based stacking, prediction interval construction, statistical residual modeling, and iterative error feedback, to improve robustness and accuracy. Using the Belgium Electricity Short-Term Load Forecasting dataset as a case study, we demonstrate that the proposed hybrids consistently outperform standalone foundation models across multiple horizons. Regression-based ensembles achieve the lowest mean squared error; bootstrap aggregation markedly reduces long-context errors; residual modeling corrects systematic bias; and the resulting prediction intervals achieve near nominal coverage with widths shrinking as context length increases. The results indicate that integrating statistical reasoning with modern foundation models yields measurable gains in accuracy, reliability, and interpretability for real-world time series applications.

data mining, large language model, machine learning, (19 more...)

arXiv.org Machine Learning

2508.16641

Country:

Europe > Belgium (0.27)
North America > United States > Arizona (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
(2 more...)

Genre: Research Report (0.90)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

Add feedback

Quickly Tuning Foundation Models for Image Segmentation

Das, Breenda, Purucker, Lennart, Carstensen, Timur, Hutter, Frank

arXiv.org Artificial IntelligenceAug-26-2025

Foundation models like SAM (Segment Anything Model) exhibit strong zero-shot image segmentation performance, but often fall short on domain-specific tasks. Fine-tuning these models typically requires significant manual effort and domain expertise. In this work, we introduce QTT-SEG, a meta-learning-driven approach for automating and accelerating the fine-tuning of SAM for image segmentation. Built on the Quick-Tune hyperparameter optimization framework, QTT-SEG predicts high-performing configurations using meta-learned cost and performance models, efficiently navigating a search space of over 200 million possibilities. We evaluate QTT-SEG on eight binary and five multiclass segmentation datasets under tight time constraints. Our results show that QTT-SEG consistently improves upon SAM's zero-shot performance and surpasses AutoGluon Multimodal, a strong AutoML baseline, on most binary tasks within three minutes. On multiclass datasets, QTT-SEG delivers consistent gains as well. These findings highlight the promise of meta-learning in automating model adaptation for specialized segmentation tasks. Code available at: https://github.com/ds-brx/QTT-SEG/

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.17283

Country:

Europe (0.68)
North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (0.86)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

AutoML Benchmark with shorter time constraints and early stopping

Jurado, Israel Campero, Gijsbers, Pieter, Vanschoren, Joaquin

arXiv.org Artificial IntelligenceApr-16-2025

Automated Machine Learning (AutoML) automatically builds machine learning (ML) models on data. The de facto standard for evaluating new AutoML frameworks for tabular data is the AutoML Benchmark (AMLB). AMLB proposed to evaluate AutoML frameworks using 1-and 4-hour time budgets across 104 tasks. We argue that shorter time constraints should be considered for the benchmark because of their practical value, such as when models need to be retrained with high frequency, and to make AMLB more accessible. This work considers two ways in which to reduce the overall computation used in the benchmark: smaller time constraints and the use of early stopping. We conduct evaluations of 11 AutoML frameworks on 104 tasks with different time constraints and find the relative ranking of AutoML frameworks is fairly consistent across time constraints, but that using early-stopping leads to a greater variety in model performance. In Machine Learning (ML), manually creating good models is time-consuming and knowledge-intensive. Automated Machine Learning (AutoML) employs efficient automated search methods to create models for new data, often reducing the computational costs in the process Hutter et al. (2019); Hollmann et al. (2022). The AutoML Benchmark (AMLB, Gijsbers et al. 2024) has become the standard for the evaluation of AutoML frameworks on tabular data, greatly increasing reproducibility and comparability in AutoML research. We identified that the time budgets proposed by Gijsbers et al. (2024) were based on what seemed "practically reasonable" at the time, as signified by many frameworks' default time budget of one hour. While the authors motivate evaluating methods on two time budgets as a proxy for anytime performance, they do not motivate the particular choice of 1 hour and 4 hours. AutoML frameworks behave under different time constraints. We conduct similar experiments and analyses for frameworks with early-stopping, offering insights into its potential to reduce energy consumption in AutoML systems. However, we often see that the original benchmarking suite or time constraints (1 hour and 4 hour) are not used as proposed.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Artificial Intelligence

2504.01222

Country: Europe > Netherlands (0.14)

Genre: Research Report > Experimental Study (0.68)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)

Add feedback

COPA: Comparing the Incomparable to Explore the Pareto Front

Javaloy, Adrián, Vergari, Antonio, Valera, Isabel

arXiv.org Artificial IntelligenceMar-18-2025

In machine learning (ML), it is common to account for multiple objectives when, e.g., selecting a model to deploy. However, it is often unclear how one should compare, aggregate and, ultimately, trade-off these objectives, as they might be measured in different units or scales. For example, when deploying large language models (LLMs), we might not only care about their performance, but also their CO2 consumption. In this work, we investigate how objectives can be sensibly compared and aggregated to navigate their Pareto front. To do so, we propose to make incomparable objectives comparable via their CDFs, approximated by their relative rankings. This allows us to aggregate them while matching user-specific preferences, allowing practitioners to meaningfully navigate and search for models in the Pareto front. We demonstrate the potential impact of our methodology in diverse areas such as LLM selection, domain generalization, and AutoML benchmarking, where classical ways to aggregate and normalize objectives fail.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.14321

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(12 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Tree Species Classification using Machine Learning and 3D Tomographic SAR -- a case study in Northern Europe

Grace, Colverd, Laura, Schade, Jumpei, Takami, Karol, Bot, Joseph, Gallego

arXiv.org Artificial IntelligenceNov-19-2024

Tree species classification plays an important role in nature conservation, forest inventories, forest management, and the protection of endangered species. Over the past four decades, remote sensing technologies have been extensively utilized for tree species classification, with Synthetic Aperture Radar (SAR) emerging as a key technique. In this study, we employed TomoSense, a 3D tomographic dataset, which utilizes a stack of single-look complex (SLC) images, a byproduct of SAR, captured at different incidence angles to generate a three-dimensional representation of the terrain. Our research focuses on evaluating multiple tabular machine-learning models using the height information derived from the tomographic image intensities to classify eight distinct tree species. The SLC data and tomographic imagery were analyzed across different polarimetric configurations and geosplit configurations. We investigated the impact of these variations on classification accuracy, comparing the performance of various tabular machine-learning models and optimizing them using Bayesian optimization. Additionally, we incorporated a proxy for actual tree height using point cloud data from Light Detection and Ranging (LiDAR) to provide height statistics associated with the model's predictions. This comparison offers insights into the reliability of tomographic data in predicting tree species classification based on height.

artificial intelligence, classification, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2411.12897

Country:

Europe > Northern Europe (0.40)
Europe > Germany > North Rhine-Westphalia (0.14)

Genre: Research Report > New Finding (0.89)

Industry: Energy > Oil & Gas (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.35)

Add feedback

UniAutoML: A Human-Centered Framework for Unified Discriminative and Generative AutoML with Large Language Models

Guo, Jiayi, Chen, Zan, Ji, Yingrui, Zhang, Liyun, Luo, Daqin, Li, Zhigang, Shen, Yiqin

arXiv.org Artificial IntelligenceOct-17-2024

Automated Machine Learning (AutoML) has simplified complex ML processes such as data pre-processing, model selection, and hyper-parameter searching. However, traditional AutoML frameworks focus solely on discriminative tasks, often falling short in tackling AutoML for generative models. Additionally, these frameworks lack interpretability and user engagement during the training process, primarily due to the absence of human-centered design. It leads to a lack of transparency in final decision-making and limited user control, potentially reducing trust and adoption of AutoML methods. To address these limitations, we introduce UniAutoML, a human-centered AutoML framework that leverages Large Language Models (LLMs) to unify AutoML for both discriminative (e.g., Transformers and CNNs for classification or regression tasks) and generative tasks (e.g., fine-tuning diffusion models or LLMs). The human-centered design of UniAutoML innovatively features a conversational user interface (CUI) that facilitates natural language interactions, providing users with real-time guidance, feedback, and progress updates for better interpretability. This design enhances transparency and user control throughout the AutoML training process, allowing users to seamlessly break down or modify the model being trained. To mitigate potential risks associated with LLM generated content, UniAutoML incorporates a safety guardline that filters inputs and censors outputs. We evaluated UniAutoML's performance and usability through experiments on eight diverse datasets and user studies involving 25 participants, demonstrating that UniAutoML not only enhances performance but also improves user control and trust. Our human-centered design bridges the gap between AutoML capabilities and user understanding, making ML more accessible to a broader audience.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.12841

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(3 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.93)
Research Report > Experimental Study (0.68)

Industry:

Information Technology (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

PORTAL: Scalable Tabular Foundation Models via Content-Specific Tokenization

Spinaci, Marco, Polewczyk, Marek, Hoffart, Johannes, Kohler, Markus C., Thelin, Sam, Klein, Tassilo

arXiv.org Artificial IntelligenceOct-17-2024

Self-supervised learning on tabular data seeks to apply advances from natural language and image domains to the diverse domain of tables. However, current techniques often struggle with integrating multi-domain data and require data cleaning or specific structural requirements, limiting the scalability of pre-training datasets. We introduce PORTAL (Pretraining One-Row-at-a-Time for All tabLes), a framework that handles various data modalities without the need for cleaning or preprocessing. This simple yet powerful approach can be effectively pre-trained on online-collected datasets and fine-tuned to match state-of-the-art methods on complex classification and regression tasks. This work offers a practical advancement in self-supervised learning for large-scale tabular data.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.13516

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > France (0.04)
Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Add feedback

AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models

Luo, Daqin, Feng, Chengjian, Nong, Yuxuan, Shen, Yiqing

arXiv.org Artificial IntelligenceAug-1-2024

Automated Machine Learning (AutoML) offers a promising approach to streamline the training of machine learning models. However, existing AutoML frameworks are often limited to unimodal scenarios and require extensive manual configuration. Recent advancements in Large Language Models (LLMs) have showcased their exceptional abilities in reasoning, interaction, and code generation, presenting an opportunity to develop a more automated and user-friendly framework. To this end, we introduce AutoM3L, an innovative Automated Multimodal Machine Learning framework that leverages LLMs as controllers to automatically construct multimodal training pipelines. AutoM3L comprehends data modalities and selects appropriate models based on user requirements, providing automation and interactivity. By eliminating the need for manual feature engineering and hyperparameter optimization, our framework simplifies user engagement and enables customization through directives, addressing the limitations of previous rule-based AutoML approaches. We evaluate the performance of AutoM3L on six diverse multimodal datasets spanning classification, regression, and retrieval tasks, as well as a comprehensive set of unimodal datasets. The results demonstrate that AutoM3L achieves competitive or superior performance compared to traditional rule-based AutoML methods. Furthermore, a user study highlights the user-friendliness and usability of our framework, compared to the rule-based AutoML methods.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2408.00665

Country:

Oceania > Australia > Victoria > Melbourne (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.93)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Squeezing Lemons with Hammers: An Evaluation of AutoML and Tabular Deep Learning for Data-Scarce Classification Applications

Knauer, Ricardo, Rodner, Erik

arXiv.org Artificial IntelligenceMay-13-2024

Many industry verticals are confronted with small-sized tabular data. In this lowdata regime, it is currently unclear whether the best performance can be expected from simple baselines, or more complex machine learning approaches that leverage meta-learning and ensembling. On 44 tabular classification datasets with sample sizes 500, we find that L2-regularized logistic regression performs similar to state-of-the-art automated machine learning (AutoML) frameworks (AutoPrognosis, AutoGluon) and off-the-shelf deep neural networks (TabPFN, HyperFast) on the majority of the benchmark datasets. We therefore recommend to consider logistic regression as the first choice for data-scarce applications with tabular data and provide practitioners with best practices for further method selection. Machine learning algorithms thrive on data (Banko & Brill, 2001; Halevy et al., 2009).

dataset, logistic regression, regression, (16 more...)

arXiv.org Artificial Intelligence

2405.07662

Country: North America > Montserrat (0.04)

Genre:

Research Report > New Finding (0.75)
Research Report > Experimental Study (0.61)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback