AITopics | metafeature

metafeature

When Do Neural Nets Outperform Boosted Trees on Tabular Data?

Neural Information Processing Systems, Apr-30-2026, 06:26:33 GMT

data mining, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.92)
Law (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

When Do Neural Nets Outperform Boosted Trees on Tabular Data?

Neural Information Processing Systems, Feb-17-2026, 21:42:45 GMT

data mining, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > California (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.92)
Law (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Describing Nonstationary Data Streams in Frequency Domain

Komorniczak, Joanna

arXiv.org Artificial Intelligence, Feb-7-2025

Concept drift is among the primary challenges faced by data stream processing methods. Drift detection strategies, designed to counteract the negative consequences of such changes, often rely on analyzing the problem metafeatures. This work presents the Frequency Filtering Metadescriptor, a tool for characterizing a data stream that searches for the informative frequency components visible in each sample's feature vector. The frequencies are filtered according to their variance across all available data batches. The presented solution is capable of generating a metadescription of the data stream, separating chunks into groups describing specific concepts on its basis, and visualizing the frequencies in the original spatial domain. The experimental analysis compared the proposed solution with two state-of-the-art strategies and with the PCA baseline in the post-hoc concept identification task. The research is followed by the identification of concepts in real-world data streams. The generalization in the frequency domain adopted in the proposed solution makes it possible to capture complex feature dependencies as a reduced number of frequency components, while maintaining the semantic meaning of the data.
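
The idea of filtering frequency components by their variance across batches can be illustrated with a minimal sketch. This is not the author's implementation: it assumes a one-dimensional DFT over each sample's feature vector, and it assumes that the components with the highest cross-batch variance are the ones kept, neither of which the abstract specifies. The function names are hypothetical.

```python
import cmath
from statistics import pvariance


def dft_magnitudes(x):
    """Magnitudes of the discrete Fourier transform of a 1-D sequence."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def batch_spectrum(batch):
    """Mean DFT magnitude per frequency over all samples in one batch."""
    spectra = [dft_magnitudes(sample) for sample in batch]
    return [sum(col) / len(col) for col in zip(*spectra)]


def select_frequencies(batches, n_keep):
    """Indices of the n_keep frequencies whose batch-level magnitude varies most.

    Assumption: high variance across batches marks an informative frequency.
    """
    per_batch = [batch_spectrum(b) for b in batches]
    variances = [pvariance(col) for col in zip(*per_batch)]
    order = sorted(range(len(variances)), key=lambda k: variances[k], reverse=True)
    return sorted(order[:n_keep])


def describe_batch(batch, kept):
    """Reduced description of a batch: mean magnitudes at the kept frequencies."""
    spectrum = batch_spectrum(batch)
    return [spectrum[k] for k in kept]
```

Calling `select_frequencies` on a list of batches and then `describe_batch` on each batch yields one short vector per batch, which could then be clustered to group batches by concept.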

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2502.04813

Country:

Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
South America > Brazil > Maranhão (0.04)
North America > United States > California (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

HyperQ-Opt: Q-learning for Hyperparameter Optimization

Hasan, Md. Tarek

arXiv.org Artificial Intelligence, Dec-23-2024

Hyperparameter optimization (HPO) is critical for enhancing the performance of machine learning models, yet it often involves a computationally intensive search across a large parameter space. Traditional approaches such as Grid Search and Random Search suffer from inefficiency and limited scalability, while surrogate models like Sequential Model-based Bayesian Optimization (SMBO) rely heavily on heuristic predictions that can lead to suboptimal results. This paper presents a novel perspective on HPO by formulating it as a sequential decision-making problem and leveraging Q-learning, a reinforcement learning technique, to optimize hyperparameters. The study explores the works of H.S. Jomaa et al. and Qi et al., which model HPO as a Markov Decision Process (MDP) and utilize Q-learning to iteratively refine hyperparameter settings. The approaches are evaluated for their ability to find optimal or near-optimal configurations within a limited number of trials, demonstrating the potential of reinforcement learning to outperform conventional methods. Additionally, this paper identifies research gaps in existing formulations, including the limitations of discrete search spaces and reliance on heuristic policies, and suggests avenues for future exploration. By shifting the paradigm toward policy-based optimization, this work contributes to advancing HPO methods for scalable and efficient machine learning applications.
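
Treating HPO as a sequential decision problem can be sketched with generic tabular Q-learning on a toy search space. This is not the formulation of Jomaa et al. or Qi et al.: the state is assumed to be the index of the current configuration in a one-dimensional grid, the actions are "move left", "stay", and "move right", and the reward is the change in a hypothetical `score` function standing in for validation performance.

```python
import random


def q_learning_hpo(grid, score, episodes=200, steps=10,
                   alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Search a 1-D grid of hyperparameter values with tabular Q-learning.

    grid  : list of candidate values (the discrete search space)
    score : callable mapping a value to a validation score (higher is better)
    Returns the best value seen and the learned Q-table.
    """
    rng = random.Random(seed)
    moves = (-1, 0, 1)  # move left, stay, move right
    q = [[0.0] * len(moves) for _ in grid]
    best_idx, best_score = 0, score(grid[0])

    for _ in range(episodes):
        state = rng.randrange(len(grid))
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(len(moves))
            else:
                action = max(range(len(moves)), key=lambda a: q[state][a])
            # Clamp the move so the agent stays inside the grid.
            nxt = min(max(state + moves[action], 0), len(grid) - 1)
            reward = score(grid[nxt]) - score(grid[state])
            # Standard Q-learning update.
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            current = score(grid[state])
            if current > best_score:
                best_idx, best_score = state, current

    return grid[best_idx], q
```

In practice each call to `score` would train and validate a model, so a real implementation would cache results and budget the number of evaluations; the discrete grid used here is exactly the kind of limitation the abstract lists as a research gap.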

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2412.17765

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

LMEMs for post-hoc analysis of HPO Benchmarking

Geburek, Anton, Mallik, Neeratyoy, Stoll, Danny, Bouthillier, Xavier, Hutter, Frank

arXiv.org Artificial Intelligence, Aug-5-2024

The importance of tuning hyperparameters in Machine Learning (ML) and Deep Learning (DL) is established through empirical research and applications, as is evident from the new hyperparameter optimization (HPO) algorithms and benchmarks steadily added by the community. However, current benchmarking practices using averaged performance across many datasets may obscure key differences between HPO methods, especially for pairwise comparisons. In this work, we apply significance testing based on Linear Mixed-Effect Models (LMEMs) to the post-hoc analysis of HPO benchmarking runs. LMEMs allow flexible and expressive modeling on the entire experiment data, including information such as benchmark meta-features, offering deeper insights than current analysis practices. We demonstrate this through a case study on the PriorBand paper's experiment data to find insights not reported in the original work.

algorithm, benchmark, experiment data, (16 more...)

arXiv.org Artificial Intelligence

2408.02533

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

When Do Neural Nets Outperform Boosted Trees on Tabular Data?

McElfresh, Duncan, Khandagale, Sujay, Valverde, Jonathan, C, Vishak Prasad, Feuer, Benjamin, Hegde, Chinmay, Ramakrishnan, Ganesh, Goldblum, Micah, White, Colin

arXiv.org Machine Learning, Oct-30-2023

Tabular data is one of the most commonly used types of data in machine learning. Despite recent advances in neural nets (NNs) for tabular data, there is still an active discussion on whether or not NNs generally outperform gradient-boosted decision trees (GBDTs) on tabular data, with several recent works arguing either that GBDTs consistently outperform NNs on tabular data, or vice versa. In this work, we take a step back and question the importance of this debate. To this end, we conduct the largest tabular data analysis to date, comparing 19 algorithms across 176 datasets, and we find that the 'NN vs. GBDT' debate is overemphasized: for a surprisingly high number of datasets, either the performance difference between GBDTs and NNs is negligible, or light hyperparameter tuning on a GBDT is more important than choosing between NNs and GBDTs. A remarkable exception is the recently-proposed prior-data fitted network, TabPFN: although it is effectively limited to training sets of size 3000, we find that it outperforms all other algorithms on average, even when randomly sampling 3000 training datapoints. Next, we analyze dozens of metafeatures to determine what properties of a dataset make NNs or GBDTs better-suited to perform well. For example, we find that GBDTs are much better than NNs at handling skewed or heavy-tailed feature distributions and other forms of dataset irregularities. Our insights act as a guide for practitioners to determine which techniques may work best on their dataset. Finally, with the goal of accelerating tabular data research, we release the TabZilla Benchmark Suite: a collection of the 36 'hardest' of the datasets we study. Our benchmark suite, codebase, and all raw results are available at https://github.com/naszilla/tabzilla.
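
The abstract singles out skewed and heavy-tailed feature distributions as dataset properties that matter. As a minimal sketch, two standard statistical metafeatures of that kind, skewness and excess kurtosis, can be computed per feature column as follows. The function names and the choice of summary are assumptions for illustration; they do not reproduce the paper's actual metafeature set or the TabZilla codebase.

```python
from statistics import fmean


def skewness(values):
    """Population skewness: third central moment over variance to the power 1.5."""
    mu = fmean(values)
    m2 = fmean([(v - mu) ** 2 for v in values])
    m3 = fmean([(v - mu) ** 3 for v in values])
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def excess_kurtosis(values):
    """Population excess kurtosis: fourth central moment over squared variance, minus 3."""
    mu = fmean(values)
    m2 = fmean([(v - mu) ** 2 for v in values])
    m4 = fmean([(v - mu) ** 4 for v in values])
    return m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0


def dataset_metafeatures(columns):
    """Summarize a dataset given as a list of numeric feature columns."""
    skews = [skewness(col) for col in columns]
    kurts = [excess_kurtosis(col) for col in columns]
    return {
        "n_features": len(columns),
        "n_rows": len(columns[0]),
        "max_abs_skew": max(abs(s) for s in skews),
        "max_kurtosis": max(kurts),
    }
```

A large `max_abs_skew` or `max_kurtosis` would flag the kind of irregular dataset on which, according to the abstract, GBDTs tend to do better than NNs.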

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Machine Learning

2305.02997

Country:

North America > United States > New York (0.04)
North America > United States > Maryland (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > California (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.92)

Industry: Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Metafeatures-based Rule-Extraction for Classifiers on Behavioral and Textual Data

Ramon, Yanou, Martens, David, Evgeniou, Theodoros, Praet, Stiene

arXiv.org Artificial Intelligence, Mar-10-2020

Machine learning using behavioral and text data can result in highly accurate prediction models, but these are often very difficult to interpret. Linear models require investigating thousands of coefficients, while the opaqueness of nonlinear models makes things even worse. Rule-extraction techniques have been proposed to combine the desired predictive behaviour of complex "black-box" models with explainability. However, rule-extraction in the context of ultra-high-dimensional and sparse data can be challenging, and has thus far received scant attention. Because of the sparsity and massive dimensionality, rule-extraction might fail in its primary explainability goal as the black-box model may need to be replaced by many rules, leaving the user again with an incomprehensible model. To address this problem, we develop and test a rule-extraction methodology based on higher-level, less-sparse "metafeatures". We empirically validate the quality of the rules in terms of fidelity, explanation stability and accuracy over a collection of data sets, and benchmark their performance against rules extracted using the original features. Our analysis points to key trade-offs between explainability, fidelity, accuracy, and stability that Machine Learning researchers and practitioners need to consider. Results indicate that the proposed metafeatures approach leads to better trade-offs between these, and is better able to mimic the black-box model. Using metafeatures instead of the original fine-grained features decreases the loss in fidelity, accuracy, and stability by an average of 18.08%, 20.15% and 17.73%, respectively, all statistically significant at a 5% significance level. Metafeatures thus improve a key "cost of explainability", which we define as the loss in fidelity when replacing a black-box with an explainable model.

fidelity, metafeature, stability, (17 more...)

arXiv.org Artificial Intelligence

2003.04792

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Orange County > Irvine (0.14)
Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (1.00)
Law (1.00)
Information Technology > Services (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Associated Learning: Decomposing End-to-end Backpropagation based on Auto-encoders and Target Propagation

Kao, Yu-Wei, Chen, Hung-Hsuan

arXiv.org Machine Learning, Jun-13-2019

Backpropagation has been widely used in deep learning approaches, but it is inefficient and sometimes unstable because of backward locking and vanishing/exploding gradient problems, especially when the gradient flow is long. Additionally, updating all edge weights based on a single objective seems biologically implausible. In this paper, we introduce a novel biologically motivated learning structure called Associated Learning, which modularizes the network into smaller components, each of which has a local objective. Because the objectives are mutually independent, Associated Learning can learn the parameters independently and simultaneously when these parameters belong to different components. Surprisingly, training deep models by Associated Learning yields accuracies comparable to those of models trained using typical backpropagation methods, which aim at fitting the target variable directly. Moreover, probably because the gradient flow of each component is short, deep networks can still be trained with Associated Learning even when some of the activation functions are sigmoid, a situation that usually results in the vanishing gradient problem when using typical backpropagation. We also found that Associated Learning generates better metafeatures, which we demonstrated both quantitatively (via inter-class and intra-class distance comparisons in the hidden layers) and qualitatively (by visualizing the hidden layers using t-SNE).

backpropagation, deep learning, neural network, (17 more...)

arXiv.org Machine Learning

1906.0556

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (1.00)

Scalable Meta-Learning for Bayesian Optimization

Feurer, Matthias, Letham, Benjamin, Bakshy, Eytan

arXiv.org Machine Learning, Feb-6-2018

Bayesian optimization has become a standard technique for hyperparameter optimization, including data-intensive models such as deep neural networks that may take days or weeks to train. We consider the setting where previous optimization runs are available, and we wish to use their results to warm-start a new optimization run. We develop an ensemble model that can incorporate the results of past optimization runs, while avoiding the poor scaling that comes with putting all results into a single Gaussian process model. The ensemble combines models from past runs according to estimates of their generalization performance on the current optimization. Results from a large collection of hyperparameter optimization benchmark problems and from optimization of a production computer vision platform at Facebook show that the ensemble can substantially reduce the time it takes to obtain near-optimal configurations, and is useful for warm-starting expensive searches or running quick re-optimizations.

artificial intelligence, machine learning, optimization, (18 more...)

arXiv.org Machine Learning

1802.02219

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

autoBagging: Learning to Rank Bagging Workflows with Metalearning

Pinto, Fábio, Cerqueira, Vítor, Soares, Carlos, Mendes-Moreira, João

arXiv.org Machine Learning, Jun-28-2017

Machine Learning (ML) has been successfully applied to a wide range of domains and applications. One of the techniques behind most of these successful applications is Ensemble Learning (EL), the field of ML that gave birth to methods such as Random Forests or Boosting. The complexity of applying these techniques, together with the scarcity of ML experts on the market, has created the need for systems that enable a fast and easy drop-in replacement for ML libraries. Automated machine learning (autoML) is the field of ML that attempts to answer these needs. Typically, these systems rely on optimization techniques such as Bayesian optimization to lead the search for the best model. Our approach differs from these systems by making use of the most recent advances in metalearning and a learning-to-rank approach to learn from metadata. We propose autoBagging, an autoML system that automatically ranks 63 bagging workflows by exploiting past performance and dataset characterization. Results on 140 classification datasets from the OpenML platform show that autoBagging can yield better performance than the Average Rank method and achieve results that are not statistically different from an ideal model that systematically selects the best workflow for each dataset. For the purpose of reproducibility and generalizability, autoBagging is publicly available as an R package on CRAN.
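
The Average Rank baseline named in the abstract can be sketched generically: rank the workflows on each past dataset, then order them by their mean rank. The sketch below is an illustration in Python, not autoBagging's R implementation; it assumes that a higher score is better and that tied workflows share the mean of the ranks they span. All names are hypothetical.

```python
def ranks(scores):
    """Rank workflows on one dataset: 1 is best; ties share the mean rank.

    scores: dict mapping workflow name to its score on the dataset.
    """
    order = sorted(scores, key=scores.get, reverse=True)
    result = {}
    i = 0
    while i < len(order):
        # Extend j over the run of workflows tied with order[i].
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for name in order[i:j + 1]:
            result[name] = mean_rank
        i = j + 1
    return result


def average_rank(history):
    """Order workflows by mean rank over all datasets.

    history: dict mapping dataset name to a {workflow: score} dict.
    Returns a list of (workflow, mean_rank) pairs, best first.
    """
    collected = {}
    for scores in history.values():
        for name, rank in ranks(scores).items():
            collected.setdefault(name, []).append(rank)
    means = {name: sum(rs) / len(rs) for name, rs in collected.items()}
    return sorted(means.items(), key=lambda item: item[1])
```

Because this baseline ignores the characteristics of the new dataset, it recommends the same ordering every time, which is the limitation that a metalearning approach using dataset characterization aims to overcome.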

artificial intelligence, machine learning, workflow, (14 more...)

arXiv.org Machine Learning

1706.09367

Genre:

Research Report (1.00)
Workflow (0.98)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
