AITopics | lightgbm

LightGBM: A Highly Efficient Gradient Boosting Decision Tree

Neural Information Processing SystemsMar-17-2026, 14:37:40 GMT

Gradient Boosting Decision Tree (GBDT) is a popular machine learning algorithm, and has quite a few effective implementations such as XGBoost and pGBRT. Although many engineering optimizations have been adopted in these implementations, the efficiency and scalability are still unsatisfactory when the feature dimension is high and data size is large. A major reason is that for each feature, they need to scan all the data instances to estimate the information gain of all possible split points, which is very time consuming. To tackle this problem, we propose two novel techniques: \emph{Gradient-based One-Side Sampling} (GOSS) and \emph{Exclusive Feature Bundling} (EFB). With GOSS, we exclude a significant proportion of data instances with small gradients, and only use the rest to estimate the information gain. We prove that, since the data instances with larger gradients play a more important role in the computation of information gain, GOSS can obtain quite accurate estimation of the information gain with a much smaller data size. With EFB, we bundle mutually exclusive features (i.e., they rarely take nonzero values simultaneously), to reduce the number of features. We prove that finding the optimal bundling of exclusive features is NP-hard, but a greedy algorithm can achieve quite good approximation ratio (and thus can effectively reduce the number of features without hurting the accuracy of split point determination by much).

artificial intelligence, machine learning, proceedings, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.61)

Add feedback

HW-Aware-GPT-Bench

Neural Information Processing SystemsFeb-15-2026, 18:08:49 GMT

evolutionary algorithm, large language model, machine learning, (21 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Saxony > Leipzig (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Energy (0.45)
Government (0.45)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(6 more...)

Add feedback

3ffebb08d23c609875d7177ee769a3e9-Paper.pdf

Neural Information Processing SystemsFeb-12-2026, 00:13:35 GMT

dataset, sparrow, weak rule, (16 more...)

Neural Information Processing Systems

Country:

Asia > Japan (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
Information Technology > Data Science > Data Mining (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

3ffebb08d23c609875d7177ee769a3e9-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-12-2026, 00:13:19 GMT

dataset, suggestion, training time, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

146f7dd4c91bc9d80cf4458ad6d6cd1b-Supplemental.pdf

Neural Information Processing SystemsFeb-7-2026, 13:54:30 GMT

generalization, prediction, probability, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Europe > Denmark (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

MarginsareInsufficientforExplainingGradient Boosting

Neural Information Processing SystemsFeb-7-2026, 13:54:22 GMT

The success of boosted classifiers is most often attributed to improvements in margins.

artificial intelligence, generalization, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Denmark (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.47)

Add feedback

The Challenger: When Do New Data Sources Justify Switching Machine Learning Models?

Digalakis, Vassilis Jr, Pérignon, Christophe, Saurin, Sébastien, Sentenac, Flore

arXiv.org Machine LearningDec-23-2025

We study the problem of deciding whether, and when an organization should replace a trained incumbent model with a challenger relying on newly available features. We develop a unified economic and statistical framework that links learning-curve dynamics, data-acquisition and retraining costs, and discounting of future gains. First, we characterize the optimal switching time in stylized settings and derive closed-form expressions that quantify how horizon length, learning-curve curvature, and cost differentials shape the optimal decision. Second, we propose three practical algorithms--a one-shot baseline, a greedy sequential method, and a look-ahead sequential method. Using a real-world credit-scoring dataset with gradually arriving alternative data, we show that (i) optimal switching times vary systematically with cost parameters and learning-curve behavior, and (ii) the look-ahead sequential method outperforms other methods and is able to approach in value an oracle with full foresight. Finally, we establish finite-sample guarantees, including conditions under which the sequential look-ahead method achieve sublinear regret relative to that oracle. Our results provide an operational blueprint for economically sound model transitions as new data sources become available.

algorithm, challenger, erignon and saurin and sentenac, (13 more...)

arXiv.org Machine Learning

2512.1839

Country: North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (1.00)
Banking & Finance > Loans (1.00)
Health & Medicine (0.92)
Banking & Finance > Credit (0.88)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Supervised learning pays attention

Craig, Erin, Tibshirani, Robert

arXiv.org Machine LearningDec-11-2025

In-context learning with attention enables large neural networks to make context-specific predictions by selectively focusing on relevant examples. Here, we adapt this idea to supervised learning procedures such as lasso regression and gradient boosting, for tabular data. Our goals are to (1) flexibly fit personalized models for each prediction point and (2) retain model simplicity and interpretability. Our method fits a local model for each test observation by weighting the training data according to attention, a supervised similarity measure that emphasizes features and interactions that are predictive of the outcome. Attention weighting allows the method to adapt to heterogeneous data in a data-driven way, without requiring cluster or similarity pre-specification. Further, our approach is uniquely interpretable: for each test observation, we identify which features are most predictive and which training observations are most relevant. We then show how to use attention weighting for time series and spatial data, and we present a method for adapting pretrained tree-based models to distributional shift using attention-weighted residual corrections. Across real and simulated datasets, attention weighting improves predictive performance while preserving interpretability, and theory shows that attention-weighting linear models attain lower mean squared error than the standard linear model under mixture-of-models data-generating processes with known subgroup structure.

attention lasso, lasso, prediction, (15 more...)

arXiv.org Machine Learning

2512.09912

Country:

North America > United States > Michigan (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Beyond the Hype: Comparing Lightweight and Deep Learning Models for Air Quality Forecasting

Gondal, Moazzam Umer, Qudous, Hamad ul, Farhan, Asma Ahmad

arXiv.org Machine LearningDec-11-2025

Accurate forecasting of urban air pollution is essential for protecting public health and guiding mitigation policies. While Deep Learning (DL) and hybrid pipelines dominate recent research, their complexity and limited interpretability hinder operational use. This study investigates whether lightweight additive models -- Facebook Prophet (FBP) and NeuralProphet (NP) -- can deliver competitive forecasts for particulate matter (PM$_{2.5}$, PM$_{10}$) in Beijing, China. Using multi-year pollutant and meteorological data, we applied systematic feature selection (correlation, mutual information, mRMR), leakage-safe scaling, and chronological data splits. Both models were trained with pollutant and precursor regressors, with NP additionally leveraging lagged dependencies. For context, two machine learning baselines (LSTM, LightGBM) and one traditional statistical model (SARIMAX) were also implemented. Performance was evaluated on a 7-day holdout using MAE, RMSE, and $R^2$. Results show that FBP consistently outperformed NP, SARIMAX, and the learning-based baselines, achieving test $R^2$ above 0.94 for both pollutants. These findings demonstrate that interpretable additive models remain competitive with both traditional and complex approaches, offering a practical balance of accuracy, transparency, and ease of deployment.

forecasting, pm 10, pm 2, (14 more...)

arXiv.org Machine Learning

2512.09076

Country:

Asia > China > Beijing > Beijing (0.26)
Asia > Middle East > UAE (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

An Improved Ensemble-Based Machine Learning Model with Feature Optimization for Early Diabetes Prediction

Islam, Md. Najmul, Rimon, Md. Miner Hossain, Shamim, Shah Sadek-E-Akbor, Fahad, Zarif Mohaimen, Mony, Md. Jehadul Islam, Chowdhury, Md. Jalal Uddin

arXiv.org Artificial IntelligenceDec-10-2025

Diabetes is a serious worldwide health issue, and successful intervention depends on early detection. However, overlapping risk factors and data asymmetry make prediction difficult. To use extensive health survey data to create a machine learning framework for diabetes classification that is both accurate and comprehensible, to produce results that will aid in clinical decision-making. Using the BRFSS dataset, we assessed a number of supervised learning techniques. SMOTE and Tomek Links were used to correct class imbalance. To improve prediction performance, both individual models and ensemble techniques such as stacking were investigated. The 2015 BRFSS dataset, which includes roughly 253,680 records with 22 numerical features, is used in this study. Strong ROC-AUC performance of approximately 0.96 was attained by the individual models Random Forest, XGBoost, CatBoost, and LightGBM.The stacking ensemble with XGBoost and KNN yielded the best overall results with 94.82\% accuracy, ROC-AUC of 0.989, and PR-AUC of 0.991, indicating a favourable balance between recall and precision. In our study, we proposed and developed a React Native-based application with a Python Flask backend to support early diabetes prediction, providing users with an accessible and efficient health monitoring tool.

artificial intelligence, dataset, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2512.02023

Country: