Backward Feature Elimination


Subject-Adaptive Sparse Linear Models for Interpretable Personalized Health Prediction from Multimodal Lifelog Data

Bu, Dohyun, Han, Jisoo, Kwon, Soohwa, So, Yulim, Lee, Jong-Seok

arXiv.org Artificial Intelligence

Improved prediction of personalized health outcomes -- such as sleep quality and stress -- from multimodal lifelog data could have meaningful clinical and practical implications. However, state-of-the-art models, primarily deep neural networks and gradient-boosted ensembles, sacrifice interpretability and fail to adequately address the significant inter-individual variability inherent in lifelog data. To overcome these challenges, we propose the Subject-Adaptive Sparse Linear (SASL) framework, an interpretable modeling approach explicitly designed for personalized health prediction. SASL integrates ordinary least squares regression with subject-specific interactions, systematically distinguishing global from individual-level effects. We employ an iterative backward feature elimination method based on nested $F$-tests to construct a sparse and statistically robust model. Additionally, recognizing that health outcomes often represent discretized versions of continuous processes, we develop a regression-then-thresholding approach specifically designed to maximize macro-averaged F1 scores for ordinal targets. For intrinsically challenging predictions, SASL selectively incorporates outputs from compact LightGBM models through confidence-based gating, enhancing accuracy without compromising interpretability. Evaluations conducted on the CH-2025 dataset -- which comprises roughly 450 daily observations from ten subjects -- demonstrate that the hybrid SASL-LightGBM framework achieves predictive performance comparable to that of sophisticated black-box methods, but with significantly fewer parameters and substantially greater transparency, thus providing clear and actionable insights for clinicians and practitioners.
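The abstract's iterative backward elimination via nested F-tests can be sketched as follows. This is an illustrative toy, not the authors' SASL implementation: it fits plain OLS without an intercept or subject-specific interactions, and it uses a fixed F threshold (`f_crit`) in place of a proper F-distribution critical value, which would require a stats library to compute.

```python
# Toy sketch of backward feature elimination with nested F-tests.
# Not the SASL code: no intercept, no subject interactions, and
# f_crit is a stand-in for a real F critical value at some alpha.

def ols_rss(X, y):
    """Fit OLS by solving the normal equations (X'X)b = X'y with
    Gaussian elimination; return the residual sum of squares."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for col in range(p):  # forward elimination with partial pivoting
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * p
    for i in reversed(range(p)):  # back substitution
        beta[i] = (b[i] - sum(A[i][j] * beta[j] for j in range(i + 1, p))) / A[i][i]
    return sum((yi - sum(bi * xi for bi, xi in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def backward_eliminate(X, y, f_crit=4.0):
    """Repeatedly drop the feature whose removal hurts the fit least,
    as long as the nested F-test does not reject the reduced model."""
    active = list(range(len(X[0])))
    n = len(y)
    while len(active) > 1:
        rss_full = ols_rss([[row[j] for j in active] for row in X], y)
        best = None
        for j in active:
            reduced = [k for k in active if k != j]
            rss_r = ols_rss([[row[k] for k in reduced] for row in X], y)
            # F-statistic for dropping a single parameter (q = 1)
            f = (rss_r - rss_full) / (rss_full / (n - len(active)))
            if best is None or f < best[0]:
                best = (f, j)
        if best[0] >= f_crit:  # dropping would significantly worsen the fit
            break
        active.remove(best[1])
    return active
```

On synthetic data where the target depends on only one of three features, the loop prunes toward that feature while the F-test guards against discarding genuinely informative predictors.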


Backward Feature Elimination and its Implementation

#artificialintelligence

In the previous article, we saw another feature selection technique, the Low Variance Filter. So far we've covered the Missing Value Ratio and Low Variance Filter techniques; in this article, I'm going to cover one more technique used for feature selection, known as Backward Feature Elimination. Note: if you prefer learning concepts in an audio-visual format, we have this entire article explained in the video below. If not, you may continue reading. Let's say we have the same problem statement, where we want to predict fitness level from the given features, and let's assume the dataset has no missing values.
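The procedure the article goes on to describe can be sketched as a short, model-agnostic loop: start from all features, repeatedly drop the one whose removal hurts a validation score the least, and stop once any removal would cost more than a tolerance. The `score_fn` here is a placeholder you supply (for example, cross-validated accuracy of your model on the given feature subset); the feature names and the toy scorer in the usage example are purely illustrative.

```python
# Minimal model-agnostic sketch of backward feature elimination.
# score_fn(subset) is assumed to return a "higher is better" score,
# e.g. cross-validated accuracy on that subset of features.

def backward_feature_elimination(features, score_fn, tol=0.0):
    active = list(features)
    current = score_fn(active)
    while len(active) > 1:
        # Score every candidate subset with one feature removed
        candidates = [(score_fn([f for f in active if f != d]), d)
                      for d in active]
        best_score, drop = max(candidates)
        if current - best_score > tol:  # any removal hurts too much: stop
            break
        active = [f for f in active if f != drop]
        current = best_score
    return active
```

With a toy scorer that rewards two "useful" features and mildly penalizes subset size, the loop discards the irrelevant columns and keeps the informative ones:

```python
useful = {"steps", "sleep"}  # hypothetical informative features
score = lambda s: sum(1.0 for f in s if f in useful) - 0.01 * len(s)
backward_feature_elimination(["steps", "sleep", "shoe_size", "id"], score)
```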


Comprehensive Guide to 12 Dimensionality Reduction Techniques

#artificialintelligence

Have you ever worked on a dataset with more than a thousand features? I have, and let me tell you it's a very challenging task, especially if you don't know where to start! Having a high number of variables is both a boon and a curse. It's great that we have loads of data for analysis, but its sheer size makes it challenging to work with. It's not feasible to analyze each and every variable at a microscopic level: it might take us days or months to perform any meaningful analysis, and we'd lose a ton of time and money for our business, not to mention the computational power this would take. We need a better way to deal with high-dimensional data so that we can quickly extract patterns and insights from it. So how do we approach such a dataset?


Data Dimensionality Reduction in the Age of Machine Learning

#artificialintelligence

Machine Learning is all the rage as companies try to make sense of the mountains of data they are collecting. Data is everywhere and proliferating at unprecedented speed. But more data is not always better. In fact, large amounts of data can not only considerably slow down system execution but can sometimes even produce worse performance in Data Analytics applications. We have found, through years of formal and informal testing, that data dimensionality reduction -- the process of reducing the number of attributes under consideration when running analytics -- is useful not only for speeding up algorithm execution but also for improving overall model performance. This doesn't mean minimizing the volume of data being analyzed per se, but rather being smarter about how data sets are constructed.


Beginners Guide To Learn Dimension Reduction Techniques

@machinelearnbot

This powerful quote by William Shakespeare applies well to the techniques used in data science and analytics. Allow me to prove it using a short story. In May 2015, we conducted a Data Hackathon (a data science competition) in Delhi-NCR, India. We challenged participants to identify human activities using the Human Activity Recognition Using Smartphones Data Set. The data set had 561 variables for training a model to identify human activity in the test data set.