robust framework
LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization
Zhang, Erica, Goto, Ryunosuke, Sagan, Naomi, Mutter, Jurik, Phillips, Nick, Alizadeh, Ash, Lee, Kangwook, Blanchet, Jose, Pilanci, Mert, Tibshirani, Robert
We introduce LLM-Lasso, a novel framework that leverages large language models (LLMs) to guide feature selection in Lasso $\ell_1$ regression. Unlike traditional methods that rely solely on numerical data, LLM-Lasso incorporates domain-specific knowledge extracted from natural language, enhanced through a retrieval-augmented generation (RAG) pipeline, to seamlessly integrate data-driven modeling with contextual insights. Specifically, the LLM generates penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model. Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model, while less relevant features are assigned higher penalties, reducing their influence. Importantly, LLM-Lasso has an internal validation step that determines how much to trust the contextual knowledge in our prediction pipeline. Hence it addresses key challenges in robustness, making it suitable for mitigating potential inaccuracies or hallucinations from the LLM. In various biomedical case studies, LLM-Lasso outperforms standard Lasso and existing feature selection baselines, all while ensuring the LLM operates without prior access to the datasets. To our knowledge, this is the first approach to effectively integrate conventional feature selection techniques directly with LLM-based domain-specific reasoning.
PRISM: A Robust Framework for Skill-based Meta-Reinforcement Learning with Noisy Demonstrations
Lee, Sanghyeon, Bae, Sangjun, Park, Yisak, Han, Seungyul
Meta-reinforcement learning (Meta-RL) facilitates rapid adaptation to unseen tasks but faces challenges in long-horizon environments. Skill-based approaches tackle this by decomposing state-action sequences into reusable skills and employing hierarchical decision-making. However, these methods are highly susceptible to noisy offline demonstrations, resulting in unstable skill learning and degraded performance. To overcome this, we propose Prioritized Refinement for Skill-Based Meta-RL (PRISM), a robust framework that integrates exploration near noisy data to generate online trajectories and combines them with offline data. Through prioritization, PRISM extracts high-quality data to learn task-relevant skills effectively. By addressing the impact of noise, our method ensures stable skill learning and achieves superior performance in long-horizon tasks, even with noisy and sub-optimal data.
A Review and A Robust Framework of Data-Efficient 3D Scene Parsing with Traditional/Learned 3D Descriptors
Existing state-of-the-art 3D point cloud understanding methods merely perform well in a fully supervised manner. To the best of our knowledge, there exists no unified framework that simultaneously solves the downstream high-level understanding tasks including both segmentation and detection, especially when labels are extremely limited. This work presents a general and simple framework to tackle point cloud understanding when labels are limited. The first contribution is that we have done extensive methodology comparisons of traditional and learned 3D descriptors for the task of weakly supervised 3D scene understanding, and validated that our adapted traditional PFH-based 3D descriptors show excellent generalization ability across different domains. The second contribution is that we proposed a learning-based region merging strategy based on the affinity provided by both the traditional/learned 3D descriptors and learned semantics. The merging process takes both low-level geometric and high-level semantic feature correlations into consideration. Experimental results demonstrate that our framework has the best performance among the three most important weakly supervised point clouds understanding tasks including semantic segmentation, instance segmentation, and object detection even when very limited number of points are labeled. Our method, termed Region Merging 3D (RM3D), has superior performance on ScanNet data-efficient learning online benchmarks and other four large-scale 3D understanding benchmarks under various experimental settings, outperforming current arts by a margin for various 3D understanding tasks without complicated learning strategies such as active learning.
Robust Framework for Explanation Evaluation in Time Series Classification
Nguyen, Thu Trang, Nguyen, Thach Le, Ifrim, Georgiana
Time series classification is a task which deals with a prevalent data type, temporal sequences, common in domains such as human activity recognition, sports analytics and general healthcare. This paper provides a framework to quantitatively evaluate and rank explanation methods for time series classification. The recent interest in explanation methods for time series has provided a great variety of explanation techniques. Nevertheless, when the explanations disagree on a specific problem, it remains unclear which of them to use. Comparing multiple explanations to find the right answer is non-trivial. Two key challenges remain: how to quantitatively and robustly evaluate the informativeness of a given explanation method (i.e., relevance for the classification task), and how to compare explanation methods side-by-side. We propose AMEE, a robust Model-Agnostic Explanation Evaluation framework for evaluating and comparing multiple saliency-based explanations for time series classification. In this approach, data perturbation is added to the input time series guided by each explanation. The impact of perturbation on classification accuracy is then measured and used for explanation evaluation. The results show that perturbing discriminative parts of the time series leads to significant changes in classification accuracy which can be used to evaluate each explanation. To be robust to different types of perturbations and different types of classifiers, we aggregate the accuracy loss across perturbations and classifiers. This novel approach allows us to quantify and rank different explanation methods. We provide a quantitative and qualitative analysis for synthetic datasets, a variety of time-series datasets, as well as a real-world dataset with known expert ground truth.
A view on Machine Learning operations infrastructure
Generating a working (value-generating) machine learning model is not an easy task. It usually involves advanced modelling techniques and teams with scarce skills. However, this is only the first step on an even more complex task: deploying the model into production and preventing its degradation. Even being alleviated by the cloud shift, at least two-thirds of IT spent is still concentrated on maintenance-mode tasks. There is still little research about where this split holds for ML related projects or not, but my take is that this percentage will even increase significantly due to the fact that an ML workload has more "liquid" inputs and fewer control levers as shown below: In essence, maintenance is mainly driven by the level of variability and control we may have over the different components on the system.