archive
Safe Distributionally Robust Feature Selection under Covariate Shift
Hanada, Hiroyuki, Akahane, Satoshi, Hashimoto, Noriaki, Takeno, Shion, Takeuchi, Ichiro
In practical machine learning, the environments encountered during the model development and deployment phases often differ, especially when a model is used by many users in diverse settings. Learning models that maintain reliable performance across plausible deployment environments is known as distributionally robust (DR) learning. In this work, we study the problem of distributionally robust feature selection (DRFS), with a particular focus on sparse sensing applications motivated by industrial needs. In practical multi-sensor systems, a shared subset of sensors is typically selected prior to deployment based on performance evaluations using many available sensors. At deployment, individual users may further adapt or fine-tune models to their specific environments. When deployment environments differ from those anticipated during development, this strategy can result in systems lacking sensors required for optimal performance. To address this issue, we propose safe-DRFS, a novel approach that extends safe screening from conventional sparse modeling settings to a DR setting under covariate shift. Our method identifies a feature subset that encompasses all subsets that may become optimal across a specified range of input distribution shifts, with finite-sample theoretical guarantees of no false feature elimination.
An Enhanced Projection Pursuit Tree Classifier with Visual Methods for Assessing Algorithmic Improvements
da Silva, Natalia, Cook, Dianne, Lee, Eun-Kyung
This paper presents enhancements to the projection pursuit tree classifier and visual diagnostic methods for assessing their impact in high dimensions. The original algorithm uses linear combinations of variables in a tree structure where depth is constrained to be less than the number of classes -- a limitation that proves too rigid for complex classification problems. Our extensions improve performance in multi-class settings with unequal variance-covariance structures and nonlinear class separations by allowing more splits and more flexible class groupings in the projection pursuit computation. Proposing algorithmic improvements is straightforward; demonstrating their actual utility is not. We therefore develop two visual diagnostic approaches to verify that the enhancements perform as intended. Using high-dimensional visualization techniques, we examine model fits on benchmark datasets to assess whether the algorithm behaves as theorized. An interactive web application enables users to explore the behavior of both the original and enhanced classifiers under controlled scenarios. The enhancements are implemented in the R package PPtreeExt.
- Europe > Switzerland > Zürich > Zürich (1.00)
- North America > United States > New York > Kings County > New York City (0.40)
- Europe > Germany (0.04)
- (8 more...)
Arbitrarily Scalable Environment Generators via Neural Cellular Automata
We study the problem of generating arbitrarily large environments to improve the throughput of multi-robot systems. Prior work proposes Quality Diversity (QD) algorithms as an effective method for optimizing the environments of automated warehouses. However, these approaches optimize only relatively small environments, falling short when it comes to replicating real-world warehouse sizes. The challenge arises from the exponential increase in the search space as the environment size increases.
- North America > United States > California (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.94)
- Pacific Ocean (0.04)
- North America > United States > Oregon (0.04)
- North America > Canada > Quebec (0.04)
- (7 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Military (1.00)
- Law (0.67)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > China > Liaoning Province > Dalian (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > China > Liaoning Province > Dalian (0.04)