dimension reduction
Dimension-reduced outcome-weighted learning for estimating individualized treatment regimes in observational studies
Son, Sungtaek, Lila, Eardi, Chan, Kwun Chuen Gary
Individualized treatment regimes (ITRs) aim to improve clinical outcomes by assigning treatment based on patient-specific characteristics. However, existing methods often struggle with high-dimensional covariates, limiting accuracy, interpretability, and real-world applicability. We propose a novel sufficient dimension reduction approach that directly targets the contrast between potential outcomes and identifies a low-dimensional subspace of the covariates capturing treatment effect heterogeneity. This reduced representation enables more accurate estimation of optimal ITRs through outcome-weighted learning. To accommodate observational data, our method incorporates kernel-based covariate balancing, allowing treatment assignment to depend on the full covariate set and avoiding the restrictive assumption that the subspace sufficient for modeling heterogeneous treatment effects is also sufficient for confounding adjustment. We show that the proposed method achieves universal consistency, i.e., its risk converges to the Bayes risk, under mild regularity conditions. We demonstrate its finite sample performance through simulations and an analysis of intensive care unit sepsis patient data to determine who should receive transthoracic echocardiography.
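A minimal sketch of the outcome-weighted learning step on a reduced covariate, using scikit-learn on synthetic data. The projection matrix B below is only a placeholder for the paper's estimated SDR subspace, and plain inverse-propensity weights stand in for the kernel covariate balancing described above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy observational data: covariates X, binary treatment A, outcome Y (larger is better).
n, p, d = 500, 10, 2
X = rng.normal(size=(n, p))
A = rng.binomial(1, 1.0 / (1.0 + np.exp(-X[:, 0])))
Y = 1.0 + X[:, 1] * (2 * A - 1) + rng.normal(scale=0.5, size=n)

# Placeholder for the SDR step: a d-dimensional projection of the covariates.
B = rng.normal(size=(p, d))            # in the paper, B would be estimated from the data
Z = X @ B

# Inverse-propensity weights; the paper uses kernel covariate balancing instead.
ps = LogisticRegression().fit(X, A).predict_proba(X)[:, 1]
w = (Y - Y.min() + 1e-3) / np.where(A == 1, ps, 1.0 - ps)

# Outcome-weighted learning: weighted classification of treatment on the reduced
# covariates; the fitted classifier's prediction is the estimated treatment rule.
owl = SVC(kernel="rbf").fit(Z, A, sample_weight=w)
recommended = owl.predict(Z)
```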
Nonlinear Sufficient Dimension Reduction with a Stochastic Neural Network
Sufficient dimension reduction is a powerful tool to extract core information hidden in the high-dimensional data and has potentially many important applications in machine learning tasks. However, the existing nonlinear sufficient dimension reduction methods often lack the scalability necessary for dealing with large-scale data. We propose a new type of stochastic neural network under a rigorous probabilistic framework and show that it can be used for sufficient dimension reduction for large-scale data. The proposed stochastic neural network is trained using an adaptive stochastic gradient Markov chain Monte Carlo algorithm, whose convergence is rigorously studied in the paper as well. Through extensive experiments on real-world classification and regression problems, we show that the proposed method compares favorably with the existing state-of-the-art sufficient dimension reduction methods and is computationally more efficient for large-scale data.
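A toy illustration of the idea, assuming PyTorch: a linear stochastic reduction layer with injected Gaussian noise, trained with a plain SGLD-style update. The paper's adaptive stochastic gradient MCMC algorithm and deep architecture are not reproduced here.

```python
import torch

torch.manual_seed(0)

# Toy regression where y depends on a one-dimensional reduction of x.
n, p, d = 1000, 20, 1
x = torch.randn(n, p)
y = torch.sin(x[:, :1]) + 0.1 * torch.randn(n, 1)

# Two-layer network whose hidden layer is "stochastic": Gaussian noise is injected
# into the latent reduction z (a simplification of the paper's stochastic units).
W1 = torch.randn(p, d, requires_grad=True)
W2 = torch.randn(d, 1, requires_grad=True)

lr, noise = 1e-3, 1e-3
for step in range(2000):
    idx = torch.randint(0, n, (100,))                   # mini-batch (stochastic gradient)
    z = x[idx] @ W1 + noise * torch.randn(100, d)       # stochastic latent layer
    loss = ((torch.tanh(z) @ W2 - y[idx]) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        for w in (W1, W2):
            # SGLD-style update: gradient step plus injected Gaussian noise.
            w -= lr * w.grad + (2 * lr) ** 0.5 * noise * torch.randn_like(w)
            w.grad.zero_()

z_hat = x @ W1.detach()    # estimated low-dimensional sufficient reduction
```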
Structure-Preserving Nonlinear Sufficient Dimension Reduction for Tensors
Lin, Dianjun, Li, Bing, Xue, Lingzhou
We introduce two nonlinear sufficient dimension reduction methods for regressions with tensor-valued predictors. Our goal is two-fold: the first is to preserve the tensor structure when performing dimension reduction, particularly the meaning of the tensor modes, for improved interpretation; the second is to substantially reduce the number of parameters in dimension reduction, thereby achieving model parsimony and enhancing estimation accuracy. Our two tensor dimension reduction methods echo the two commonly used tensor decomposition mechanisms: one is the Tucker decomposition, which reduces a larger tensor to a smaller one; the other is the CP decomposition, which represents an arbitrary tensor as a sum of rank-one tensors. We establish the Fisher consistency of our methods at the population level, as well as their consistency and convergence rates at the sample level. Both methods are easy to implement numerically: the Tucker-form can be implemented through a sequence of least-squares steps, and the CP-form can be implemented through a sequence of singular value decompositions. We investigate the finite-sample performance of our methods and show substantial improvements in accuracy over existing methods in simulations and two data applications.
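A rank-one sketch of the "sequence of least-squares steps" idea for matrix-valued predictors, using NumPy on synthetic data; the paper's full Tucker-form and CP-form estimators are not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: matrix-valued (order-2 tensor) predictors X_i of size p1 x p2.
n, p1, p2 = 300, 8, 6
X = rng.normal(size=(n, p1, p2))
u1_true, u2_true = rng.normal(size=p1), rng.normal(size=p2)
y = np.einsum("ijk,j,k->i", X, u1_true, u2_true) + 0.1 * rng.normal(size=n)

# Alternating least squares in the spirit of the Tucker-form procedure:
# fix the direction for one mode, solve least squares for the other, and repeat.
u1, u2 = rng.normal(size=p1), rng.normal(size=p2)
for _ in range(50):
    Z1 = np.einsum("ijk,k->ij", X, u2)        # collapse mode 2, regress y on mode-1 features
    u1, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    Z2 = np.einsum("ijk,j->ik", X, u1)        # collapse mode 1, regress y on mode-2 features
    u2, *_ = np.linalg.lstsq(Z2, y, rcond=None)

reduction = np.einsum("ijk,j,k->i", X, u1, u2)   # structure-preserving one-dimensional reduction
```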
Sufficient dimension reduction for classification using principal optimal transport direction
Sufficient dimension reduction is used pervasively as a supervised dimension reduction approach. Most existing sufficient dimension reduction methods are developed for data with a continuous response and may perform unsatisfactorily for categorical responses, especially binary responses. To address this issue, we propose a novel estimation method for the sufficient dimension reduction subspace (SDR subspace) using optimal transport. The proposed method, named principal optimal transport direction (POTD), estimates the basis of the SDR subspace using the principal directions of the optimal transport coupling between the data belonging to different response categories. The proposed method also reveals the relationship among three seemingly unrelated topics, i.e., sufficient dimension reduction, support vector machines, and optimal transport. We study the asymptotic properties of POTD and show that when the class labels contain no error, POTD estimates the SDR subspace exhaustively. Empirical studies show that POTD outperforms most state-of-the-art linear dimension reduction methods.
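A rough illustration of the POTD idea on synthetic binary-response data: with equal class sizes and uniform weights, the optimal transport coupling reduces to an assignment problem, and the leading singular directions of the matched displacements serve as a stand-in for the estimated SDR directions. This is a simplification, not the paper's estimator.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

# Binary-response toy data in which only the first coordinate separates the classes.
n, p = 200, 5
X0 = rng.normal(size=(n, p))
X1 = rng.normal(size=(n, p))
X1[:, 0] += 2.0

# Optimal transport between the two class samples; with equal sizes and uniform weights
# the coupling reduces to an assignment problem (a simplification of the general case).
cost = ((X0[:, None, :] - X1[None, :, :]) ** 2).sum(axis=2)
rows, cols = linear_sum_assignment(cost)

# Leading singular directions of the matched displacement vectors approximate the
# principal optimal transport directions spanning the SDR subspace.
D = X1[cols] - X0[rows]
_, _, Vt = np.linalg.svd(D, full_matrices=False)
sdr_direction = Vt[0]
```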
On Conditional Stochastic Interpolation for Generative Nonlinear Sufficient Dimension Reduction
Xu, Shuntuo, Yu, Zhou, Huang, Jian
Identifying low-dimensional sufficient structures in nonlinear sufficient dimension reduction (SDR) has long been a fundamental yet challenging problem. Most existing methods lack theoretical guarantees of exhaustiveness in identifying lower-dimensional structures, either at the population level or at the sample level. We tackle this issue by proposing a new method, generative sufficient dimension reduction (GenSDR), which leverages modern generative models. We show that GenSDR is able to fully recover the information contained in the central $\sigma$-field at both the population and sample levels. In particular, at the sample level, we establish a consistency property for the GenSDR estimator from the perspective of conditional distributions, capitalizing on the distributional learning capabilities of deep generative models. Moreover, by incorporating an ensemble technique, we extend GenSDR to accommodate scenarios with non-Euclidean responses, thereby substantially broadening its applicability. Extensive numerical results demonstrate the outstanding empirical performance of GenSDR and highlight its strong potential for addressing a wide range of complex, real-world tasks.
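A loose sketch of learning a reduction jointly with a conditional generator, assuming PyTorch. The energy-score objective below is only a crude stand-in for the paper's conditional stochastic interpolation training, and all architecture choices are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data: the conditional law of Y given X depends on X only through x1.
n, p = 2000, 10
X = torch.randn(n, p)
Y = torch.sin(X[:, :1]) + 0.3 * torch.randn(n, 1)

encoder = nn.Linear(p, 1)                                     # candidate reduction R(X)
generator = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(list(encoder.parameters()) + list(generator.parameters()), lr=1e-3)

for step in range(2000):
    idx = torch.randint(0, n, (256,))
    z = encoder(X[idx])
    y1 = generator(torch.cat([z, torch.randn(256, 1)], dim=1))   # two conditional draws
    y2 = generator(torch.cat([z, torch.randn(256, 1)], dim=1))
    # Energy-score objective for conditional distribution matching; a crude stand-in
    # for the paper's conditional stochastic interpolation objective.
    loss = 0.5 * ((y1 - Y[idx]).abs() + (y2 - Y[idx]).abs()).mean() - 0.5 * (y1 - y2).abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

reduction = encoder(X).detach()          # learned low-dimensional generative reduction
```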
Fourier-Invertible Neural Encoder (FINE) for Homogeneous Flows
Ouyang, Anqiao, Ke, Hongyi, Wang, Qi
We present the Fourier-Invertible Neural Encoder (FINE), a compact and interpretable architecture for dimension reduction in translation-equivariant datasets. FINE integrates reversible filters and monotonic activation functions with a Fourier truncation bottleneck, achieving information-preserving compression that respects translational symmetry. This design offers a new perspective on symmetry-aware learning, linking spectral truncation to group-equivariant representations. The proposed FINE architecture is tested on a one-dimensional nonlinear wave interaction problem, a one-dimensional Kuramoto-Sivashinsky turbulence dataset, and a two-dimensional turbulence dataset. FINE achieves 4.9-9.1 times lower reconstruction error than convolutional autoencoders while using only 13-21% of their parameters. The results highlight FINE's effectiveness in representing complex physical systems with a minimal latent dimension. The proposed approach thus provides a principled framework for interpretable, low-parameter, and symmetry-preserving dimension reduction, bridging the gap between Fourier representations and modern neural architectures for scientific and physics-informed learning.
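A toy, untrained version of the encoder-decoder idea, assuming PyTorch: an invertible monotonic pointwise map followed by a Fourier truncation bottleneck, inverted by a fixed-point iteration. The learnable reversible filters of FINE are omitted.

```python
import torch

torch.manual_seed(0)

# One-dimensional, translation-equivariant toy signals (periodic waves of random frequency).
n, L, k = 64, 256, 16                     # samples, grid points, retained Fourier modes
freq = torch.randint(1, 8, (n, 1)).float()
x = torch.sin(2 * torch.pi * freq * torch.linspace(0.0, 1.0, L)[None, :])

def encode(u, a=0.1):
    v = u + a * torch.tanh(u)             # invertible monotonic pointwise map (toy "filter")
    return torch.fft.rfft(v, dim=-1)[..., :k]   # Fourier truncation bottleneck (latent code)

def decode(Z, a=0.1, iters=20):
    V = torch.zeros(Z.shape[0], L // 2 + 1, dtype=Z.dtype)
    V[..., :k] = Z
    v = torch.fft.irfft(V, n=L, dim=-1)
    u = v.clone()
    for _ in range(iters):                # fixed-point inversion of u + a*tanh(u) = v
        u = v - a * torch.tanh(u)
    return u

z = encode(x)
x_hat = decode(z)
print(float((x - x_hat).abs().mean()))    # reconstruction error from the truncated code
```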
A PLS-Integrated LASSO Method with Application in Index Tracking
Tang, Shiqin, Dong, Yining, Qin, S. Joe
In traditional multivariate data analysis, dimension reduction and regression have been treated as distinct endeavors. Established techniques such as principal component regression (PCR) and partial least squares (PLS) regression compute latent components as intermediate steps, albeit under different criteria, before proceeding with the regression analysis. In this paper, we introduce a regression methodology named PLS-integrated Lasso (PLS-Lasso) that integrates dimension reduction directly into the regression process. We present two distinct formulations, PLS-Lasso-v1 and PLS-Lasso-v2, along with clear and effective algorithms that ensure convergence to global optima. PLS-Lasso-v1 and PLS-Lasso-v2 are compared with the Lasso on the task of financial index tracking and show promising results.
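For orientation only, a sequential PLS-then-Lasso pipeline on a synthetic index-tracking problem using scikit-learn; the paper's PLS-Lasso instead folds the PLS criterion directly into a single regression objective, which is not reproduced here.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Toy index-tracking setup: y holds index returns, X the constituent asset returns.
n, p = 250, 40
X = rng.normal(scale=0.01, size=(n, p))
true_w = np.zeros(p)
true_w[:5] = 0.2
y = X @ true_w + rng.normal(scale=0.002, size=n)

# Sparse tracking portfolio via a plain Lasso (the comparison baseline in the paper).
lasso = Lasso(alpha=1e-5, max_iter=10000).fit(X, y)

# Sequential PLS followed by Lasso on the latent scores; shown only as a rough
# baseline, not as the paper's integrated PLS-Lasso formulation.
pls = PLSRegression(n_components=5).fit(X, y)
lasso_on_scores = Lasso(alpha=1e-5, max_iter=10000).fit(pls.transform(X), y)

print("nonzero Lasso weights:", int(np.sum(lasso.coef_ != 0)))
```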
Classification EM-PCA for clustering and embedding
Tighidet, Zineddine, Labiod, Lazhar, Nadif, Mohamed
The mixture model is undoubtedly one of the greatest contributions to clustering. For continuous data, Gaussian models are often used, and the Expectation-Maximization (EM) algorithm is particularly suitable for estimating the parameters from which a clustering is inferred. Although these models are popular in various domains, including image clustering, they suffer from high dimensionality and from the slow convergence of the EM algorithm. The Classification EM (CEM) algorithm, a classifying version of EM, converges quickly, but dimensionality reduction remains a challenge. In this paper we therefore propose an algorithm that performs the two tasks, data embedding and clustering, simultaneously rather than sequentially, relying on Principal Component Analysis (PCA) and CEM. We demonstrate the value of this approach for both clustering and data embedding, and we establish connections with other clustering approaches.
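A simplified NumPy sketch: a PCA embedding followed by a hard-assignment CEM loop with spherical, equal-variance Gaussians (equivalent to k-means in this special case). The paper's contribution is to couple the two steps in one criterion rather than running them sequentially as done here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: three Gaussian clusters living in a 2-D subspace of a 10-D space.
n, p, d, K = 300, 10, 2, 3
means = rng.normal(scale=4.0, size=(K, d))
Z = means[rng.integers(0, K, n)] + rng.normal(size=(n, d))
X = np.hstack([Z, rng.normal(scale=0.1, size=(n, p - d))])

# PCA embedding computed first (the paper instead couples embedding and clustering).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
E = Xc @ Vt[:d].T

# Classification EM with spherical, equal-variance Gaussians and equal proportions,
# in which case the C-step and M-step reduce to the k-means assignment and update.
centers = E[rng.choice(n, K, replace=False)]
for _ in range(50):
    labels = np.argmin(((E[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)   # C-step
    centers = np.array([E[labels == k].mean(axis=0) if np.any(labels == k)       # M-step
                        else centers[k] for k in range(K)])
```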