AITopics

2503.14473

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

arXiv.org Artificial IntelligenceMar-10-2025

Kernel-based estimators for functional causal effects

Raykov, Yordan P., Luo, Hengrui, Strait, Justin D., KhudaBukhsh, Wasiur R.

We propose causal effect estimators based on empirical Fr\'{e}chet means and operator-valued kernels, tailored to functional data spaces. These methods address the challenges of high-dimensionality, sequential ordering, and model complexity while preserving robustness to treatment misspecification. Using structural assumptions, we obtain compact representations of potential outcomes, enabling scalable estimation of causal effects over time and across covariates. We provide both theoretical, regarding the consistency of functional causal effects, as well as empirical comparison of a range of proposed causal effect estimators. Applications to binary treatment settings with functional outcomes illustrate the framework's utility in biomedical monitoring, where outcomes exhibit complex temporal dynamics. Our estimators accommodate scenarios with registered covariates and outcomes, aligning them to the Fr\'{e}chet means, as well as cases requiring higher-order representations to capture intricate covariate-outcome interactions. These advancements extend causal inference to dynamic and non-linear domains, offering new tools for understanding complex treatment effects in functional data settings.

artificial intelligence, estimator, machine learning, (18 more...)

2503.05024

Country:

North America > United States (1.00)
Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.14)

Genre: Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (0.67)
Health & Medicine > Therapeutic Area > Musculoskeletal (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.45)

arXiv.org Machine LearningFeb-18-2025

Asymptotic Optimism of Random-Design Linear and Kernel Regression Models

Luo, Hengrui, Zhu, Yunzhang

We derived the closed-form asymptotic optimism of linear regression models under random designs, and generalizes it to kernel ridge regression. Using scaled asymptotic optimism as a generic predictive model complexity measure, we studied the fundamental different behaviors of linear regression model, tangent kernel (NTK) regression model and three-layer fully connected neural networks (NN). Our contribution is two-fold: we provided theoretical ground for using scaled optimism as a model predictive complexity measure; and we show empirically that NN with ReLUs behaves differently from kernel models under this measure. With resampling techniques, we can also compute the optimism for regression models with real data.

artificial intelligence, machine learning, optimism, (17 more...)

2502.12999

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.67)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Machine LearningNov-7-2024

Compactly-supported nonstationary kernels for computing exact Gaussian processes on big data

Risser, Mark D., Noack, Marcus M., Luo, Hengrui, Pandolfi, Ronald

The Gaussian process (GP) is a widely used probabilistic machine learning method for stochastic function approximation, stochastic modeling, and analyzing real-world measurements of nonlinear processes. Unlike many other machine learning methods, GPs include an implicit characterization of uncertainty, making them extremely useful across many areas of science, technology, and engineering. Traditional implementations of GPs involve stationary kernels (also termed covariance functions) that limit their flexibility and exact methods for inference that prevent application to data sets with more than about ten thousand points. Modern approaches to address stationarity assumptions generally fail to accommodate large data sets, while all attempts to address scalability focus on approximating the Gaussian likelihood, which can involve subjectivity and lead to inaccuracies. In this work, we explicitly derive an alternative kernel that can discover and encode both sparsity and nonstationarity. We embed the kernel within a fully Bayesian GP model and leverage high-performance computing resources to enable the analysis of massive data sets. We demonstrate the favorable performance of our novel kernel relative to existing exact and approximate GP methods across a variety of synthetic data examples. Furthermore, we conduct space-time prediction based on more than one million measurements of daily maximum temperature and verify that our results outperform state-of-the-art methods in the Earth sciences. More broadly, having access to exact GPs that use ultra-scalable, sparsity-discovering, nonstationary kernels allows GP methods to truly compete with a wide variety of machine learning methods.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2411.05869

Country:

North America > United States > Texas > Harris County > Houston (0.14)
North America > United States > California (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

arXiv.org Machine LearningOct-3-2024

Ranking Perspective for Tree-based Methods with Applications to Symbolic Feature Selection

Luo, Hengrui, Li, Meng

Tree-based methods are powerful nonparametric techniques in statistics and machine learning. However, their effectiveness, particularly in finite-sample settings, is not fully understood. Recent applications have revealed their surprising ability to distinguish transformations (which we call symbolic feature selection) that remain obscure under current theoretical understanding. This work provides a finite-sample analysis of tree-based methods from a ranking perspective. We link oracle partitions in tree methods to response rankings at local splits, offering new insights into their finite-sample behavior in regression and feature selection tasks. Building on this local ranking perspective, we extend our analysis in two ways: (i) We examine the global ranking performance of individual trees and ensembles, including Classification and Regression Trees (CART) and Bayesian Additive Regression Trees (BART), providing finite-sample oracle bounds, ranking consistency, and posterior contraction results. (ii) Inspired by the ranking perspective, we propose concordant divergence statistics $\mathcal{T}_0$ to evaluate symbolic feature mappings and establish their properties. Numerical experiments demonstrate the competitive performance of these statistics in symbolic feature selection tasks compared to existing methods.

artificial intelligence, machine learning, partition, (16 more...)

2410.02623

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Energy (0.45)
Government > Regional Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

arXiv.org Artificial IntelligenceOct-30-2023

Topological Learning for Motion Data via Mixed Coordinates

Luo, Hengrui, Kim, Jisu, Patania, Alice, Vejdemo-Johansson, Mikael

Topology can extract the structural information in a dataset efficiently. In this paper, we attempt to incorporate topological information into a multiple output Gaussian process model for transfer learning purposes. To achieve this goal, we extend the framework of circular coordinates into a novel framework of mixed valued coordinates to take linear trends in the time series into consideration. One of the major challenges to learn from multiple time series effectively via a multiple output Gaussian process model is constructing a functional kernel. We propose to use topologically induced clustering to construct a cluster based kernel in a multiple output Gaussian process model. This kernel not only incorporates the topological structural information, but also allows us to put forward a unified framework using topological information in time and motion series.

artificial intelligence, circular coordinate, machine learning, (18 more...)

doi: 10.1109/BigData52589.2021.9671525

2310.1996

Country:

North America > United States (0.93)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

arXiv.org Machine LearningSep-18-2023

A Unifying Perspective on Non-Stationary Kernels for Deeper Gaussian Processes

Noack, Marcus M., Luo, Hengrui, Risser, Mark D.

The Gaussian process (GP) is a popular statistical technique for stochastic function approximation and uncertainty quantification from data. GPs have been adopted into the realm of machine learning in the last two decades because of their superior prediction abilities, especially in data-sparse scenarios, and their inherent ability to provide robust uncertainty estimates. Even so, their performance highly depends on intricate customizations of the core methodology, which often leads to dissatisfaction among practitioners when standard setups and off-the-shelf software tools are being deployed. Arguably the most important building block of a GP is the kernel function which assumes the role of a covariance operator. Stationary kernels of the Mat\'ern class are used in the vast majority of applied studies; poor prediction performance and unrealistic uncertainty quantification are often the consequences. Non-stationary kernels show improved performance but are rarely used due to their more complicated functional form and the associated effort and expertise needed to define and tune them optimally. In this perspective, we want to help ML practitioners make sense of some of the most common forms of non-stationarity for Gaussian processes. We show a variety of kernels in action using representative datasets, carefully study their properties, and compare their performances. Based on our findings, we propose a new kernel that combines some of the identified advantages of existing kernels.

artificial intelligence, machine learning, modeling & simulation, (20 more...)

2309.10068

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

arXiv.org Artificial IntelligenceAug-29-2023

Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems

Cho, Younghyun, Demmel, James W., Dereziński, Michał, Li, Haoyun, Luo, Hengrui, Mahoney, Michael W., Murray, Riley J.

Algorithms from Randomized Numerical Linear Algebra (RandNLA) are known to be effective in handling high-dimensional computational problems, providing high-quality empirical performance as well as strong probabilistic guarantees. However, their practical application is complicated by the fact that the user needs to set various algorithm-specific tuning parameters which are different than those used in traditional NLA. This paper demonstrates how a surrogate-based autotuning approach can be used to address fundamental problems of parameter selection in RandNLA algorithms. In particular, we provide a detailed investigation of surrogate-based autotuning for sketch-and-precondition (SAP) based randomized least squares methods, which have been one of the great success stories in modern RandNLA. Empirical results show that our surrogate-based autotuning approach can achieve near-optimal performance with much less tuning cost than a random search (up to about 4x fewer trials of different parameter configurations). Moreover, while our experiments focus on least squares, our results demonstrate a general-purpose autotuning pipeline applicable to any kind of RandNLA algorithm.

artificial intelligence, randomized sketching algorithm, surrogate-based autotuning, (1 more...)

2308.1572

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.53)

arXiv.org Artificial IntelligenceAug-18-2023

Hybrid Models for Mixed Variables in Bayesian Optimization

Luo, Hengrui, Cho, Younghyun, Demmel, James W., Li, Xiaoye S., Liu, Yang

This paper presents a new type of hybrid models for Bayesian optimization (BO) adept at managing mixed variables, encompassing both quantitative (continuous and integer) and qualitative (categorical) types. Our proposed new hybrid models merge Monte Carlo Tree Search structure (MCTS) for categorical variables with Gaussian Processes (GP) for continuous ones. Addressing efficiency in searching phase, we juxtapose the original (frequentist) upper confidence bound tree search (UCTS) and the Bayesian Dirichlet search strategies, showcasing the tree architecture's integration into Bayesian optimization. Central to our innovation in surrogate modeling phase is online kernel selection for mixed-variable BO. Our innovations, including dynamic kernel selection, unique UCTS (hybridM) and Bayesian update strategies (hybridD), position our hybrid models as an advancement in mixed-variable surrogate models. Numerical experiments underscore the hybrid models' superiority, highlighting their potential in Bayesian optimization. Keywords: Gaussian processes, Monte Carlo tree search, categorical variables, online kernel selection. The discussion of different types of encodings can be found in Cerda et al. (2018). 1 Introduction Our motivating problem is to optimize a "black-box" function with "mixed" variables, lacking an analytic expression. "Mixed" signifies the function's input variables comprise both continuous (quantitative) and categorical (qualitative) variables, common in machine learning and scientific computing tasks like performance tuning of mathematical libraries and application codes at runtime and compile-time (Balaprakash et al., 2018). Bayesian optimization (BO) with Gaussian process (GP) surrogate models is a prevalent method for optimizing noisy, expensive black-box functions, primarily designed for continuous-variable functions (Shahriari et al., 2016; Sid-Lakhdar et al., 2020). Extending BO to mixed-variable functions presents theoretical and computational challenges due to variable type differences (Table 1). Continuous variables have uncountably many values with magnitudes and intrinsic ordering, allowing natural gradient definition. In contrast, categorical variables, having finitely many values without intrinsic ordering or magnitude, require encoding in the GP context, potentially inducing discontinuity and degrading GP performance (Luo et al., 2021). The empirical rule of thumb for handling an integer variable (Karlsson et al., 2020) is to treat it as a categorical variable if the number of integer values (i.e., number of categorical values) is small, or as a continuous variable with embedding (a.k.a.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2206.01409

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (0.81)

Industry:

Energy (1.00)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

arXiv.org Artificial IntelligenceJun-1-2023

Efficient and Robust Bayesian Selection of Hyperparameters in Dimension Reduction for Visualization

Liao, Yin-Ting, Luo, Hengrui, Ma, Anna

We introduce an efficient and robust auto-tuning framework for hyperparameter selection in dimension reduction (DR) algorithms, focusing on large-scale datasets and arbitrary performance metrics. By leveraging Bayesian optimization (BO) with a surrogate model, our approach enables efficient hyperparameter selection with multi-objective trade-offs and allows us to perform data-driven sensitivity analysis. By incorporating normalization and subsampling, the proposed framework demonstrates versatility and efficiency, as shown in applications to visualization techniques such as t-SNE and UMAP. We evaluate our results on various synthetic and real-world datasets using multiple quality metrics, providing a robust and efficient solution for hyperparameter selection in DR algorithms.

artificial intelligence, machine learning, optimization problem, (17 more...)

2306.00357

Country:

North America > United States > California (0.14)
Africa > Middle East > Morocco (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.61)