Towards a Statistical Understanding of Neural Networks: Beyond the Neural Tangent Kernel Theories
Zhang, Haobo, Lai, Jianfa, Li, Yicheng, Lin, Qian, Liu, Jun S.
A primary advantage of neural networks lies in their feature learning characteristics, which are challenging to analyze theoretically due to the complexity of the training dynamics. We propose a new paradigm for studying feature learning and the resulting benefits for generalization. After reviewing the neural tangent kernel (NTK) theory and recent results in kernel regression, which address the generalization issue of sufficiently wide neural networks, we examine the limitations and implications of fixed-kernel theories (such as the NTK theory) and review recent theoretical advances in feature learning. Moving beyond the fixed kernel/feature theory, we consider neural networks as adaptive feature models. Finally, we propose an over-parameterized Gaussian sequence model as a prototype for studying the feature learning characteristics of neural networks.
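To make the fixed-kernel (NTK) viewpoint concrete, here is a minimal sketch that computes one entry of the empirical neural tangent kernel for a toy two-layer ReLU network; the architecture, width, and all variable names are illustrative assumptions, not the paper's construction.

```python
# Empirical NTK entry K(x1, x2) = <grad_theta f(x1), grad_theta f(x2)> for
# f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x). Toy sketch, not the paper's code.
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 2000                        # input dimension, network width
W = rng.normal(size=(m, d))           # first-layer weights w_r
a = rng.choice([-1.0, 1.0], size=m)   # second-layer weights a_r

def ntk_entry(x1, x2):
    h1, h2 = W @ x1, W @ x2
    s1, s2 = (h1 > 0).astype(float), (h2 > 0).astype(float)
    # each parameter gradient carries a 1/sqrt(m) factor, hence the 1/m below
    term_a = np.maximum(h1, 0) @ np.maximum(h2, 0)    # grads w.r.t. a_r
    term_W = ((a**2) * s1 * s2).sum() * (x1 @ x2)     # grads w.r.t. w_r
    return (term_a + term_W) / m

x, y = rng.normal(size=d), rng.normal(size=d)
print(ntk_entry(x, y))   # approaches the deterministic infinite-width NTK as m grows
```

At initialization this empirical kernel concentrates around its infinite-width limit, which is what makes the fixed-kernel approximation of wide-network training tractable.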
Multi-Response Heteroscedastic Gaussian Process Models and Their Inference
Lee, Taehee, Liu, Jun S.
Despite their widespread use in versatile nonparametric modeling, Gaussian process models have limitations in capturing abrupt changes in function smoothness and in accommodating relationships with heteroscedastic errors. To address these shortcomings, heteroscedastic Gaussian process (HeGP) regression introduces flexibility by allowing residual variances to vary across covariates. In this work, we extend the HeGP concept beyond regression to classification and state-space models. To achieve this, we propose a novel framework in which the Gaussian process is coupled with a covariate-induced precision matrix process, adopting a mixture formulation. This approach enables the modeling of heteroscedastic covariance functions across covariates. To mitigate the computational challenges posed by sampling, we employ variational inference to approximate the posterior and facilitate posterior predictive modeling. Additionally, our training process leverages an EM algorithm with closed-form M-step updates to efficiently evaluate the heteroscedastic covariance function. A notable feature of our model is its consistent performance on multivariate responses, seamlessly accommodating both continuous and categorical types. Through simulations and real-world applications in climatology, we illustrate the model's performance and advantages. By overcoming the limitations of traditional Gaussian process models, the proposed framework offers a robust and versatile tool for a wide array of applications.
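As a point of reference for the heteroscedastic idea, the following is a minimal sketch of heteroscedastic GP regression in which the noise variance varies with the covariate; the RBF kernel, the step-shaped noise function, and all names are illustrative assumptions, and the paper's mixture formulation, variational inference, and EM updates are not shown.

```python
# Heteroscedastic GP regression sketch: the usual constant nugget sigma^2 * I
# is replaced by an input-dependent diagonal diag(sigma2(x)).
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(80, 1))
sigma2 = 0.05 + 0.3 * (X[:, 0] > 0)            # noise level jumps at x = 0
y = np.sin(X[:, 0]) + rng.normal(scale=np.sqrt(sigma2))

Xs = np.linspace(-3, 3, 200)[:, None]
K = rbf(X, X) + np.diag(sigma2)                # heteroscedastic nugget
Ks = rbf(Xs, X)
post_mean = Ks @ np.linalg.solve(K, y)         # posterior predictive mean
# the fit smooths more aggressively where sigma2(x) is large (x > 0)
```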
Neural Gaussian Mirror for Controlled Feature Selection in Neural Networks
Xing, Xin, Gui, Yu, Dai, Chenguang, Liu, Jun S.
Deep neural networks (DNNs) have become increasingly popular and achieved outstanding performance in predictive tasks. However, the DNN framework itself cannot inform the user which features are more or less relevant to the prediction, which limits its applicability in many scientific fields. We introduce neural Gaussian mirrors (NGMs), in which mirrored features are created via a structured perturbation based on a kernel-based conditional dependence measure to help evaluate feature importance. We design two modifications of the DNN architecture for incorporating mirrored features and providing mirror statistics that measure feature importance. As shown in simulated and real data examples, the proposed method controls the feature selection error rate at a predefined level and maintains high selection power even in the presence of highly correlated features.
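For intuition about mirror statistics, here is a minimal sketch of the Gaussian-mirror construction in a plain linear model, with an i.i.d. Gaussian perturbation in place of the paper's kernel-based conditional-dependence perturbation and an ordinary least-squares fit in place of a DNN; the constant c, the threshold rule, and all names are illustrative assumptions.

```python
# Each feature x_j is split into x_j + c*z_j and x_j - c*z_j; the mirror
# statistic M_j contrasts the two fitted coefficients. Symmetric-around-zero
# M_j for null features permits a data-driven FDR threshold.
import numpy as np

rng = np.random.default_rng(2)
n, p, c = 500, 20, 1.0
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:3] = 2.0             # 3 relevant features
y = X @ beta + rng.normal(size=n)

M = np.empty(p)
for j in range(p):
    z = rng.normal(size=n)
    Xj = np.column_stack([np.delete(X, j, axis=1),
                          X[:, j] + c * z, X[:, j] - c * z])
    bhat = np.linalg.lstsq(Xj, y, rcond=None)[0]
    bp, bm = bhat[-2], bhat[-1]
    M[j] = abs(bp + bm) - abs(bp - bm)         # large M_j suggests relevance

# pick the smallest t with #{M_j < -t} / #{M_j > t} <= q (estimated FDR)
q = 0.1
for t in np.sort(np.abs(M)):
    if (M < -t).sum() / max((M > t).sum(), 1) <= q:
        break
print("selected:", np.where(M > t)[0])
```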
Generative Parameter Sampler For Scalable Uncertainty Quantification
Shin, Minsuk, Lee, Young, Liu, Jun S.
Uncertainty quantification has been a core task in statistical machine learning, but its computational cost remains a serious challenge for both Bayesians and frequentists. We propose a model-based framework for quantifying uncertainty, called the predictive-matching Generative Parameter Sampler (GPS). This procedure considers an uncertainty quantification (UQ) distribution on the target parameter that matches the corresponding predictive distribution to the observed data. The framework adopts a hierarchical modeling perspective in which each observation is modeled by an individual parameter. This individual parameterization makes the resulting inference computationally scalable and robust to outliers. Our approach is illustrated on linear models, Poisson processes, and deep neural networks for classification. The results show that the GPS successfully provides uncertainty quantification, along with flexibility beyond what is allowed by classical statistical procedures under the postulated statistical models.
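As a toy illustration of predictive matching, the sketch below fits a crude generator for a one-dimensional location model by matching sorted predictive samples to sorted observations; the generator form, the sorted-sample (1-D Wasserstein-style) loss, and the optimizer are our assumptions, not the paper's algorithm.

```python
# y_i ~ N(theta_i, 1) with an individual parameter per observation; the
# generator theta = mu + exp(log_s) * z maps noise z to parameters, trained
# so that the predictive draws match the empirical distribution of y.
import numpy as np

rng = np.random.default_rng(3)
y = np.sort(rng.normal(loc=2.0, scale=1.5, size=500))   # observed data, sorted once

mu, log_s, lr = 0.0, 0.0, 0.05
for _ in range(400):
    z = rng.normal(size=y.size)
    eps = rng.normal(size=y.size)
    x = mu + np.exp(log_s) * z + eps           # predictive draws y_i | theta_i
    order = np.argsort(x)
    grad_x = np.empty_like(x)
    grad_x[order] = x[order] - y               # residuals of sorted match, per sample
    mu    -= lr * grad_x.mean()                # .mean() supplies the 1/n factor
    log_s -= lr * (grad_x * z).mean() * np.exp(log_s)

print(mu, np.exp(log_s))   # location near 2.0; scale absorbs the extra spread
```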
IMMIGRATE: A Margin-based Feature Selection Method with Interaction Terms
Zhao, Ruzhang, Hong, Pengyu, Liu, Jun S.
By balancing margin-quantity maximization and margin-quality maximization, the proposed IMMIGRATE algorithm considers both local and global information within margin-based frameworks. We derive a new mathematical interpretation of the margin-based cost function using the quadratic-form distance (QFD) and applying both the large-margin and max-min entropy principles. We also design a new principle for classifying new samples and propose a Bayesian framework to iteratively minimize the cost function. We demonstrate the power of the new method by comparing it with 16 widely used classifiers (e.g., support vector machines, k-nearest neighbors, and RELIEF), including some that can identify interaction terms (e.g., SODA and hierNet), on a synthetic dataset, five gene expression datasets, and twenty UCI machine learning datasets. Our method outperforms the other methods in most cases.
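To fix ideas, here is a minimal sketch of a hypothesis margin computed under a quadratic-form distance, the basic object the IMMIGRATE cost function is built on; the weight matrix W is fixed by hand rather than learned, and the data and names are illustrative.

```python
# Hypothesis margin under d_W(a, b) = (a - b)^T W (a - b): distance to the
# nearest miss minus distance to the nearest hit. Off-diagonal entries of W
# weight feature *interactions*, which plain RELIEF-style weights cannot.
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 4))
ylab = (X[:, 0] * X[:, 1] > 0).astype(int)     # interaction-driven labels

W = np.eye(4)
W[0, 1] = W[1, 0] = 0.5                        # toy interaction weight (W stays PSD)

def qfd(a, b):
    d = a - b
    return d @ W @ d

def margin(i):
    d = np.array([qfd(X[i], X[j]) if j != i else np.inf for j in range(len(X))])
    hit = np.min(d[ylab == ylab[i]])           # self excluded via inf
    miss = np.min(d[ylab != ylab[i]])
    return miss - hit

print("mean margin:", np.mean([margin(i) for i in range(len(X))]))
```

IMMIGRATE itself alternates between updating W to maximize (entropy-regularized) margins and reweighting samples, which this fixed-W sketch omits.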
A Unified Theory of Confidence Regions and Testing for High Dimensional Estimating Equations
Neykov, Matey, Ning, Yang, Liu, Jun S., Liu, Han
We propose a new inferential framework for constructing confidence regions and testing hypotheses in statistical models specified by a system of high-dimensional estimating equations. We construct an influence function by projecting the fitted estimating equations onto a sparse direction obtained by solving a large-scale linear program. Our main theoretical contribution is a unified Z-estimation theory of confidence regions for high-dimensional problems. Unlike existing methods, all of which require the specification of a likelihood or pseudo-likelihood, our framework is likelihood-free. As a result, it provides valid inference for a broad class of high-dimensional constrained estimating equation problems not covered by existing methods, including noisy compressed sensing, instrumental variable regression, undirected graphical models, discriminant analysis, and vector autoregressive models. We present detailed theoretical results for all these examples. Finally, we conduct thorough numerical simulations and a real-data analysis to support the theoretical results.
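The projection step can be phrased as a CLIME-type linear program: find a sparse direction u minimizing ||u||_1 subject to ||Σ̂u − e_j||_∞ ≤ λ. The sketch below solves it with scipy's linprog; the tuning parameter, the specific estimating equations, and the downstream test statistic are omitted, and the function name is ours.

```python
# Sparse projection direction via LP: split u = u_plus - u_minus with both
# parts nonnegative, so ||u||_1 becomes a linear objective.
import numpy as np
from scipy.optimize import linprog

def sparse_direction(Sigma, j, lam):
    p = Sigma.shape[0]
    e = np.zeros(p); e[j] = 1.0
    c = np.ones(2 * p)                          # objective: sum(u_plus + u_minus)
    A = np.vstack([np.hstack([Sigma, -Sigma]),  #  Sigma u - e <=  lam
                   np.hstack([-Sigma, Sigma])]) # -Sigma u + e <=  lam
    b = np.concatenate([e + lam, lam - e])
    res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * (2 * p))
    return res.x[:p] - res.x[p:]

rng = np.random.default_rng(8)
Sigma = np.cov(rng.normal(size=(200, 10)), rowvar=False)
print(sparse_direction(Sigma, 0, lam=0.1).round(2))  # roughly e_0 for Sigma near I
```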
Signed Support Recovery for Single Index Models in High-Dimensions
Neykov, Matey, Lin, Qian, Liu, Jun S.
In this paper, we study the support recovery problem for single index models $Y=f(\boldsymbol{X}^{\intercal} \boldsymbol{\beta},\varepsilon)$, where $f$ is an unknown link function, $\boldsymbol{X}\sim N_p(0,\mathbb{I}_{p})$, and $\boldsymbol{\beta}$ is an $s$-sparse unit vector with $\boldsymbol{\beta}_{i}\in \{\pm\frac{1}{\sqrt{s}},0\}$. In particular, we study the performance of two computationally inexpensive algorithms: (a) the diagonal thresholding sliced inverse regression (DT-SIR) introduced by Lin et al. (2015); and (b) a semi-definite programming (SDP) approach inspired by Amini & Wainwright (2008). When $s=O(p^{1-\delta})$ for some $\delta>0$, we demonstrate that both procedures can succeed in recovering the support of $\boldsymbol{\beta}$ as long as the rescaled sample size $\kappa=\frac{n}{s\log(p-s)}$ exceeds a certain critical threshold. Conversely, when $\kappa$ falls below a critical value, any algorithm fails to recover the support with probability at least $\frac{1}{2}$ asymptotically. In other words, both DT-SIR and the SDP approach are optimal (up to a scalar) for recovering the support of $\boldsymbol{\beta}$ in terms of sample size. We provide extensive simulations, as well as a real dataset application, to verify our theoretical observations.
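For concreteness, here is a minimal sketch of DT-SIR on simulated data with a monotone link: slice the sorted responses, average the (standardized) covariates within slices, and threshold the resulting per-coordinate diagonal of the SIR matrix estimate. The slice count, threshold heuristic, and link are illustrative choices, not the tuned procedure analyzed in the paper.

```python
# Diagonal-thresholding sliced inverse regression (DT-SIR) sketch.
import numpy as np

def dt_sir_support(X, y, H=10, thresh=None):
    n, p = X.shape
    slices = np.array_split(np.argsort(y), H)                 # H slices of sorted y
    means = np.stack([X[idx].mean(axis=0) for idx in slices])  # (H, p) slice means
    diag = (means ** 2).mean(axis=0)   # diagonal of Cov(E[X|Y]) estimate (X centered)
    if thresh is None:
        thresh = 2 * np.log(p) * H / n                         # heuristic cutoff
    return np.where(diag > thresh)[0]

rng = np.random.default_rng(5)
n, p, s = 2000, 200, 5
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:s] = 1 / np.sqrt(s)
y = (X @ beta) ** 3 + 0.1 * rng.normal(size=n)                 # unknown link f
print(dt_sir_support(X, y))                                    # expect {0,...,4}
```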
L1-Regularized Least Squares for Support Recovery of High Dimensional Single Index Models with Gaussian Designs
Neykov, Matey, Liu, Jun S., Cai, Tianxi
It is known that for a certain class of single index models (SIMs) $Y = f(\boldsymbol{X}_{p \times 1}^\intercal\boldsymbol{\beta}_0, \varepsilon)$, support recovery is impossible when $\boldsymbol{X} \sim \mathcal{N}(0, \mathbb{I}_{p \times p})$ and a model-complexity-adjusted sample size is below a critical threshold. Recently, optimal algorithms based on sliced inverse regression (SIR) were suggested. These algorithms work provably under the assumption that the design $\boldsymbol{X}$ comes from an i.i.d. Gaussian distribution. In the present paper, we analyze algorithms based on covariance screening and least squares with $L_1$ penalization (i.e., LASSO) and demonstrate that they can also enjoy an optimal (up to a scalar) rescaled sample size in terms of support recovery, albeit under slightly different assumptions on $f$ and $\varepsilon$ than the SIR-based algorithms. More generally, we show that LASSO succeeds in recovering the signed support of $\boldsymbol{\beta}_0$ if $\boldsymbol{X} \sim \mathcal{N}(0, \boldsymbol{\Sigma})$ and the covariance $\boldsymbol{\Sigma}$ satisfies the irrepresentable condition. Our work extends existing results on the support recovery of LASSO for the linear model to a more general class of SIMs.
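The phenomenon the abstract describes is easy to simulate: plain LASSO of $Y$ on $\boldsymbol{X}$ can recover the signed support even though the link is nonlinear and unknown, because for Gaussian designs the population regression target is proportional to $\boldsymbol{\beta}_0$. The link, dimensions, and penalty level below are illustrative choices.

```python
# Signed support recovery for a single index model via ordinary LASSO.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
n, p, s = 1000, 100, 4
X = rng.normal(size=(n, p))
beta0 = np.zeros(p)
beta0[:s] = np.array([1.0, -1.0, 1.0, -1.0]) / np.sqrt(s)
y = np.tanh(3 * X @ beta0) + 0.1 * rng.normal(size=n)   # unknown monotone link

fit = Lasso(alpha=0.05).fit(X, y)        # penalty level: illustrative choice
print("signed support:", np.sign(fit.coef_)[:10])       # matches sign(beta0)
```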
Interpretable Selection and Visualization of Features and Interactions Using Bayesian Forests
Krakovna, Viktoriya, Du, Jiong, Liu, Jun S.
It is becoming increasingly important for machine learning methods to make predictions that are interpretable as well as accurate. In many practical applications, it is of interest to identify which features and feature interactions are relevant to the prediction task. We present a novel method, the Selective Bayesian Forest Classifier, that strikes a balance between predictive power and interpretability by simultaneously performing classification, feature selection, feature interaction detection, and visualization. It builds parsimonious yet flexible models using tree-structured Bayesian networks and samples an ensemble of such models using Markov chain Monte Carlo. We build in feature selection by dividing the trees into two groups according to their relevance to the outcome of interest. Our method performs competitively on classification and feature selection benchmarks in low and high dimensions, and includes a visualization tool that provides insight into relevant features and interactions.
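As background for the model class, here is a minimal sketch that fits a single tree-structured network by the classical Chow-Liu construction (a maximum spanning tree over pairwise mutual information); the paper's MCMC sampling of tree ensembles and its relevant/irrelevant feature grouping are not shown, and the data and names are ours.

```python
# Chow-Liu tree: the maximum-likelihood tree-structured network is the
# maximum spanning tree of the pairwise empirical mutual information graph.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def mutual_info(a, b):
    """Empirical MI between two discrete (here binary) variables."""
    mi = 0.0
    for u in np.unique(a):
        for v in np.unique(b):
            pab = np.mean((a == u) & (b == v))
            pa, pb = np.mean(a == u), np.mean(b == v)
            if pab > 0:
                mi += pab * np.log(pab / (pa * pb))
    return mi

rng = np.random.default_rng(7)
X = rng.integers(0, 2, size=(300, 6))
X[:, 1] = np.where(rng.random(300) < 0.9, X[:, 0], 1 - X[:, 0])  # dependent pair

p = X.shape[1]
MI = np.zeros((p, p))
for i in range(p):
    for j in range(i + 1, p):
        MI[i, j] = mutual_info(X[:, i], X[:, j])
MI = MI + MI.T

# maximum spanning tree = minimum spanning tree on negated weights
tree = minimum_spanning_tree(-MI).toarray()
print("tree edges:", list(zip(*np.nonzero(tree))))   # should link features 0 and 1
```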