Wahba, Grace
When can Multi-Site Datasets be Pooled for Regression? Hypothesis Tests, $\ell_2$-consistency and Neuroscience Applications
Zhou, Hao Henry, Zhang, Yilin, Ithapu, Vamsi K., Johnson, Sterling C., Wahba, Grace, Singh, Vikas
Many studies in biomedical and health sciences involve small sample sizes due to logistical or financial constraints. Often, identifying weak (but scientifically interesting) associations between a set of predictors and a response necessitates pooling datasets from multiple diverse labs or groups. While there is a rich literature in statistical machine learning addressing distributional shifts and inference in multi-site datasets, it is less clear ${\it when}$ such pooling is guaranteed to help (and when it does not) -- independent of the inference algorithms we use. In this paper, we present a hypothesis test to answer this question, both for classical and high-dimensional linear regression. We precisely identify regimes where pooling datasets across multiple sites is sensible, and show how such policy decisions can be made via simple checks executable on each site before any data transfer ever happens. With a focus on Alzheimer's disease studies, we present empirical results showing that, in regimes suggested by our analysis, pooling a local dataset with data from an international study improves power.
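To make the flavor of such a pre-pooling check concrete, here is a minimal sketch using a classical Chow-type F-test for equality of regression coefficients across two sites. This is an illustration only: the paper's actual test statistic, and its high-dimensional and $\ell_2$-consistency analysis, differ.

```python
# Minimal sketch: a Chow-type F-test for whether two sites share the same
# linear-regression coefficients, i.e. whether pooling is sensible.
# Illustration in the spirit of the paper, NOT its exact test.
import numpy as np
from scipy import stats

def chow_pooling_test(X1, y1, X2, y2):
    """F-test of H0: both sites share one coefficient vector."""
    def rss(X, y):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta
        return r @ r
    n1, p = X1.shape
    n2, _ = X2.shape
    rss_pooled = rss(np.vstack([X1, X2]), np.concatenate([y1, y2]))
    rss_split = rss(X1, y1) + rss(X2, y2)
    f = ((rss_pooled - rss_split) / p) / (rss_split / (n1 + n2 - 2 * p))
    return f, stats.f.sf(f, p, n1 + n2 - 2 * p)

rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(50, 3)), rng.normal(size=(60, 3))
beta = np.array([1.0, -0.5, 0.2])
y1 = X1 @ beta + rng.normal(scale=0.5, size=50)
y2 = X2 @ beta + rng.normal(scale=0.5, size=60)  # same beta: pooling should pass
f, p = chow_pooling_test(X1, y1, X2, y2)
print(f"F = {f:.2f}, p-value = {p:.3f}")  # large p-value -> no evidence against pooling
```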
Hypothesis Testing in Unsupervised Domain Adaptation with Applications in Alzheimer's Disease
Zhou, Hao, Ithapu, Vamsi K., Ravi, Sathya Narayanan, Singh, Vikas, Wahba, Grace, Johnson, Sterling C.
Consider samples from two different data sources $\{\mathbf{x_s^i}\} \sim P_{\rm source}$ and $\{\mathbf{x_t^i}\} \sim P_{\rm target}$. We only observe their transformed versions $h(\mathbf{x_s^i})$ and $g(\mathbf{x_t^i})$, where $h(\cdot)$ and $g(\cdot)$ belong to a known function class. Our goal is to perform a statistical test checking if $P_{\rm source} = P_{\rm target}$ while removing the distortions induced by the transformations. This problem is closely related to concepts underlying numerous domain adaptation algorithms, and in our case, is motivated by the need to combine clinical and imaging-based biomarkers from multiple sites and/or batches, where this problem is fairly common and an impediment to analyses with much larger sample sizes. We develop a framework that addresses this problem using ideas from hypothesis testing on the transformed measurements, wherein the distortions need to be estimated {\it in tandem} with the testing. We derive a simple algorithm and study its convergence and consistency properties in detail, and we also provide lower-bound strategies based on recent work in continuous optimization. On a dataset of individuals at risk for neurological disease, our results are competitive with alternative procedures that are twice as expensive and in some cases operationally infeasible to implement.
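A minimal sketch of the "estimate the distortion in tandem with the test" idea, under simplifying assumptions not taken from the paper: scalar data, $h$ the identity, $g$ a scalar affine map, and an energy-distance statistic standing in for the paper's test. The alignment step and permutation calibration are illustrative only.

```python
# Sketch: estimate an affine distortion g(x) = a*x + b on the target sample
# by minimizing a two-sample energy-distance statistic to the source, then
# assess the residual discrepancy with a permutation test.
import numpy as np
from scipy.optimize import minimize

def energy_stat(x, y):
    """Two-sample energy distance statistic (Szekely-Rizzo)."""
    xy = np.abs(x[:, None] - y[None, :]).mean()
    xx = np.abs(x[:, None] - x[None, :]).mean()
    yy = np.abs(y[:, None] - y[None, :]).mean()
    return 2 * xy - xx - yy

rng = np.random.default_rng(1)
src = rng.normal(0.0, 1.0, 200)                 # h = identity for simplicity
tgt = 2.5 * rng.normal(0.0, 1.0, 200) + 1.0     # same law, distorted by g

# Step 1: estimate the inverse distortion in tandem with the statistic
# (scale parametrized as exp(s) to keep it positive during the search).
res = minimize(lambda ab: energy_stat(src, (tgt - ab[1]) * np.exp(-ab[0])),
               x0=[0.0, 0.0], method="Nelder-Mead")
a, b = np.exp(res.x[0]), res.x[1]
aligned = (tgt - b) / a
t_obs = energy_stat(src, aligned)

# Step 2: permutation null for the aligned samples. (Strictly, (a, b)
# should be re-estimated within each permutation; omitted for brevity.)
pooled = np.concatenate([src, aligned])
count = 0
for _ in range(500):
    perm = rng.permutation(pooled)
    count += energy_stat(perm[:200], perm[200:]) >= t_obs
print(f"a={a:.2f}, b={b:.2f}, p-value={count / 500:.3f}")
```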
Distance Shrinkage and Euclidean Embedding via Regularized Kernel Estimation
Zhang, Luwan, Wahba, Grace, Yuan, Ming
Although recovering a Euclidean distance matrix from noisy observations is a common problem in practice, how well this can be done remains largely unknown. To fill this void, we study a simple distance matrix estimate based upon the so-called regularized kernel estimate. We show that such an estimate can be characterized as simply applying a constant amount of shrinkage to all observed pairwise distances. This fact allows us to establish risk bounds for the estimate, implying that the true distances can be estimated consistently in an average sense as the number of objects increases. In addition, such a characterization suggests an efficient algorithm to compute the distance matrix estimator, as an alternative to the usual second-order cone programming, which is known not to scale well for large problems. Numerical experiments and an application in visualizing the diversity of Vpu protein sequences from a recent HIV-1 study further demonstrate the practical merits of the proposed method.
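The constant-shrinkage characterization suggests a very simple estimator: subtract the same amount from every observed pairwise squared distance, then embed by classical multidimensional scaling. In the paper the shrinkage amount is determined by the regularization parameter of the kernel estimate; in this sketch it is a hand-set tuning parameter, purely for illustration.

```python
# Minimal sketch of the "constant shrinkage" characterization: shrink every
# observed pairwise squared distance by the same amount, then embed via
# classical MDS. The shrinkage constant here is hand-set, not the paper's.
import numpy as np

def shrink_and_embed(D2_noisy, shrink, dim=2):
    """Shrink off-diagonal squared distances, then classical MDS."""
    D2 = np.maximum(D2_noisy - shrink, 0.0)
    np.fill_diagonal(D2, 0.0)
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    K = -0.5 * J @ D2 @ J                      # double-centered Gram matrix
    w, V = np.linalg.eigh(K)
    idx = np.argsort(w)[::-1][:dim]            # top eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 2))                   # ground-truth configuration
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
noise = rng.normal(scale=0.3, size=D2.shape)
D2_noisy = D2 + (noise + noise.T) / 2          # symmetric noisy observations
np.fill_diagonal(D2_noisy, 0.0)
emb = shrink_and_embed(D2_noisy, shrink=0.2)
print(emb.shape)                               # (30, 2) embedded coordinates
```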
Multivariate Bernoulli distribution
Dai, Bin, Ding, Shilin, Wahba, Grace
In this paper, we consider the multivariate Bernoulli distribution as a model to estimate the structure of graphs with binary nodes. This distribution is discussed in the framework of the exponential family, and its statistical properties regarding independence of the nodes are demonstrated. Importantly, the model can estimate not only the main effects and pairwise interactions among the nodes but also higher-order interactions, allowing for the existence of complex clique effects. We compare the multivariate Bernoulli model with existing graphical inference models -- the Ising model and the multivariate Gaussian model -- in which only pairwise interactions are considered. The multivariate Bernoulli distribution also has the interesting property that independence and uncorrelatedness of the component random variables are equivalent. Both the marginal and conditional distributions of a subset of variables in the multivariate Bernoulli distribution still follow the multivariate Bernoulli distribution. Furthermore, the multivariate Bernoulli logistic model is developed under generalized linear model theory by utilizing the canonical link function, in order to include covariate information on the nodes, edges and cliques. We also consider variable selection techniques such as the LASSO in the logistic model to impose a sparsity structure on the graph. Finally, we discuss extending the smoothing spline ANOVA approach to the multivariate Bernoulli logistic model to enable estimation of non-linear effects of the predictor variables.
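A minimal sketch of the exponential-family form described above: each nonempty subset $S$ of nodes carries a natural parameter $f_S$, and $\log p(y) = \sum_S f_S \prod_{j \in S} y_j - \log Z$. The parameter values below are made up for illustration.

```python
# Minimal sketch of the multivariate Bernoulli exponential-family form:
# log p(y) = sum over subsets S of f_S * prod_{j in S} y_j - log Z.
# The natural parameters f_S below are made-up values for illustration.
import itertools
import numpy as np

def mvb_pmf(f):
    """Full pmf on {0,1}^d from natural parameters keyed by subsets."""
    d = max((max(S) for S in f if S), default=-1) + 1
    ys = list(itertools.product([0, 1], repeat=d))
    logits = np.array([sum(v for S, v in f.items() if all(y[j] for j in S))
                       for y in ys])
    p = np.exp(logits - logits.max())          # stabilized normalization
    return {y: pi for y, pi in zip(ys, p / p.sum())}

# Main effects, one pairwise interaction, one third-order "clique effect".
f = {(0,): 0.5, (1,): -0.2, (2,): 0.1,
     (0, 1): 0.8,            # pairwise interaction (edge)
     (0, 1, 2): 1.5}         # higher-order clique effect
pmf = mvb_pmf(f)
print(sum(pmf.values()))     # 1.0
print(pmf[(1, 1, 1)])        # probability boosted by the clique effect
```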
Learning Higher-Order Graph Structure with Features by Structure Penalty
Ding, Shilin, Wahba, Grace, Zhu, Jerry
In discrete undirected graphical models, the conditional independence of node labels Y is specified by the graph structure. We study the case where there is another input random vector X (e.g. observed features) such that the distribution P(Y | X) is determined by functions of X that characterize the (higher-order) interactions among the Y's. The main contribution of this paper is to learn the graph structure and the functions conditioned on X at the same time. We prove that discrete undirected graphical models with feature X are equivalent to multivariate discrete models. The reparameterization of the potential functions in graphical models by conditional log odds ratios of the latter offers advantages in representation of the conditional independence structure. The functional spaces can be flexibly determined by kernels. Additionally, we impose a Structure Lasso (SLasso) penalty on groups of functions to learn the graph structure. These groups with overlaps are designed to enforce hierarchical function selection. In this way, we are able to shrink higher-order interactions to obtain a sparse graph structure.
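A minimal sketch of the overlapping-group construction behind such a penalty, with illustrative (not the paper's) groups and weights: the group attached to a subset $S$ collects all candidate interactions containing $S$, so a higher-order term is penalized through every one of its lower-order ancestors, which enforces hierarchical selection.

```python
# Sketch of overlapping groups for a Structure-Lasso-style penalty: the
# group for subset S contains S and all of its supersets among the
# candidate interactions, so higher-order terms are shrunk first.
import itertools
import numpy as np

nodes = range(3)
# Candidate interactions: all nonempty subsets of the nodes, up to order 3.
subsets = [S for r in (1, 2, 3) for S in itertools.combinations(nodes, r)]

def hierarchical_groups(subsets):
    """Group for S = {T : S is a subset of T} -- overlapping by design."""
    return {S: [T for T in subsets if set(S) <= set(T)] for S in subsets}

def slasso_penalty(coef, groups, weight=1.0):
    """Sum of (weighted) l2 norms over the overlapping groups."""
    return sum(weight * np.sqrt(sum(coef[T] ** 2 for T in grp))
               for grp in groups.values())

coef = {S: 0.0 for S in subsets}
coef[(0,)], coef[(1,)], coef[(0, 1)] = 1.0, -0.5, 0.7   # sparse structure
groups = hierarchical_groups(subsets)
print(slasso_penalty(coef, groups))
# The third-order coefficient (0, 1, 2) appears in the group of every one
# of its subsets, so turning it on is penalized through all its ancestors.
```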
The Bias-Variance Tradeoff and the Randomized GACV
Wahba, Grace, Lin, Xiwu, Gao, Fangyu, Xiang, Dong, Klein, Ronald, Klein, Barbara
We propose a new in-sample cross-validation-based method (randomized GACV) for choosing smoothing or bandwidth parameters that govern the bias-variance or fit-complexity tradeoff in 'soft' classification. Soft classification refers to a learning procedure which estimates the probability that an example with a given attribute vector is in class 1 vs. class 0. The target for optimizing the tradeoff is the Kullback-Leibler distance between the estimated probability distribution and the 'true' probability distribution, representing knowledge of an infinite population. The method uses a randomized estimate of the trace of a Hessian and mimics cross validation at the cost of a single relearning with perturbed outcome data.
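The randomized trace estimate admits a compact illustration for linear smoothers: if $\delta \sim N(0, \sigma^2 I)$ and $f_\lambda$ is the smoother, then $E[\delta^\top(f_\lambda(y+\delta) - f_\lambda(y))] = \sigma^2\,{\rm tr}(A_\lambda)$, so one refit with perturbed outcomes estimates the trace that GACV needs. In this sketch, ridge regression stands in for the paper's penalized-likelihood soft classifier.

```python
# Minimal sketch of the randomized-trace idea behind randomized GACV:
# one refit with perturbed outcomes estimates the trace of the influence
# ("hat") matrix. Ridge regression stands in for the penalized likelihood
# soft classifier of the paper.
import numpy as np

rng = np.random.default_rng(3)
n, p, lam = 200, 10, 1.0
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + rng.normal(size=n)

def fit(y_vec):
    """Fitted values of ridge regression: f(y) = X (X'X + lam I)^{-1} X' y."""
    beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y_vec)
    return X @ beta

sigma = 0.1
delta = rng.normal(scale=sigma, size=n)        # random perturbation of outcomes
trace_est = delta @ (fit(y + delta) - fit(y)) / sigma**2

A = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)  # exact hat matrix
print(f"randomized trace ~ {trace_est:.2f}, exact trace = {np.trace(A):.2f}")
```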