Collaborating Authors

 Wang, Yue


Convergence Analysis of Distributed Stochastic Gradient Descent with Shuffling

arXiv.org Machine Learning

When using stochastic gradient descent (SGD) to solve large-scale machine learning problems, a common data-processing practice is to shuffle the training data, partition the data across multiple machines if needed, and then perform several epochs of training on the re-shuffled (either locally or globally) data. This procedure means the instances used to compute the gradients are no longer independently sampled from the training data set. Does the distributed SGD method still have desirable convergence properties in this practical situation? In this paper, we answer this question. First, we give a mathematical formulation of the practical data-processing procedure in distributed machine learning, which we call data partition with global/local shuffling. We observe that global shuffling is equivalent to without-replacement sampling if the shuffling operations are independent. We prove that SGD with global shuffling has convergence guarantees in both convex and non-convex cases. An interesting finding is that non-convex tasks such as deep learning are better suited to shuffling than convex tasks. Second, we conduct a convergence analysis for SGD with local shuffling. The convergence rate for local shuffling is slower than that for global shuffling, since information is lost when there is no communication between partitions. Finally, we consider the situation in which the permutation after shuffling is not uniformly distributed (insufficient shuffling), and discuss the conditions under which this insufficiency does not influence the convergence rate. Our theoretical results provide important insights into large-scale machine learning, especially into the selection of data-processing methods for faster convergence and good speedup. Our theoretical findings are verified by extensive experiments on logistic regression and deep neural networks.
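As a rough illustration of the "global shuffling + partition" procedure described above, the sketch below shuffles globally at each epoch, splits the permutation across workers, and runs a without-replacement SGD pass per shard. The function name, least-squares loss, and model averaging between epochs are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def distributed_sgd_global_shuffle(X, y, n_workers=2, epochs=30, lr=0.05, seed=0):
    """Hypothetical sketch: epoch-wise global shuffle, partition, local SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)            # global shuffle each epoch
        shards = np.array_split(perm, n_workers)
        local_ws = []
        for shard in shards:                 # each worker trains on its shard
            w_local = w.copy()
            for i in shard:                  # without-replacement pass over the shard
                grad = (X[i] @ w_local - y[i]) * X[i]   # least-squares gradient
                w_local -= lr * grad
            local_ws.append(w_local)
        w = np.mean(local_ws, axis=0)        # assumed model-averaging step
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                               # noiseless synthetic problem
w_hat = distributed_sgd_global_shuffle(X, y)
print(np.round(w_hat, 3))
```

On this noiseless toy problem the averaged iterate converges to the true weights; the paper's analysis concerns the rate of such convergence under global versus local shuffling.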


Generalization Error Bounds for Optimization Algorithms via Stability

AAAI Conferences

Many machine learning tasks can be formulated as Regularized Empirical Risk Minimization (R-ERM) and solved by optimization algorithms such as gradient descent (GD), stochastic gradient descent (SGD), and stochastic variance reduction (SVRG). Conventional analysis of these optimization algorithms focuses on their convergence rates during the training process; however, people in the machine learning community may care more about the generalization performance of the learned model on unseen test data. In this paper, we investigate this issue using stability as a tool. In particular, we decompose the generalization error for R-ERM and derive its upper bound for both convex and nonconvex cases. In convex cases, we prove that the generalization error can be bounded by the convergence rate of the optimization algorithm and the stability of the R-ERM process, both in expectation (in the order of 𝒪((1/n) + 𝔼ρ(T)), where ρ(T) is the convergence error and T is the number of iterations) and in high probability (in the order of 𝒪(log(1/δ)/√n + ρ(T)) with probability 1 − δ). For nonconvex cases, we can also obtain a similar expected generalization error bound. Our theorems indicate that 1) along with the training process, the generalization error will decrease for all the optimization algorithms under our investigation; and 2) comparatively speaking, SVRG has better generalization ability than GD and SGD. We have conducted experiments on both convex and nonconvex problems, and the experimental results verify our theoretical findings.
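One standard way to read the expectation bound is as a two-term decomposition of the generalization error, a sketch consistent with the abstract (the exact splitting and constants are in the paper):

```latex
\underbrace{\mathbb{E}\big[R(w_T) - R_S(w_T)\big]}_{\text{stability of R-ERM:}\ \mathcal{O}(1/n)}
\;+\;
\underbrace{\mathbb{E}\big[R_S(w_T) - \min_{w} R_S(w)\big]}_{\text{optimization error:}\ \mathbb{E}\rho(T)}
```

Here $R$ denotes the population risk, $R_S$ the regularized empirical risk on $n$ training samples, and $w_T$ the iterate after $T$ optimization steps; as $T$ grows, the second term shrinks at the algorithm's convergence rate, which is why faster-converging methods such as SVRG yield tighter bounds.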


Graphical Time Warping for Joint Alignment of Multiple Curves

Neural Information Processing Systems

Dynamic time warping (DTW) is a fundamental technique in time series analysis for comparing one curve to another using a flexible time-warping function. However, it was designed to compare a single pair of curves. In many applications, such as in metabolomics and image series analysis, alignment is simultaneously needed for multiple pairs. Because the underlying warping functions are often related, independent application of DTW to each pair is a sub-optimal solution. Yet, it is largely unknown how to efficiently conduct a joint alignment with all warping functions simultaneously considered, since any given warping function is constrained by the others and dynamic programming cannot be applied. In this paper, we show that the joint alignment problem can be transformed into a network flow problem and thus can be exactly and efficiently solved by the max flow algorithm, with a guarantee of global optimality. We name the proposed approach graphical time warping (GTW), emphasizing the graphical nature of the solution and that the dependency structure of the warping functions can be represented by a graph. Modifications of DTW, such as windowing and weighting, are readily derivable within GTW. We also discuss optimal tuning of parameters and hyperparameters in GTW. We illustrate the power of GTW using both synthetic data and a real case study of an astrocyte calcium movie.
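The per-pair building block that GTW couples across many pairs is standard DTW. Below is a minimal dynamic-programming sketch of that pairwise routine (not the GTW max-flow solver itself), using absolute difference as the local cost:

```python
import numpy as np

def dtw(a, b):
    """Pairwise DTW distance via the classic dynamic program."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)   # cumulative-cost table
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # a warping path may advance in a, in b, or in both
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

print(dtw([1, 2, 3], [1, 2, 2, 3]))  # 0.0: the warp absorbs the repeated 2
print(dtw([1, 2, 3], [1, 2, 4]))     # 1.0: only the last points differ
```

GTW's point is that when many such tables are solved jointly with coupling constraints between neighboring warping functions, dynamic programming no longer applies, but a max-flow formulation still solves the joint problem exactly.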


Trust-Based Symbolic Robot Motion Planning with Human-in-the-Loop

AAAI Conferences

Autonomous robots are becoming increasingly popular, and the complexity of such systems creates a need for validation and verification. In particular, symbolic robot motion planning based on formal methods is verifiably correct: it is the process of specifying and planning robot tasks in a discrete space, then carrying them out in a continuous space in a manner that preserves the discrete-level task specifications. Despite progress in symbolic motion planning, many challenges remain, including addressing scalability for multi-robot systems and improving solutions by incorporating human intelligence in an adaptive fashion. On the other hand, extant work in human-robot interaction (HRI) often lacks quantitative models and real-time analytical approaches. Here, we summarize our recent work on symbolic robot motion planning with human-in-the-loop as a step toward addressing these challenges. We especially focus on human trust in autonomous robots and embed trust analysis into the symbolic robot motion planning.


Generalization Error Bounds for Optimization Algorithms via Stability

arXiv.org Machine Learning

Many machine learning tasks can be formulated as Regularized Empirical Risk Minimization (R-ERM) and solved by optimization algorithms such as gradient descent (GD), stochastic gradient descent (SGD), and stochastic variance reduction (SVRG). Conventional analysis of these optimization algorithms focuses on their convergence rates during the training process; however, people in the machine learning community may care more about the generalization performance of the learned model on unseen test data. In this paper, we investigate this issue using stability as a tool. In particular, we decompose the generalization error for R-ERM and derive its upper bound for both convex and non-convex cases. In convex cases, we prove that the generalization error can be bounded by the convergence rate of the optimization algorithm and the stability of the R-ERM process, both in expectation (in the order of $\mathcal{O}((1/n)+\mathbb{E}\rho(T))$, where $\rho(T)$ is the convergence error and $T$ is the number of iterations) and in high probability (in the order of $\mathcal{O}\left(\frac{\log{(1/\delta)}}{\sqrt{n}}+\rho(T)\right)$ with probability $1-\delta$). For non-convex cases, we can also obtain a similar expected generalization error bound. Our theorems indicate that 1) along with the training process, the generalization error will decrease for all the optimization algorithms under our investigation; and 2) comparatively speaking, SVRG has better generalization ability than GD and SGD. We have conducted experiments on both convex and non-convex problems, and the experimental results verify our theoretical findings.


Differential Privacy Preservation for Deep Auto-Encoders: an Application of Human Behavior Prediction

AAAI Conferences

In recent years, deep learning has spread across both academia and industry, with many exciting real-world applications. This development has raised obvious privacy issues, yet there has been a lack of scientific study of privacy preservation in deep learning. In this paper, we concentrate on the auto-encoder, a fundamental component of deep learning, and propose the deep private auto-encoder (dPA). Our main idea is to enforce ε-differential privacy by perturbing the objective functions of the traditional deep auto-encoder, rather than its results. We apply the dPA to human behavior prediction in a health social network. Theoretical analysis and thorough experimental evaluations show that the dPA is highly effective and efficient, and that it significantly outperforms existing solutions.
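The objective-perturbation idea can be illustrated on a much simpler model than a deep auto-encoder: add Laplace noise to the coefficients of a quadratic objective and then minimize the noisy objective, so the optimizer's output (not just a post-hoc perturbed result) is private. This is a hypothetical sketch in the spirit of the functional mechanism, not the paper's dPA; the sensitivity value here is assumed rather than derived:

```python
import numpy as np

def private_quadratic_fit(X, y, epsilon, sensitivity, seed=0):
    """Sketch: perturb the objective's coefficients, then optimize."""
    rng = np.random.default_rng(seed)
    # coefficients of the least-squares objective: w^T A w - 2 b^T w
    A = X.T @ X
    b = X.T @ y
    scale = sensitivity / epsilon                 # Laplace scale for eps-DP
    A_noisy = A + rng.laplace(0.0, scale, A.shape)
    b_noisy = b + rng.laplace(0.0, scale, b.shape)
    return np.linalg.solve(A_noisy, b_noisy)      # minimizer of the noisy objective

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
w_true = np.array([1.0, -1.0])
y = X @ w_true
w_priv = private_quadratic_fit(X, y, epsilon=1.0, sensitivity=1.0)
print(np.round(w_priv, 3))
```

With enough data the coefficient matrices dominate the injected noise, so the private solution stays close to the non-private one; a rigorous guarantee requires calibrating `sensitivity` to the coefficients' actual global sensitivity.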


Convex Analysis of Mixtures for Separating Non-negative Well-grounded Sources

arXiv.org Machine Learning

Blind Source Separation (BSS) has proven to be a powerful tool for the analysis of composite patterns in engineering and science. We introduce Convex Analysis of Mixtures (CAM) for separating non-negative well-grounded sources, which learns the mixing matrix by identifying the lateral edges of the convex data scatter plot. We prove a necessary and sufficient condition for identifying the mixing matrix through edge detection, which also serves as the foundation for applying CAM not only to the exactly determined and over-determined cases, but also to the under-determined case. We show the optimality of the edge detection strategy, even for cases where source well-groundedness is not strictly satisfied. The CAM algorithm integrates plug-in noise filtering using sector-based clustering, an efficient geometric convex analysis scheme, and stability-based model order selection. We demonstrate the principle of CAM on simulated data and numerically mixed natural images. The superior performance of CAM against a panel of benchmark BSS techniques is demonstrated on numerically mixed gene expression data. We then apply CAM to dissect dynamic contrast-enhanced magnetic resonance imaging data taken from breast tumors and time-course microarray gene expression data derived from in-vivo muscle regeneration in mice, both producing biologically plausible decomposition results.


Unsupervised deconvolution of dynamic imaging reveals intratumor vascular heterogeneity

arXiv.org Machine Learning

Intratumor heterogeneity is often manifested by vascular compartments with distinct pharmacokinetics that cannot be resolved directly by in vivo dynamic imaging. We developed tissue-specific compartment modeling (TSCM), an unsupervised computational method of deconvolving dynamic imaging series from heterogeneous tumors that can improve vascular phenotyping in many biological contexts. Applying TSCM to dynamic contrast-enhanced MRI of breast cancers revealed characteristic intratumor vascular heterogeneity and therapeutic responses that were otherwise undetectable.


A feasible roadmap for unsupervised deconvolution of two-source mixed gene expressions

arXiv.org Machine Learning

Tissue heterogeneity is a major confounding factor in studying individual populations that cannot be resolved directly by global profiling. Experimental solutions to mitigate tissue heterogeneity are expensive, time consuming, inapplicable to existing data, and may alter the original gene expression patterns. Here we ask whether it is possible to deconvolute two-source mixed expressions (estimating both proportions and cell-specific profiles) from two or more heterogeneous samples without requiring any prior knowledge. Supported by a well-grounded mathematical framework, we argue that both constituent proportions and cell-specific expressions can be estimated in a completely unsupervised mode when cell-specific marker genes exist for each of the constituent cell types; these markers do not have to be known a priori. We demonstrate the performance of unsupervised deconvolution on both simulated and real gene expression data, together with perspective discussions.
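A toy numeric sketch of the marker-gene argument: a gene expressed by only one source has a mixed profile proportional to that source's mixing proportions, so extreme (corner) rows of the row-normalized data reveal the proportions without knowing the markers in advance. The data, the corner-finding rule, and the rescaling step below are illustrative simplifications, not the paper's algorithm:

```python
import numpy as np

p = np.array([0.2, 0.5, 0.8])            # source-1 proportion in 3 samples
S = np.array([[5.0, 0.0],                # gene 0: marker of source 1
              [0.0, 4.0],                # gene 1: marker of source 2
              [2.0, 3.0],                # genes 2-3: expressed by both sources
              [1.0, 1.0]])
A = np.vstack([p, 1 - p])                # 2 x 3 mixing matrix
M = S @ A                                # observed mixed expressions (genes x samples)

# Row-normalize; marker rows sit at the corners of the normalized scatter.
R = M / M.sum(axis=1, keepdims=True)
ratio = R[:, -1] / R[:, 0]               # toy corner rule (p grows across samples)
g1 = int(np.argmax(ratio))               # marker of source 1
g2 = int(np.argmin(ratio))               # marker of source 2

# Rescale the two marker rows so the proportions sum to 1 in every sample.
B = np.vstack([M[g1], M[g2]]).T          # samples x 2
ab, *_ = np.linalg.lstsq(B, np.ones(len(p)), rcond=None)
p_hat = ab[0] * M[g1]                    # recovered source-1 proportions
print(np.round(p_hat, 3))
```

On this noiseless toy the recovered proportions match the ground truth exactly; once proportions are known, the cell-specific profiles of the remaining genes follow by per-gene least squares.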


Learning Structural Changes of Gaussian Graphical Models in Controlled Experiments

arXiv.org Machine Learning

Graphical models are widely used in scientific and engineering research to represent conditional independence structures between random variables. In many controlled experiments, environmental changes or external stimuli can alter the conditional dependence between the random variables and potentially produce significant structural changes in the corresponding graphical models. Therefore, it is of great importance to be able to detect such structural changes from data, so as to gain novel insights into where and how the structural changes take place and to help the system adapt to the new environment. Here we report an effective learning strategy to extract structural changes in Gaussian graphical models using l1-regularization-based convex optimization. We discuss the properties of the problem formulation and introduce an efficient implementation based on the block coordinate descent algorithm. We demonstrate the principle of the approach on a numerical simulation experiment, and we then apply the algorithm to the modeling of gene regulatory networks under different conditions, obtaining promising and biologically plausible results.
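To make the goal concrete, the toy sketch below contrasts two conditions whose precision matrices (inverse covariances, whose zero pattern encodes conditional independence in a Gaussian model) differ in one edge. Plain matrix inversion plus soft-thresholding stands in for the paper's l1-regularized convex program; the data, threshold, and thresholding rule are illustrative assumptions:

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal operator behind l1 penalties."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(0)
n = 20000
# Condition A: chain dependence x0 -> x1 -> x2.
x0 = rng.normal(size=n)
x1 = x0 + 0.5 * rng.normal(size=n)
x2 = x1 + 0.5 * rng.normal(size=n)
XA = np.column_stack([x0, x1, x2])
# Condition B: the x1 -> x2 edge removed (x2 made independent).
XB = np.column_stack([x0, x1, rng.normal(size=n)])

PA = np.linalg.inv(np.cov(XA.T))         # precision estimate, condition A
PB = np.linalg.inv(np.cov(XB.T))         # precision estimate, condition B
change = soft_threshold(PA - PB, 1.0)    # keep only large structural differences
print((np.abs(change) > 0).astype(int))  # nonzero pattern flags the changed edge
```

Only the entries touching the removed edge survive the threshold, while sampling noise on the unchanged entries is zeroed out; the paper's l1-regularized formulation achieves this selection within the estimation itself rather than as a post-processing step.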