Li, Xiaodong
Linear Polytree Structural Equation Models: Structural Learning and Inverse Correlation Estimation
Lou, Xingmei, Hu, Yu, Li, Xiaodong
Over the past three decades, the problem of learning directed graphical models from data has received an enormous amount of attention, since such models provide a compact and flexible way to represent the joint distribution of the data, especially when the associated graph is a directed acyclic graph (DAG). A directed graph is called a DAG if it does not contain directed cycles. DAG models are popular in practice, with applications in biology, genetics, machine learning and causal inference (Sachs et al., 2005; Zhang et al., 2013; Koller and Friedman, 2009; Spirtes et al., 2000). There exists an extensive literature on learning the graph structure from data under the assumption that the graph is a DAG; for a summary, see the surveys of Drton and Maathuis (2017) and Heinze-Deml et al. (2018). Existing approaches generally fall into two categories: constraint-based methods (Spirtes et al., 2000; Pearl, 2009) and score-based methods (Chickering, 2002). Constraint-based methods use conditional independence tests to determine whether there is an edge between two nodes, and then orient the edges so that the resulting graph is compatible with the conditional independencies observed in the data. Score-based methods formulate the structure learning task as optimizing a score function of the unknown graph and the data. A polytree is a DAG that contains no cycles even when the directions of all edges are ignored.
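For intuition, the skeleton of a Gaussian polytree can be recovered with a Chow-Liu-style step: weight each pair of nodes by its absolute sample correlation and take a maximum-weight spanning tree. A minimal sketch of this idea (not necessarily the paper's exact estimator), assuming numpy and networkx:

```python
# Chow-Liu-style skeleton recovery: weight each pair of variables by the
# absolute sample correlation and take a maximum-weight spanning tree.
# A minimal sketch, not necessarily the paper's exact estimator.
import numpy as np
import networkx as nx

def chow_liu_skeleton(X):
    """X: (n_samples, p) data matrix; returns the skeleton edge list."""
    R = np.corrcoef(X, rowvar=False)            # p x p sample correlations
    p = R.shape[0]
    G = nx.Graph()
    G.add_weighted_edges_from(
        (i, j, abs(R[i, j])) for i in range(p) for j in range(i + 1, p)
    )
    return sorted(nx.maximum_spanning_tree(G).edges())

# Toy check on a chain 0 -> 1 -> 2: the recovered skeleton is [(0, 1), (1, 2)].
rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
x2 = 0.8 * x1 + rng.normal(size=2000)
x3 = 0.8 * x2 + rng.normal(size=2000)
print(chow_liu_skeleton(np.column_stack([x1, x2, x3])))
```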
Learning Primal Heuristics for Mixed Integer Programs
Shen, Yunzhuang, Sun, Yuan, Eberhard, Andrew, Li, Xiaodong
This paper proposes a novel primal heuristic for Mixed Integer Programs (MIPs) based on machine learning techniques. Mixed Integer Programming is a general technique for formulating combinatorial optimization problems. Inside a solver, primal heuristics play a critical role in finding good feasible solutions that tighten the duality gap from the outset of the Branch-and-Bound (B&B) algorithm, greatly improving its performance by pruning the B&B tree aggressively. In this paper, we investigate whether effective primal heuristics can be learned automatically via machine learning. We propose a new method to represent an optimization problem as a graph, and train a Graph Convolutional Network on solved problem instances with known optimal solutions. The trained network can then predict the values of decision variables in the optimal solution of an unseen problem instance of a similar type. These predictions are leveraged by a novel configuration of the B&B method, Probabilistic Branching with guided Depth-first Search (PB-DFS), which aims to find (near-)optimal solutions quickly. The experimental results show that this new heuristic finds better primal solutions at a much earlier stage of the solving process than other state-of-the-art primal heuristics.
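The solver-level details of PB-DFS are specific to the B&B implementation, but the core idea, fixing the variable the model is most confident about to its predicted value first and diving depth-first, can be sketched abstractly. A hypothetical sketch, where `probs[i]` is the predicted probability that binary variable i equals 1 and `check` stands in for the solver's propagation/bounding oracle:

```python
# Sketch of Probabilistic Branching with guided Depth-first Search (PB-DFS):
# branch on the binary variable the ML model is most confident about,
# trying its predicted value first and diving depth-first.
# `check(assignment)` is a hypothetical oracle returning "feasible",
# "infeasible", or "partial"; it stands in for the solver's
# propagation and bounding step.

def pb_dfs(probs, check, assignment=None):
    assignment = dict(assignment or {})
    status = check(assignment)
    if status == "infeasible":
        return None
    if status == "feasible":
        return assignment                      # first full solution found
    # most confident unfixed variable: largest |p - 0.5|
    free = [i for i in range(len(probs)) if i not in assignment]
    i = max(free, key=lambda j: abs(probs[j] - 0.5))
    first = int(probs[i] >= 0.5)               # predicted value explored first
    for value in (first, 1 - first):
        sol = pb_dfs(probs, check, {**assignment, i: value})
        if sol is not None:
            return sol
    return None
```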
Instance Space Analysis for the Car Sequencing Problem
Sun, Yuan, Esler, Samuel, Thiruvady, Dhananjay, Ernst, Andreas T., Li, Xiaodong, Morgan, Kerri
In this paper, we investigate an important research question in the car sequencing problem: what characteristics make an instance hard to solve? To do so, we carry out an Instance Space Analysis for the car sequencing problem, extracting a vector of problem features to characterize each instance and projecting the feature vectors onto a two-dimensional space using principal component analysis. The resulting two-dimensional visualizations provide insights into the characteristics of the instances used for testing and allow us to compare how these characteristics affect different optimisation algorithms. This guides us in constructing a new set of benchmark instances with a range of instance properties, which are shown to be more diverse than the previous benchmarks and to include many hard-to-solve instances. We systematically compare the performance of six algorithms for solving the car sequencing problem: three existing algorithms from the literature and three new ones. Importantly, we build machine learning models to identify the niche in the instance space in which an algorithm is expected to perform well. Our results show that the new algorithms are state-of-the-art. This analysis helps us understand problem hardness and select an appropriate algorithm for solving a given car sequencing problem instance.
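The projection step described above is straightforward to reproduce: standardize the instance feature vectors and apply PCA to obtain two-dimensional coordinates. A minimal sketch with synthetic features, assuming scikit-learn:

```python
# Projecting instance feature vectors onto 2-D via PCA, the core step of
# an instance space analysis. A minimal sketch with synthetic features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def instance_space(features):
    """features: (n_instances, n_features) matrix of problem features."""
    Z = StandardScaler().fit_transform(features)   # put features on a common scale
    pca = PCA(n_components=2)
    coords = pca.fit_transform(Z)                  # 2-D instance coordinates
    return coords, pca.explained_variance_ratio_

rng = np.random.default_rng(0)
coords, evr = instance_space(rng.normal(size=(100, 8)))
print(coords.shape, evr)   # (100, 2) and variance explained per axis
```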
Generalization of Machine Learning for Problem Reduction: A Case Study on Travelling Salesman Problems
Sun, Yuan, Ernst, Andreas, Li, Xiaodong, Weiner, Jake
Combinatorial optimization plays an important role in real-world problem solving. In the big data era, the dimensionality of a combinatorial optimization problem is usually very large, which poses a significant challenge to existing solution methods. In this paper, we examine the generalization capability of a machine learning model for problem reduction on the classic travelling salesman problem (TSP). We demonstrate that our method can greedily remove from an optimization problem those decision variables that are predicted not to be part of an optimal solution. More specifically, we investigate our model's capability to generalize to test instances that have not been seen during the training phase. We consider three scenarios where training and test instances differ in terms of: 1) problem characteristics; 2) problem sizes; and 3) problem types. Our experiments show that this machine learning based technique generalizes reasonably well over a wide range of TSP test instances with different characteristics or sizes. While the accuracy of predicting unused variables naturally deteriorates as a test instance moves further away from the training set, we observe that even when tested on a different TSP variant, the machine learning model still makes useful predictions about which variables can be eliminated without significantly impacting solution quality.
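A minimal sketch of the pruning idea, with hypothetical predicted probabilities: rank the edge variables by their predicted probability of appearing in an optimal tour, keep the top fraction, and guard against isolating any vertex (a tour needs degree at least two everywhere):

```python
# Problem reduction by solution prediction: drop decision variables (here,
# TSP edges) whose predicted probability of appearing in an optimal tour
# is low. A minimal sketch; `probs` would come from a trained model, and
# the guard keeps every vertex at degree >= 2 so a tour can still exist.
def reduce_edges(edges, probs, keep_frac=0.2):
    """edges: list of (u, v); probs[k]: predicted P(edge k is in an optimal tour)."""
    ranked = sorted(range(len(edges)), key=lambda k: probs[k], reverse=True)
    kept = set(ranked[:max(1, int(keep_frac * len(edges)))])
    degree = {}
    for k in kept:
        for v in edges[k]:
            degree[v] = degree.get(v, 0) + 1
    # safeguard: re-add pruned edges touching vertices with degree < 2
    for k, (u, v) in enumerate(edges):
        if k not in kept and (degree.get(u, 0) < 2 or degree.get(v, 0) < 2):
            kept.add(k)
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
    return [edges[k] for k in sorted(kept)]
```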
Boosting Ant Colony Optimization via Solution Prediction and Machine Learning
Sun, Yuan, Wang, Sheng, Shen, Yunzhuang, Li, Xiaodong, Ernst, Andreas T., Kirley, Michael
This paper introduces an enhanced meta-heuristic (ML-ACO) that combines machine learning (ML) and ant colony optimization (ACO) to solve combinatorial optimization problems. To illustrate the underlying mechanism of our enhanced algorithm, we start by describing a test problem -- the orienteering problem -- used to demonstrate the efficacy of ML-ACO. In this problem, the objective is to find a route that visits a subset of vertices in a graph within a time budget to maximize the collected score. In the first phase of our ML-ACO algorithm, an ML model is trained using a set of small problem instances where the optimal solution is known. Specifically, classification models are used to classify an edge as being part of the optimal route, or not, using problem-specific features and statistical measures. We have tested several classification models including graph neural networks, logistic regression and support vector machines. The trained model is then used to predict the probability that an edge in the graph of a test problem instance belongs to the corresponding optimal route. In the second phase, we incorporate the predicted probabilities into the ACO component of our algorithm. Here, the probability values bias the sampling towards edges predicted to be of high quality when constructing feasible routes. We empirically show that ML-ACO generates results that are significantly better than the standard ACO algorithm, especially when the computational budget is limited. Furthermore, we show our algorithm is robust in the sense that (a) its overall performance is not sensitive to any particular classification model, and (b) it generalizes well to large and real-world problem instances. Our approach integrating ML with a meta-heuristic is generic and can be applied to a wide range of combinatorial optimization problems.
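One natural way to fold the predictions into ACO, sketched below, is to multiply the usual pheromone and heuristic terms by the predicted edge probability raised to an extra exponent; the exact weighting in ML-ACO may differ:

```python
# ML-biased ACO edge sampling: multiply the usual pheromone (tau) and
# heuristic (eta) terms by the ML-predicted probability (ml) raised to an
# extra exponent gamma. A sketch; ML-ACO's exact weighting may differ.
import numpy as np

def transition_probs(tau, eta, ml, alpha=1.0, beta=2.0, gamma=1.0):
    """tau, eta, ml: 1-D arrays over the feasible candidate edges."""
    ml = np.clip(ml, 1e-3, None)        # floor so no edge is zeroed out entirely
    w = tau ** alpha * eta ** beta * ml ** gamma
    return w / w.sum()

def sample_edge(rng, candidates, tau, eta, ml):
    """candidates: indices of feasible edges; returns one sampled edge index."""
    p = transition_probs(tau, eta, ml)
    return candidates[rng.choice(len(candidates), p=p)]
```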
Nonconvex Rectangular Matrix Completion via Gradient Descent without $\ell_{2,\infty}$ Regularization
Chen, Ji, Liu, Dekai, Li, Xiaodong
The analysis of nonconvex matrix completion has recently attracted much attention in the machine learning community thanks to its computational convenience. Existing analyses of this problem, however, usually rely on $\ell_{2,\infty}$ projections or regularizers that involve unknown model parameters, although these are observed to be unnecessary in numerical simulations; see, e.g., Zheng and Lafferty [2016]. In this paper, we extend the analysis of the vanilla gradient descent for positive semidefinite matrix completion proposed in Ma et al. [2017] to the rectangular case, and more significantly, improve the required sampling complexity from $\widetilde{O}(r^3)$ to $\widetilde{O}(r^2)$. Our technical ideas and contributions are potentially useful for improving the leave-one-out analysis in other related problems.
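For concreteness, vanilla gradient descent for rectangular matrix completion runs on the factorized objective $f(X, Y) = \frac{1}{2}\|P_\Omega(XY^T - M)\|_F^2 + \frac{1}{8}\|X^TX - Y^TY\|_F^2$ (the second term is the standard balancing penalty), with no $\ell_{2,\infty}$ projection or regularization. A minimal numpy sketch (step size and iteration count are illustrative):

```python
# Vanilla gradient descent on the factorized matrix completion objective
#   f(X, Y) = 0.5 * || P_Omega(X Y^T - M) ||_F^2
#           + 0.125 * || X^T X - Y^T Y ||_F^2   (standard balancing term)
# with no l_{2,inf} projection or regularization. A minimal sketch.
import numpy as np

def mc_gd(M_obs, mask, r, step=0.01, iters=500, seed=0):
    """M_obs: observed entries (zeros elsewhere); mask: 0/1 sampling pattern."""
    rng = np.random.default_rng(seed)
    n1, n2 = M_obs.shape
    X = rng.normal(scale=1.0 / np.sqrt(n1), size=(n1, r))
    Y = rng.normal(scale=1.0 / np.sqrt(n2), size=(n2, r))
    for _ in range(iters):
        R = mask * (X @ Y.T - M_obs)          # residual on observed entries
        B = X.T @ X - Y.T @ Y                 # balancing-term factor
        gX = R @ Y + 0.5 * X @ B              # gradient w.r.t. X
        gY = R.T @ X - 0.5 * Y @ B            # gradient w.r.t. Y
        X, Y = X - step * gX, Y - step * gY
    return X, Y                               # M is approximated by X @ Y.T
```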
Convex Relaxation Methods for Community Detection
Li, Xiaodong, Chen, Yudong, Xu, Jiaming
This paper surveys recent theoretical advances in convex optimization approaches for community detection. We introduce some important theoretical techniques and results for establishing the consistency of convex community detection under various statistical models. In particular, we discuss the basic techniques based on the primal and dual analysis. We also present results that demonstrate several distinctive advantages of convex community detection, including robustness against outlier nodes, consistency under weak assortativity, and adaptivity to heterogeneous degrees. This survey is not intended to be a complete overview of the vast literature on this fast-growing topic. Instead, we aim to provide a big picture of the remarkable recent development in this area and to make the survey accessible to a broad audience. We hope that this expository article can serve as an introductory guide for readers who are interested in using, designing, and analyzing convex relaxation methods in network analysis.
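As one concrete example of the methods surveyed, the basic SDP relaxation for two balanced communities maximizes $\langle A, X \rangle$ over the elliptope $\{X \succeq 0, \mathrm{diag}(X) = \mathbf{1}\}$ and reads labels off the leading eigenvector of the solution. A minimal sketch, assuming cvxpy:

```python
# Basic SDP relaxation for two-community detection: maximize <A, X> over
# {X PSD, diag(X) = 1}, then read communities off the sign pattern of the
# top eigenvector. A sketch of one relaxation discussed in the survey.
import cvxpy as cp
import numpy as np

def sdp_communities(A):
    """A: symmetric adjacency matrix; returns +1/-1 community labels."""
    n = A.shape[0]
    X = cp.Variable((n, n), PSD=True)
    prob = cp.Problem(cp.Maximize(cp.trace(A @ X)), [cp.diag(X) == 1])
    prob.solve()
    _, V = np.linalg.eigh(X.value)
    return np.sign(V[:, -1])          # leading eigenvector's sign pattern
```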
Predicting Aesthetic Score Distribution Through Cumulative Jensen-Shannon Divergence
Jin, Xin (Beijing Electronic Science and Technology Institute) | Wu, Le (Beijing Electronic Science and Technology Institute) | Li, Xiaodong (Beijing Electronic Science and Technology Institute) | Chen, Siyu (Beijing Electronic Science and Technology Institute) | Peng, Siwei (Beijing University of Chemical Technology) | Chi, Jingying (Beijing University of Chemical Technology) | Ge, Shiming (Chinese Academy of Sciences) | Song, Chenggen (Beijing Electronic Science and Technology Institute) | Zhao, Geng (Beijing Electronic Science and Technology Institute)
Aesthetic quality prediction is a challenging task in the computer vision community because of the complex interplay between semantic content and photographic technology. Recent studies on deep learning based aesthetic quality assessment usually use a binary high/low label or a single numerical score to represent aesthetic quality. However, such scalar representations cannot describe well the underlying variety of human perceptions of aesthetics. In this work, we propose to predict the aesthetic score distribution (i.e., a score distribution vector over the ordinal basic human ratings) using a Deep Convolutional Neural Network (DCNN). Conventional DCNNs, which aim to minimize the difference between predicted scalar numbers or vectors and the ground truth, cannot be directly applied to ordinal rating distributions. Thus, a novel CNN based on the Cumulative distribution with Jensen-Shannon divergence (CJS-CNN) is presented to predict the aesthetic score distribution of human ratings, together with a new reliability-sensitive learning method based on the kurtosis of the score distribution, which eliminates the need for the original full data of human ratings (without normalization). Experimental results on a large-scale aesthetic dataset demonstrate the effectiveness of the proposed CJS-CNN on this task.
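The divergence itself is easy to state: compare cumulative rather than raw score distributions, which respects the ordinal structure of the rating scale. A plausible numpy sketch of a Jensen-Shannon-style divergence applied to CDFs (the paper's exact definition may differ in details):

```python
# Jensen-Shannon-style divergence on cumulative score distributions: a
# plausible reading of the CJS divergence, not necessarily the paper's
# exact formula.
import numpy as np

def cjs_divergence(p, q, eps=1e-12):
    """p, q: predicted / ground-truth score distributions over ordinal bins."""
    P, Q = np.cumsum(p), np.cumsum(q)          # CDFs respect the rating order
    M = 0.5 * (P + Q)
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)))
    return 0.5 * kl(P, M) + 0.5 * kl(Q, M)     # JS-style divergence on CDFs

p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])        # e.g. ratings 1..5
q = np.array([0.0, 0.1, 0.3, 0.4, 0.2])
print(cjs_divergence(p, q))
```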
Subspace Perspective on Canonical Correlation Analysis: Dimension Reduction and Minimax Rates
Ma, Zhuang, Li, Xiaodong
Canonical correlation analysis (CCA) is a fundamental statistical tool for exploring the correlation structure between two sets of random variables. In this paper, motivated by the recent success of applying CCA to learn low dimensional representations of high dimensional objects, we propose to quantify the estimation loss of CCA by the excess prediction loss defined through a prediction-after-dimension-reduction framework. This framework suggests viewing CCA estimation as estimating the subspaces spanned by the canonical variates. Interestingly, the proposed error metrics derived from the excess prediction loss turn out to be closely related to the principal angles between the subspaces spanned by the population and sample canonical variates, respectively. We characterize the non-asymptotic minimax rates under the proposed metrics, in particular the dependency of the minimax rates on key quantities including the dimensions, the condition number of the covariance matrices, the canonical correlations and the eigen-gap, under minimal assumptions on the joint covariance matrix. To the best of our knowledge, this is the first finite-sample result that captures the effect of the canonical correlations on the minimax rates.
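The principal angles underlying these error metrics are directly computable; for instance, the sin-theta distances between two nearby subspaces can be obtained with scipy. A minimal illustration with synthetic bases:

```python
# Principal angles between two subspaces, the geometric quantity behind
# the paper's error metrics. A minimal illustration with synthetic bases.
import numpy as np
from scipy.linalg import subspace_angles

rng = np.random.default_rng(0)
U = rng.normal(size=(6, 2))              # columns span one subspace
V = U + 0.1 * rng.normal(size=(6, 2))    # a nearby subspace
theta = subspace_angles(U, V)            # principal angles, in radians
print(np.sin(theta))                     # sin-theta distances between subspaces
```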
Memory-efficient Kernel PCA via Partial Matrix Sampling and Nonconvex Optimization: a Model-free Analysis of Local Minima
Chen, Ji, Li, Xiaodong
Kernel PCA is a widely used nonlinear dimension reduction technique in machine learning, but storing the kernel matrix is notoriously challenging when the sample size is large. Inspired by Yi et al. [2016], where the idea of partial matrix sampling followed by nonconvex optimization is proposed for matrix completion and robust PCA, we apply a similar approach to memory-efficient Kernel PCA. In theory, with no assumptions on the kernel matrix in terms of eigenvalues or eigenvectors, we establish a model-free theory for low-rank approximation based on any local minimum of the proposed objective function. As interesting byproducts, when the underlying positive semidefinite matrix is assumed to be low-rank and highly structured, corollaries of our main theorem improve the state-of-the-art results of Ge et al. [2016, 2017] on nonconvex matrix completion with no spurious local minima. Numerical experiments also show that our approach is competitive in terms of approximation accuracy with the well-known Nyström algorithm for Kernel PCA.
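The memory savings come from touching only a sampled subset $\Omega$ of kernel entries while fitting a rank-$r$ factor by gradient descent. A minimal sketch, where `kernel(i, j)` computes a single entry $K_{ij}$ on the fly (hyperparameters are illustrative):

```python
# Memory-efficient low-rank approximation of a PSD kernel matrix K:
# sample a subset Omega of entries, then run gradient descent on
#   f(U) = 0.5 * sum_{(i,j) in Omega} (K_ij - (U U^T)_ij)^2
# so only sampled entries of K are ever evaluated. A minimal sketch.
import numpy as np

def sampled_kpca(kernel, n, r, m, step=1e-3, iters=300, seed=0):
    """kernel(i, j) -> K_ij; n: sample size; r: target rank; m: sampled entries."""
    rng = np.random.default_rng(seed)
    I = rng.integers(0, n, size=m)                    # sampled row indices
    J = rng.integers(0, n, size=m)                    # sampled column indices
    k = np.array([kernel(i, j) for i, j in zip(I, J)])
    U = rng.normal(scale=1.0 / np.sqrt(n), size=(n, r))
    for _ in range(iters):
        res = np.einsum("ik,ik->i", U[I], U[J]) - k   # (UU^T)_ij - K_ij on Omega
        G = np.zeros_like(U)
        np.add.at(G, I, res[:, None] * U[J])          # accumulate df/dU_i rows
        np.add.at(G, J, res[:, None] * U[I])          # accumulate df/dU_j rows
        U -= step * G
    return U                                          # K is approximated by U @ U.T
```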