Conformal prediction interval for dynamic time-series
Xu, Chen, Xie, Yao
We develop a method to build distribution-free prediction intervals in batches for time-series based on conformal inference, called EnbPI, which wraps around any ensemble estimator to construct sequential prediction intervals. EnbPI is closely related to the conformal prediction (CP) framework but does not require data exchangeability. Theoretically, these intervals attain finite-sample, approximately valid average coverage for broad classes of regression functions and time-series with strongly mixing stochastic errors. Computationally, EnbPI requires no training of multiple ensemble estimators; it operates efficiently around a single already-trained ensemble estimator. In general, EnbPI is easy to implement, scales to producing arbitrarily many prediction intervals sequentially, and is well suited to a wide range of regression functions. We perform extensive simulations and real-data analyses to demonstrate its effectiveness.
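As a rough illustration of the idea, the following Python sketch builds batch prediction intervals from bootstrap-ensemble leave-one-out residuals. It is not the authors' reference implementation: `fit` is an assumed generic training callable, the test-point center uses a plain ensemble mean, and `enbpi_intervals` is an illustrative name.

```python
import numpy as np

def enbpi_intervals(fit, X_train, y_train, X_test, alpha=0.1, B=30, seed=0):
    """Sketch of EnbPI-style intervals (illustrative, not the paper's code).

    `fit(X, y)` must return an object with a `.predict(X)` method."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    boot_idx = [rng.integers(0, n, n) for _ in range(B)]
    models = [fit(X_train[idx], y_train[idx]) for idx in boot_idx]

    # Leave-one-out residuals: for each training point, aggregate only the
    # models whose bootstrap sample did not contain that point.
    residuals = np.empty(n)
    for i in range(n):
        out = [m.predict(X_train[i:i + 1])[0]
               for m, idx in zip(models, boot_idx) if i not in idx]
        pred = np.mean(out) if out else y_train[i]
        residuals[i] = abs(y_train[i] - pred)

    # Interval width is the (1 - alpha) quantile of the residuals.
    width = np.quantile(residuals, 1 - alpha)
    center = np.mean([m.predict(X_test) for m in models], axis=0)
    return center - width, center + width
```

For example, `enbpi_intervals(lambda X, y: RandomForestRegressor().fit(X, y), ...)` would wrap a scikit-learn random forest as the ensemble estimator.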
Alternating Multi-bit Quantization for Recurrent Neural Networks
Xu, Chen, Yao, Jianqiang, Lin, Zhouchen, Ou, Wenwu, Cao, Yuanbin, Wang, Zhirong, Zha, Hongbin
Recurrent neural networks have achieved excellent performance in many applications. However, on portable devices with limited resources, the models are often too large to deploy. For server applications with large-scale concurrent requests, the latency during inference can also be critical given costly computing resources. In this work, we address these problems by quantizing the network, both weights and activations, into multiple binary codes {-1,+1}. We formulate the quantization as an optimization problem. Based on the key observation that, once the quantization coefficients are fixed, the binary codes can be derived efficiently by a binary search tree, we then apply alternating minimization. We test the quantization on two well-known RNNs, i.e., long short-term memory (LSTM) and gated recurrent unit (GRU), on language models. Compared with the full-precision counterpart, 2-bit quantization achieves ~16x memory saving and ~6x real inference acceleration on CPUs, with only a reasonable loss in accuracy. With 3-bit quantization, we achieve almost no loss in accuracy, or even surpass the original model, with ~10.5x memory saving and ~3x real inference acceleration. Both results beat existing quantization works by large margins. We extend our alternating quantization to image classification tasks. In both RNNs and feedforward neural networks, the method also achieves excellent performance.
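A minimal sketch of this alternating scheme for a weight vector `w` approximated as `B @ alpha` with `B` in {-1,+1}^(n x K): the code step picks, per entry, the nearest of the 2^K candidate sums (the paper's binary-search-tree observation; brute force is equivalent for small K), and the coefficient step is a least-squares solve. Names and details are illustrative, not the paper's exact algorithm.

```python
import numpy as np
from itertools import product

def multibit_quantize(w, K=2, iters=10):
    """Alternating multi-bit quantization sketch: w ~ B @ alpha."""
    w = np.asarray(w, dtype=float)
    # Greedy initialization: binarize the residual K times.
    alpha, B, r = [], [], w.copy()
    for _ in range(K):
        b = np.sign(r)
        b[b == 0] = 1.0
        a = np.abs(r).mean()
        alpha.append(a)
        B.append(b)
        r = r - a * b
    alpha, B = np.array(alpha), np.stack(B, axis=1)

    candidates = np.array(list(product([-1.0, 1.0], repeat=K)))  # 2^K codes
    for _ in range(iters):
        # Code step: nearest candidate sum for each weight entry
        # (a binary search over the sorted sums would be faster).
        sums = candidates @ alpha
        idx = np.abs(w[:, None] - sums[None, :]).argmin(axis=1)
        B = candidates[idx]
        # Coefficient step: least squares for alpha with the codes fixed.
        alpha, *_ = np.linalg.lstsq(B, w, rcond=None)
    return alpha, B
```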
A Unified Convex Surrogate for the Schatten-$p$ Norm
Xu, Chen (Peking University and Shanghai Jiao Tong University) | Lin, Zhouchen (Peking University and Shanghai Jiao Tong University) | Zha, Hongbin (Peking University and Shanghai Jiao Tong University)
The Schatten-$p$ norm ($0 < p < 1$) has been widely used to replace the nuclear norm for better approximating the rank function. However, existing methods are either 1) not scalable for large-scale problems due to relying on singular value decomposition (SVD) in every iteration, or 2) specific to some $p$ values, e.g., 1/2 and 2/3. In this paper, we show that for any $p$, $p_1$, and $p_2 > 0$ satisfying $1/p = 1/p_1 + 1/p_2$, there is an equivalence between the Schatten-$p$ norm of one matrix and the Schatten-$p_1$ and Schatten-$p_2$ norms of its two factor matrices. We further extend the equivalence to multiple factor matrices and show that all the factor norms can be convex and smooth for any $p > 0$. In contrast, the original Schatten-$p$ norm for $0 < p < 1$ is non-convex and non-smooth. As an example, we conduct experiments on matrix completion. To utilize the convexity of the factor matrix norms, we adopt the accelerated proximal alternating linearized minimization algorithm and establish its sequence convergence. Experiments on both synthetic and real datasets exhibit its superior performance over the state-of-the-art methods. Its speed is also highly competitive.
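In symbols, a natural rendering of the stated equivalence (our notation; the paper's exact formulation may differ) is
\[
\|X\|_{S_p} \;=\; \min_{X = UV} \|U\|_{S_{p_1}} \|V\|_{S_{p_2}},
\qquad \frac{1}{p} = \frac{1}{p_1} + \frac{1}{p_2},
\]
whose special case $p = 1$, $p_1 = p_2 = 2$ is the well-known variational form of the nuclear norm, $\|X\|_* = \min_{X = UV} \|U\|_F \|V\|_F = \min_{X = UV} \tfrac{1}{2}(\|U\|_F^2 + \|V\|_F^2)$.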
Relaxed Majorization-Minimization for Non-Smooth and Non-Convex Optimization
Xu, Chen (Peking University) | Lin, Zhouchen (Peking University) | Zhao, Zhenyu (National University of Defense Technology) | Zha, Hongbin (Peking University)
We propose a new majorization-minimization (MM) method for non-smooth and non-convex programs, which is general enough to include existing MM methods. Besides the local majorization condition, we only require that the difference between the directional derivatives of the objective function and its surrogate function vanishes as the number of iterations approaches infinity, which is a very weak condition. Our method can therefore use a surrogate function that directly approximates the non-smooth objective function. In comparison, all existing MM methods construct the surrogate function by approximating the smooth component of the objective function. We apply our relaxed MM method to the robust matrix factorization (RMF) problem with different regularizations, where our locally majorant algorithm shows advantages over state-of-the-art approaches for RMF. This is the first algorithm for RMF ensuring, without extra assumptions, that any limit point of the iterates is a stationary point.
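One plausible formalization of the scheme and of the relaxed condition described above (notation ours): at iterate $x_k$, a surrogate $g_k$ satisfies the local majorization condition
\[
g_k(x) \ge f(x) \ \text{near } x_k, \qquad g_k(x_k) = f(x_k), \qquad x_{k+1} \in \arg\min_x g_k(x),
\]
and, instead of exact first-order matching at $x_k$, only
\[
\lim_{k \to \infty} \bigl( g_k'(x_k; d) - f'(x_k; d) \bigr) = 0 \quad \text{for every direction } d,
\]
where $f'(x; d)$ denotes the directional derivative of $f$ at $x$ along $d$.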
On the Feasibility of Distributed Kernel Regression for Big Data
Xu, Chen, Zhang, Yongquan, Li, Runze
In modern scientific research, massive datasets with huge numbers of observations are frequently encountered. To facilitate the computational process, a divide-and-conquer scheme is often used for the analysis of big data. In such a strategy, a full dataset is first split into several manageable segments; the final output is then averaged from the individual outputs of the segments. Despite its popularity in practice, it remains largely unknown whether such a distributed strategy provides valid theoretical inference for the original data. In this paper, we address this fundamental issue for distributed kernel regression (DKR), where the algorithmic feasibility is measured by the generalization performance of the resulting estimator. To justify DKR, a uniform convergence rate is needed to bound the generalization error over the individual outputs, which brings new and challenging issues in the big-data setup. Under mild conditions, we show that, with a proper number of segments, DKR leads to an estimator that is generalization consistent with the unknown regression function. The obtained results justify the DKR method and shed light on the feasibility of using other distributed algorithms for processing big data. The promising performance of the method is supported by both simulation and real-data examples.
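The divide-and-conquer scheme itself is easy to sketch in Python; below, scikit-learn's kernel ridge regression stands in for the generic kernel-based estimator studied in the paper, and `dkr_predict` is an illustrative name.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def dkr_predict(X, y, X_new, m=10, alpha=1.0, seed=0):
    """Distributed kernel regression sketch: split, fit per segment, average."""
    rng = np.random.default_rng(seed)
    segments = np.array_split(rng.permutation(len(X)), m)  # m random segments
    preds = []
    for idx in segments:
        model = KernelRidge(alpha=alpha, kernel="rbf").fit(X[idx], y[idx])
        preds.append(model.predict(X_new))
    return np.mean(preds, axis=0)  # average the m individual outputs
```

The theoretical question the paper addresses is precisely when this averaged estimator generalizes as well as one trained on the full dataset.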
Does generalization performance of $l^q$ regularization learning depend on $q$? A negative example
Lin, Shaobo, Xu, Chen, Zeng, Jinshan, Fang, Jian
$l^q$-regularization has been demonstrated to be an attractive technique in machine learning and statistical modeling. It attempts to improve the generalization (prediction) capability of a machine (model) by appropriately shrinking its coefficients. The shape of an $l^q$ estimator differs with varying choices of the regularization order $q$. In particular, $l^1$ leads to the LASSO estimate, while $l^{2}$ corresponds to smooth ridge regression. This makes the order $q$ a potential tuning parameter in applications. To facilitate the use of $l^{q}$-regularization, we seek a modeling strategy in which an elaborate selection of $q$ can be avoided. In this spirit, we place our investigation within a general framework of $l^{q}$-regularized kernel learning under a sample dependent hypothesis space (SDHS). For a designated class of kernel functions, we show that all $l^{q}$ estimators for $0 < q < \infty$ attain similar generalization error bounds. These bounds are almost optimal in the sense that, up to a logarithmic factor, the upper and lower bounds are asymptotically identical. This finding tentatively reveals that, in some modeling contexts, the choice of $q$ may not have a strong impact on the generalization capability. From this perspective, $q$ can be arbitrarily specified, or specified merely by non-generalization criteria such as smoothness, computational complexity, or sparsity.
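Concretely, the $l^q$-regularized kernel estimator over an SDHS takes the following standard form (our rendering; the paper's normalization may differ):
\[
\hat{f}_{z,\lambda,q} = \sum_{j=1}^{n} \hat{a}_j K(x_j, \cdot),
\qquad
\hat{a} = \arg\min_{a \in \mathbb{R}^n} \frac{1}{n} \sum_{i=1}^{n} \Bigl( y_i - \sum_{j=1}^{n} a_j K(x_j, x_i) \Bigr)^2 + \lambda \sum_{j=1}^{n} |a_j|^q,
\]
so that $q = 1$ recovers the LASSO-type estimate and $q = 2$ the ridge-type estimate mentioned above.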