AITopics | Peng, Hao

Collaborating Authors

Peng, Hao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Hierarchical Taxonomy-Aware and Attentional Graph Capsule RCNNs for Large-Scale Multi-Label Text Classification

Peng, Hao, Li, Jianxin, Gong, Qiran, Wang, Senzhang, He, Lifang, Li, Bo, Wang, Lihong, Yu, Philip S.

arXiv.org Machine LearningJun-9-2019

CNNs, RNNs, GCNs, and CapsNets have shown significant insights in representation learning and are widely used in various text mining tasks such as large-scale multi-label text classification. However, most existing deep models for multi-label text classification consider either the non-consecutive and long-distance semantics or the sequential semantics, but how to consider them both coherently is less studied. In addition, most existing methods treat output labels as independent methods, but ignore the hierarchical relations among them, leading to useful semantic information loss. In this paper, we propose a novel hierarchical taxonomy-aware and attentional graph capsule recurrent CNNs framework for large-scale multi-label text classification. Specifically, we first propose to model each document as a word order preserved graph-of-words and normalize it as a corresponding words-matrix representation which preserves both the non-consecutive, long-distance and local sequential semantics. Then the words-matrix is input to the proposed attentional graph capsule recurrent CNNs for more effectively learning the semantic features. To leverage the hierarchical relations among the class labels, we propose a hierarchical taxonomy embedding method to learn their representations, and define a novel weighted margin loss by incorporating the label representation similarity. Extensive evaluations on three datasets show that our model significantly improves the performance of large-scale multi-label text classification by comparing with state-of-the-art approaches.

chess, deep learning, neural network, (23 more...)

arXiv.org Machine Learning

1906.04898

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games > Chess (1.00)
Health & Medicine (0.67)
Government (0.67)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Dynamic Network Embedding via Incremental Skip-gram with Negative Sampling

Peng, Hao, Li, Jianxin, Yan, Hao, Gong, Qiran, Wang, Senzhang, Liu, Lin, Wang, Lihong, Ren, Xiang

arXiv.org Machine LearningJun-9-2019

Network representation learning, as an approach to learn low dimensional representations of vertices, has attracted considerable research attention recently. It has been proven extremely useful in many machine learning tasks over large graph. Most existing methods focus on learning the structural representations of vertices in a static network, but cannot guarantee an accurate and efficient embedding in a dynamic network scenario. To address this issue, we present an efficient incremental skip-gram algorithm with negative sampling for dynamic network embedding, and provide a set of theoretical analyses to characterize the performance guarantee. Specifically, we first partition a dynamic network into the updated, including addition/deletion of links and vertices, and the retained networks over time. Then we factorize the objective function of network embedding into the added, vanished and retained parts of the network. Next we provide a new stochastic gradient-based method, guided by the partitions of the network, to update the nodes and the parameter vectors. The proposed algorithm is proven to yield an objective function value with a bounded difference to that of the original objective function. Experimental results show that our proposal can significantly reduce the training time while preserving the comparable performance. We also demonstrate the correctness of the theoretical analysis and the practical usefulness of the dynamic network embedding. We perform extensive experiments on multiple real-world large network datasets over multi-label classification and link prediction tasks to evaluate the effectiveness and efficiency of the proposed framework, and up to 22 times speedup has been achieved.

artificial intelligence, dynamic network, optimization problem, (19 more...)

arXiv.org Machine Learning

1906.03586

Country:

Asia > China (0.30)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)

Add feedback

Improving Orbit Prediction Accuracy through Supervised Machine Learning

Peng, Hao, Bai, Xiaoli

arXiv.org Machine LearningJan-15-2018

Due to the lack of information such as the space environment condition and resident space objects' (RSOs') body characteristics, current orbit predictions that are solely grounded on physics-based models may fail to achieve required accuracy for collision avoidance and have led to satellite collisions already. This paper presents a methodology to predict RSOs' trajectories with higher accuracy than that of the current methods. Inspired by the machine learning (ML) theory through which the models are learned based on large amounts of observed data and the prediction is conducted without explicitly modeling space objects and space environment, the proposed ML approach integrates physics-based orbit prediction algorithms with a learning-based process that focuses on reducing the prediction errors. Using a simulation-based space catalog environment as the test bed, the paper demonstrates three types of generalization capability for the proposed ML approach: 1) the ML model can be used to improve the same RSO's orbit information that is not available during the learning process but shares the same time interval as the training data; 2) the ML model can be used to improve predictions of the same RSO at future epochs; and 3) the ML model based on a RSO can be applied to other RSOs that share some common features.

artificial intelligence, machine learning, svm model, (15 more...)

arXiv.org Machine Learning

doi: 10.1016/j.asr.2018.03.001

1801.04856

Country:

Europe (0.93)
North America > United States > Massachusetts > Middlesex County (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Government (0.93)
Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Asynchronous Distributed Variational Gaussian Processes for Regression

Peng, Hao, Zhe, Shandian, Qi, Yuan

arXiv.org Machine LearningJun-12-2017

Gaussian processes (GPs) are powerful non-parametric function estimators. However, their applications are largely limited by the expensive computational cost of the inference procedures. Existing stochastic or distributed synchronous variational inferences, although have alleviated this issue by scaling up GPs to millions of samples, are still far from satisfactory for real-world large applications, where the data sizes are often orders of magnitudes larger, say, billions. To solve this problem, we propose ADVGP, the first Asynchronous Distributed Variational Gaussian Process inference for regression, on the recent large-scale machine learning platform, PARAMETERSERVER. ADVGP uses a novel, flexible variational framework based on a weight space augmentation, and implements the highly efficient, asynchronous proximal gradient optimization. While maintaining comparable or better predictive performance, ADVGP greatly improves upon the efficiency of the existing variational methods. With ADVGP, we effortlessly scale up GP regression to a real-world application with billions of samples and demonstrate an excellent, superior prediction accuracy to the popular linear models.

advgp, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1704.06735

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Incrementally Learning the Hierarchical Softmax Function for Neural Language Models

Peng, Hao ( Beihang University ) | Li, Jianxin (Beihang University) | Song, Yangqiu ( Hong Kong University of Science and Technology ) | Liu, Yaopeng ( Beihang University )

AAAI ConferencesFeb-14-2017

Neural network language models (NNLMs) have attracted a lot of attention recently. In this paper, we present a training method that can incrementally train the hierarchical softmax function for NNMLs. We split the cost function to model old and update corpora separately, and factorize the objective function for the hierarchical softmax. Then we provide a new stochastic gradient based method to update all the word vectors and parameters, by comparing the old tree generated based on the old corpus and the new tree generated based on the combined (old and update) corpus. Theoretical analysis shows that the mean square error of the parameter vectors can be bounded by a function of the number of changed words related to the parameter node. Experimental results show that incremental training can save a lot of time. The smaller the update corpus is, the faster the update training process is, where an up to 30 times speedup has been achieved. We also use both word similarity/relatedness tasks and dependency parsing task as our benchmarks to evaluate the correctness of the updated word vectors.

neural network, us government, vector, (21 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia > China (0.47)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.89)

Add feedback