AITopics

In this paper, we aim to construct a deep neural network which embeds high dimensional symmetric positive definite (SPD) matrices into a more discriminative low dimensional SPD manifold. To this end, we develop two types of basic layers: a 2D fully connected layer which reduces the dimensionality of the SPD matrices, and a symmetrically clean layer which achieves non-linear mapping. Specifically, we extend the classical fully connected layer such that it is suitable for SPD matrices, and we further show that SPD matrices with symmetric pair elements setting zero operations are still symmetric positive definite. Finally, we complete the construction of the deep neural network for SPD manifold learning by stacking the two layers. Experiments on several face datasets demonstrate the effectiveness of the proposed method.

artificial intelligence, machine learning, matrix, (19 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: North America > United States (0.28)

Industry: Education (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Latent Dependency Forest Models

Chu, Shanbo (ShanghaiTech University) | Jiang, Yong (ShanghaiTech University) | Tu, Kewei (ShanghaiTech University)

Probabilistic modeling is one of the foundations of modern Learning the structure of a probabilistic model resembles machine learning and artificial intelligence, which aims to learning the set of production rules of a grammar, while compactly represent the joint probability distribution of random learning model parameters resembles learning grammar rule variables. The most widely used approach for probabilistic probabilities. From the unsupervised grammar learning literature, modeling is probabilistic graphical models. A probabilistic one can see that learning approaches based on PCFGs graphical model represents a probability distribution with a have not been very successful, while the state-of-the-art performance directed or undirected graph. It represents random variables has mostly been achieved based on less expressive with the nodes in the graph and uses the edges in the graph to models such as dependency grammars (DGs) (Klein and encode the probabilistic relationships between random variables.

artificial intelligence, machine learning, natural language, (19 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.68)
Asia (0.47)

Industry: Education > Curriculum > Subject-Specific Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)

Ji, Guoliang (National Laboratory of Pattern Recognition (NLPR), Institute of Automation Chinese Academy of Sciences) | Liu, Kang (National Laboratory of Pattern Recognition (NLPR), Institute of Automation Chinese Academy of Sciences) | He, Shizhu (National Laboratory of Pattern Recognition (NLPR), Institute of Automation Chinese Academy of Sciences) | Zhao, Jun (National Laboratory of Pattern Recognition (NLPR), Institute of Automation Chinese Academy of Sciences)

Distant Supervision for Relation Extraction with Sentence-Level Attention and Entity Descriptions

Distant supervision for relation extraction is an efficient method to scale relation extraction to very large corpora which contains thousands of relations. However, the existing approaches have flaws on selecting valid instances and lack of background knowledge about the entities. In this paper, we propose a sentence-level attention model to select the valid instances, which makes full use of the supervision information from knowledge bases. And we extract entity descriptions from Freebase and Wikipedia pages to supplement background knowledge for our task. The background knowledge not only provides more information for predicting relations, but also brings better entity representations for the attention module. We conduct three experiments on a widely used dataset and the experimental results show that our approach outperforms all the baseline systems significantly.

machine learning, natural language, relation, (17 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: North America > United States > Nevada (0.16)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Zhou, Yuxun (Unviersity of California, Berkeley) | Kang, Zhaoyi (Unviersity of California, Berkeley) | Spanos, Costas J. (Unviersity of California, Berkeley)

Parametric Dual Maximization for Non-Convex Learning Problems

We consider a class of non-convex learning problems that can be formulated as jointly optimizing regularized hinge loss and a set of auxiliary variables. Such problems encompass but are not limited to various versions of semi-supervised learning,learning with hidden structures, robust learning, etc. Existing methods either suffer from local minima or have to invoke anon-scalable combinatorial search. In this paper, we propose a novel learning procedure, namely Parametric Dual Maximization(PDM), that can approach global optimality efficiently with user specified approximation levels. The building blocks of PDM are two new results: (1) The equivalent convex maximization reformulation derived by parametric analysis.(2) The improvement of local solutions based on a necessary and sufficient condition for global optimality. Experimental results on two representative applications demonstrate the effectiveness of PDM compared to other approaches.

artificial intelligence, machine learning, optimization problem, (14 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Genre: Research Report > New Finding (0.34)

Industry: Education > Focused Education > Special Education (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)

Zhang, Yu (Hong Kong University of Science and Technology) | Yang, Qiang (Hong Kong University of Science and Technology)

Learning Sparse Task Relations in Multi-Task Learning

In multi-task learning, when the number of tasks is large, pairwise task relations exhibit sparse patterns since usually a task cannot be helpful to all of the other tasks and moreover, sparse task relations can reduce the risk of overfitting compared with the dense ones. In this paper, we focus on learning sparse task relations. Based on a regularization framework which can learn task relations among multiple tasks, we propose a SParse covAriance based mulTi-taSk (SPATS) model to learn a sparse covariance by using the ℓ l regularization. The resulting objective function of the SPATS method is convex, which allows us to devise an alternating method to solve it. Moreover, some theoretical properties of the proposed model are studied. Experiments on synthetic and real-world datasets demonstrate the effectiveness of the proposed method.

artificial intelligence, machine learning, task relation, (17 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Industry:

Health & Medicine (0.46)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Framework of Online Learning with Imbalanced Streaming Data

Yan, Yan (University of Technology Sydney) | Yang, Tianbao (The University of Iowa) | Yang, Yi (University of Technology Sydney) | Chen, Jianhui (Yahoo! Labs)

A challenge for mining large-scale streaming data overlooked by most existing studies on online learning is the skew-distribution of examples over different classes. Many previous works have considered cost-sensitive approaches in an online setting for streaming data, where fixed costs are assigned to different classes, or ad-hoc costs are adapted based on the distribution of data received so far. However, it is not necessary for them to achieve optimal performance in terms of the measures suited for imbalanced data, such as F-measure, area under ROC curve (AUROC), area under precision and recall curve (AUPRC). This work proposes a general framework for online learning with imbalanced streaming data, where examples are coming sequentially and models are updated accordingly on-the-fly. By simultaneously learning multiple classifiers with different cost vectors, the proposed method can be adopted for different target measures for imbalanced data, including F-measure, AUROC and AUPRC. Moreover, we present a rigorous theoretical justification of the proposed framework for the F-measure maximization. Our empirical studies demonstrate the competitive if not better performance of the proposed method compared to previous cost-sensitive and resampling based online learning algorithms and those that are designed for optimizing certain measures.

algorithm, artificial intelligence, machine learning, (16 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: North America > United States > Iowa (0.28)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Tan, Ben (Hong Kong University of Science and Technology) | Zhang, Yu (Hong Kong University of Science and Technology) | Pan, Sinno Jialin (Nanyang Technological University) | Yang, Qiang (Hong Kong University of Science and Technology)

Distant Domain Transfer Learning

In this paper, we study a novel transfer learning problem termed Distant Domain Transfer Learning (DDTL). Different from existing transfer learning problems which assume that there is a close relation between the source domain and the target domain, in the DDTL problem, the target domain can be totally different from the source domain. For example, the source domain classifies face images but the target domain distinguishes plane images. Inspired by the cognitive processof human where two seemingly unrelated concepts can be connected by learning intermediate concepts gradually, we propose a Selective Learning Algorithm (SLA) to solve the DDTL problem with supervised autoencoder or supervised convolutional autoencoder as a base model for handling different types of inputs. Intuitively, the SLA algorithm selects usefully unlabeled data gradually from intermediate domains as a bridge to break the large distribution gap for transferring knowledge between two distant domains. Empirical studies on image classification problems demonstrate the effectiveness of the proposed algorithm, and on some tasks the improvement in terms of the classification accuracy is up to 17% over “non-transfer” methods.

artificial intelligence, machine learning, target domain, (19 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Genre: Overview (0.46)

Industry: Education > Focused Education > Special Education (0.44)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Where to Add Actions in Human-in-the-Loop Reinforcement Learning

Mandel, Travis (University of Washington) | Liu, Yun-En (Enlearn) | Brunskill, Emma (Carnegie Mellon University) | Popović, Zoran (University of Washington)

In order for reinforcement learning systems to learn quickly in vast action spaces such as the space of all possible pieces of text or the space of all images, leveraging human intuition and creativity is key. However, a human-designed action space is likely to be initially imperfect and limited; furthermore, humans may improve at creating useful actions with practice or new information. Therefore, we propose a framework in which a human adds actions to a reinforcement learning system over time to boost performance. In this setting, however, it is key that we use human effort as efficiently as possible, and one significant danger is that humans waste effort adding actions at places (states) that aren't very important. Therefore, we propose Expected Local Improvement (ELI), an automated method which selects states at which to query humans for a new action. We evaluate ELI on a variety of simulated domains adapted from the literature, including domains with over a million actions and domains where the simulated experts change over time. We find ELI demonstrates excellent empirical performance, even in settings where the synthetic "experts" are quite poor.

machine learning, reinforcement, reinforcement learning, (15 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: North America > United States (0.68)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Efficient Online Model Adaptation by Incremental Simplex Tableau

Lei, Zhixian (Harward University) | Ye, Xuehan (Renmin University of China) | Wang, Yongcai (Renmin University of China) | Li, Deying (Renmin University of China) | Xu, Jia (Hunter College, City University of New York)

Online multi-kernel learning is promising in the era of mobile computing, in which a combined classifier with multiple kernels are offline trained, and online adapts to personalized features for serving the end user precisely and smartly. The online adaptation is mainly carried out at the end-devices, which requires the adaptation algorithms to be light, efficient and accurate. Previous results focused mainly on efficiency. This paper proposes an novel online model adaptation framework for not only efficiency but also optimal online adaptation. At first, an online optimal incremental simplex tableau (IST)algorithm is proposed, which approaches the model adaption by linear programming and produces the optimized model update in each step when a personalized training data is collected.But keeping online optimal in each step is expensive and may cause over-fitting especially when the online data is noisy. A Fast-IST approach is therefore proposed, which measures the deviation between the training data and the current model. It schedules updating only when enough deviation is detected. The efficiency of each update is further enhanced by running IST only limited iterations, which bounds the computation complexity. Theoretical analysis and extensive evaluations show that Fast-IST saves computation cost greatly, while achieving speedy and accurate model adaptation.It provides better model adaptation speed and accuracy while using even lower computing cost than the state-of-the art.

algorithm, artificial intelligence, machine learning, (16 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.47)
Asia (0.28)

Industry: Education > Educational Setting > Online (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Learning Unitary Operators with Help From u(n)

Hyland, Stephanie L. (ETH Zurich ) | Rätsch, Gunnar (ETH Zurich)

A major challenge in the training of recurrent neural networks is the so-called vanishing or exploding gradient problem. The use of a norm-preserving transition operator can address this issue, but parametrization is challenging. In this work we focus on unitary operators and describe a parametrization using the Lie algebra u( n ) associated with the Lie group U ( n ) of n × n unitary matrices. The exponential map provides a correspondence between these spaces, and allows us to define a unitary matrix using n 2 real coefficients relative to a basis of the Lie algebra. The parametrization is closed under additive updates of these coefficients, and thus provides a simple space in which to do gradient descent. We demonstrate the effectiveness of this parametrization on the problem of learning arbitrary unitary operators, comparing to several baselines and outperforming a recently-proposed lower-dimensional parametrization. We additionally use our parametrization to generalize a recently-proposed unitary recurrent neural network to arbitrary unitary matrices, using it to solve standard long-memory tasks.

artificial intelligence, machine learning, matrix, (19 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: Europe > Switzerland (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)