AITopics

Movement primitives are an important policy class for real-world robotics. However, the high dimensionality of their parametrization makes policy optimization expensive in terms of both samples and computation. An efficient representation of movement primitives facilitates the application of machine learning techniques such as reinforcement learning to robotics. Motions, especially in highly redundant kinematic structures, exhibit high correlation in the configuration space. For these reasons, prior work has mainly focused on the application of dimensionality reduction techniques in the configuration space. In this paper, we investigate the application of dimensionality reduction in the parameter space, identifying principal movements. The resulting approach is enriched with a probabilistic treatment of the parameters, inheriting all the properties of the Probabilistic Movement Primitives. We test the proposed technique both on a real robotic task and on a database of complex human movements. The empirical analysis shows that dimensionality reduction in parameter space is more effective than in configuration space, as it enables the representation of the movements with significantly fewer parameters.

dimensionality reduction, movement primitive, reduction, (14 more...)

2003.02634

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (1.00)

Valavi, Hossein, Liu, Sulin, Ramadge, Peter J.

The Landscape of Matrix Factorization Revisited

We revisit the landscape of the simple matrix factorization problem. For low-rank matrix factorization, prior work has shown that there exist infinitely many critical points all of which are either global minima or strict saddles. At a strict saddle the minimum eigenvalue of the Hessian is negative. Of interest is whether this minimum eigenvalue is uniformly bounded below zero over all strict saddles. To answer this we consider orbits of critical points under the general linear group. For each orbit we identify a representative point, called a canonical point. If a canonical point is a strict saddle, so is every point on its orbit. We derive an expression for the minimum eigenvalue of the Hessian at each canonical strict saddle and use this to show that the minimum eigenvalue of the Hessian over the set of strict saddles is not uniformly bounded below zero. We also show that a known invariance property of gradient flow ensures the solution of gradient flow only encounters critical points on an invariant manifold $\mathcal{M}_C$ determined by the initial condition. We show that, in contrast to the general situation, the minimum eigenvalue of strict saddles in $\mathcal{M}_{0}$ is uniformly bounded below zero. We obtain an expression for this bound in terms of the singular values of the matrix being factorized. This bound depends on the size of the nonzero singular values and on the separation between distinct nonzero singular values of the matrix.
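As a rough illustration of the strict-saddle notion used in this abstract, the sketch below works through the scalar toy problem f(u, v) = 0.5 (uv - m)^2. This toy objective and the helper names (`gradient_at`, `hessian_at`, `min_eigenvalue_2x2`) are assumptions made here for illustration and are not taken from the paper. At the origin the gradient vanishes and the Hessian has minimum eigenvalue -m, so the origin is a strict saddle whose negative curvature shrinks as m shrinks.

```python
import math

def gradient_at(u, v, m):
    """Gradient of the toy objective f(u, v) = 0.5 * (u*v - m)**2."""
    r = u * v - m
    return (r * v, r * u)

def hessian_at(u, v, m):
    """Hessian of the same toy objective with respect to (u, v)."""
    off_diag = 2 * u * v - m
    return [[v * v, off_diag],
            [off_diag, u * u]]

def min_eigenvalue_2x2(h):
    """Smallest eigenvalue of a symmetric 2x2 matrix, in closed form."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    return (a + d) / 2 - math.sqrt(((a - d) / 2) ** 2 + b * b)

# The origin is a critical point for every m; its minimum Hessian
# eigenvalue is -m, which approaches zero as m does.
for m in (2.0, 0.5, 0.01):
    lam = min_eigenvalue_2x2(hessian_at(0.0, 0.0, m))
    print(f"m={m}: min Hessian eigenvalue at origin = {lam}")
```

In this toy case the curvature at the saddle is governed by the size of m, which plays the role of the single singular value; the paper's general bound is stated for matrices and is not reproduced here.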

critical point, eigenvalue, eigenvector, (15 more...)

2002.12795

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Adjakossa, Eric, Goude, Yannig, Wintenberger, Olivier

Kalman Recursions Aggregated Online

In this article, we aim to improve the prediction of expert aggregation by using the underlying properties of the models that provide expert predictions. We restrict ourselves to the case where expert predictions come from Kalman recursions, fitting state-space models. By using exponential weights, we construct different algorithms of Kalman recursions Aggregated Online (KAO) that compete with the best expert or the best convex combination of experts in a more or less adaptive way. We improve the existing results in the expert aggregation literature when the experts are Kalman recursions by taking advantage of the second-order properties of the Kalman recursions. We apply our approach to Kalman recursions and extend it to the general adversarial expert setting by state-space modeling the errors of the experts. We apply these new algorithms to a real dataset of electricity consumption and show how they can improve forecast performance compared to other exponentially weighted average procedures.
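For readers unfamiliar with exponential weights, the following is a minimal sketch of a generic exponentially weighted average forecaster under squared loss. It is not the KAO algorithm from the paper: the function names, the learning rate `eta`, and the choice of squared loss are assumptions made here for illustration, and the experts are plain number sequences rather than Kalman recursions.

```python
import math

def _weights(cum_loss, eta):
    """Normalized exponential weights computed from cumulative losses."""
    base = min(cum_loss)  # subtract the minimum for numerical stability
    raw = [math.exp(-eta * (loss - base)) for loss in cum_loss]
    total = sum(raw)
    return [r / total for r in raw]

def exponential_weights(expert_preds, outcomes, eta=0.5):
    """Aggregate expert forecasts online.

    expert_preds[t][k] is expert k's forecast at round t, and outcomes[t]
    is revealed only after the aggregated forecast for round t is issued.
    Returns the aggregated forecasts and the weights after the last round.
    """
    cum_loss = [0.0] * len(expert_preds[0])
    forecasts = []
    for preds, y in zip(expert_preds, outcomes):
        w = _weights(cum_loss, eta)
        forecasts.append(sum(wk * pk for wk, pk in zip(w, preds)))
        # Update each expert's cumulative squared loss with the new outcome.
        for k, pk in enumerate(preds):
            cum_loss[k] += (pk - y) ** 2
    return forecasts, _weights(cum_loss, eta)

# Two constant experts; the second one is always right.
forecasts, final_w = exponential_weights([[0.0, 1.0]] * 20, [1.0] * 20)
print(forecasts[0], forecasts[-1], final_w)
```

The aggregated forecast starts at the uniform average of the experts and shifts toward whichever expert accumulates the smaller loss.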

aggregation, kalman recursion, prediction, (15 more...)

2002.12173

Country:

Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.50)

Industry: Energy > Power Industry (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Multi-source Domain Adaptation in the Deep Learning Era: A Systematic Survey

Zhao, Sicheng, Li, Bo, Reed, Colorado, Xu, Pengfei, Keutzer, Kurt

In many practical applications, it is often difficult and expensive to obtain enough large-scale labeled data to train deep neural networks to their full capability. Therefore, transferring the learned knowledge from a separate, labeled source domain to an unlabeled or sparsely labeled target domain becomes an appealing alternative. However, direct transfer often results in significant performance decay due to domain shift. Domain adaptation (DA) addresses this problem by minimizing the impact of domain shift between the source and target domains. Multi-source domain adaptation (MDA) is a powerful extension in which the labeled data may be collected from multiple sources with different distributions. Due to the success of DA methods and the prevalence of multi-source data, MDA has attracted increasing attention in both academia and industry. In this survey, we define various MDA strategies and summarize available datasets for evaluation. We also compare modern MDA methods in the deep learning era, including latent space transformation and intermediate domain generation. Finally, we discuss future research directions for MDA.

adaptation, classification, domain adaptation, (17 more...)

2002.12169

Country:

North America > United States > Colorado (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Germany (0.04)
Asia > China (0.04)

Genre: Overview (1.00)

Industry: Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Infinitely Wide Graph Convolutional Networks: Semi-supervised Learning via Gaussian Processes

Hu, Jilin, Shen, Jianbing, Yang, Bin, Shao, Ling

Graph convolutional neural networks (GCNs) have recently demonstrated promising results on graph-based semi-supervised classification, but little work has been done to explore their theoretical properties. Recently, several deep neural networks, e.g., fully connected and convolutional neural networks, with infinite hidden units have been proved to be equivalent to Gaussian processes (GPs). To exploit both the powerful representational capacity of GCNs and the great expressive power of GPs, we investigate similar properties of infinitely wide GCNs. More specifically, we propose a GP regression model via GCNs (GPGC) for graph-based semi-supervised learning. In the process, we formulate the kernel matrix computation of GPGC in an iterative analytical form. Finally, we derive a conditional distribution for the labels of unobserved nodes based on the graph structure, labels for the observed nodes, and the feature matrix of all the nodes. We conduct extensive experiments to evaluate the semi-supervised classification performance of GPGC and demonstrate that it outperforms other state-of-the-art methods by a clear margin on all the datasets while being efficient.

dataset, learning, neural network, (16 more...)

2002.12168

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mannam, Varun, Kazemi, Arman

Performance Analysis of Semi-supervised Learning in the Small-data Regime using VAEs

Extracting large amounts of data from biological samples is not feasible due to radiation issues, and image processing in the small-data regime is one of the critical challenges when working with a limited amount of data. In this work, we apply an existing algorithm, the Variational Autoencoder (VAE), which pre-trains a latent space representation of the data to capture the features in a lower dimension for small-data regime input. The fine-tuned latent space provides constant weights that are useful for classification. Here we present the performance analysis of the VAE algorithm with different latent space sizes in semi-supervised learning using the CIFAR-10 dataset.

dataset, latent space, semi-supervised learning, (13 more...)

2002.12164

Country:

North America > United States > Indiana > St. Joseph County > Notre Dame (0.31)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Colorado (0.04)
Asia (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

A Comprehensive Approach to Unsupervised Embedding Learning based on AND Algorithm

Han, Sungwon, Xu, Yizhan, Park, Sungwon, Cha, Meeyoung, Li, Cheng-Te

Unsupervised embedding learning aims to extract good representations from data without the need for any manual labels, which has been a critical challenge in many supervised learning tasks. This paper proposes a new unsupervised embedding approach, called Super-AND, which extends the current state-of-the-art model [11]. Super-AND has its own set of losses that can gather similar samples nearby within a low-density space while keeping invariant features intact against data augmentation. Super-AND outperforms all existing approaches and achieves an accuracy of 89.2% on the image classification task for CIFAR-10. We discuss the practical implications of this method in assisting semi-supervised tasks.

learning, neighborhood, super-and, (14 more...)

2002.12158

Country:

North America > United States > Colorado > El Paso County > Colorado Springs (0.04)
North America > United States > California (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Madhavan, Ramanujam, Wadhwa, Mohit

Fairness-Aware Learning with Prejudice Free Representations

Machine learning models are being used extensively to make decisions that have a significant impact on human life. These models are trained over historical data that may contain information about sensitive attributes such as race, sex, religion, etc. The presence of such sensitive attributes can impact certain population subgroups unfairly. It is straightforward to remove sensitive features from the data; however, a model could pick up prejudice from latent sensitive attributes that may exist in the training data. This has led to growing apprehension about the fairness of the employed models. In this paper, we propose a novel algorithm that can effectively identify and treat latent discriminating features. The approach is agnostic of the learning algorithm and generalizes well for classification as well as regression tasks. It can also be used as a key aid in proving that the model is free of discrimination for regulatory compliance if the need arises. The approach helps to collect discrimination-free features that would improve the model performance while ensuring the fairness of the model. The experimental results from our evaluations on publicly available real-world datasets show a near-ideal fairness measurement in comparison to other methods.

algorithm, dataset, prejudice, (12 more...)

2002.12143

Country:

North America > Honduras (0.04)
Europe > United Kingdom > Scotland (0.04)
Europe > Hungary (0.04)

Genre: Research Report (0.65)

Industry:

Law (0.66)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Giaquinto, Robert, Banerjee, Arindam

Gradient Boosted Flows

Normalizing flows (NF) are a powerful framework for approximating posteriors. By mapping a simple base density through invertible transformations, flows provide an exact method of density evaluation and sampling. The trend in normalizing flow literature has been to devise deeper, more complex transformations to achieve greater flexibility. We propose an alternative: Gradient Boosted Flows (GBF) model a variational posterior by successively adding new NF components by gradient boosting so that each new NF component is fit to the residuals of the previously trained components. The GBF formulation results in a variational posterior that is a mixture model, whose flexibility increases as more components are added. Moreover, GBFs offer a wider, not deeper, approach that can be incorporated to improve the results of many existing NFs. We demonstrate the effectiveness of this technique for density estimation and, by coupling GBF with a variational autoencoder, generative modeling of images.
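The two ingredients named in this abstract, exact density evaluation through an invertible map and a mixture that grows by one component at a time, can be sketched in one dimension. The code below is an illustrative assumption, not the GBF training procedure: it uses a fixed affine map as the simplest possible flow, a hypothetical mixing weight `rho`, and a convex-combination update that the abstract does not spell out. No component is fit to residuals here.

```python
import math

def affine_flow_logpdf(x, scale, shift):
    """Exact log density of x = scale * z + shift with z ~ N(0, 1).

    Change of variables: log p(x) = log N(z; 0, 1) - log|scale|.
    """
    z = (x - shift) / scale
    log_base = -0.5 * z * z - 0.5 * math.log(2 * math.pi)
    return log_base - math.log(abs(scale))

def mixture_pdf(x, components, weights):
    """Density of a mixture whose components are affine flows."""
    return sum(w * math.exp(affine_flow_logpdf(x, s, b))
               for w, (s, b) in zip(weights, components))

def add_component(components, weights, new_component, rho):
    """Assumed convex update: q_new = (1 - rho) * q_old + rho * g_new."""
    new_weights = [(1 - rho) * w for w in weights] + [rho]
    return components + [new_component], new_weights

# Start from a single standard-normal component, then add a second one.
components, weights = add_component([(1.0, 0.0)], [1.0], (0.5, 3.0), rho=0.3)
print(weights, mixture_pdf(0.0, components, weights))
```

Because each update rescales the old weights by (1 - rho), the weights stay normalized and the result remains a valid density however many components are added.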

gradient, neural information processing system, posterior, (14 more...)

2002.11896

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Metz, Luke, Maheswaranathan, Niru, Sun, Ruoxi, Freeman, C. Daniel, Poole, Ben, Sohl-Dickstein, Jascha

Using a thousand optimization tasks to learn hyperparameter search strategies

We present TaskSet, a dataset of tasks for use in training and evaluating optimizers. TaskSet is unique in its size and diversity, containing over a thousand tasks ranging from image classification with fully connected or convolutional neural networks, to variational autoencoders, to non-volume preserving flows on a variety of datasets. As an example application of such a dataset, we explore meta-learning an ordered list of hyperparameters to try sequentially. By learning this hyperparameter list from data generated using TaskSet, we achieve large speedups in sample efficiency over random search. Next, we use the diversity of TaskSet and our method for learning hyperparameter lists to empirically explore the generalization of these lists to new optimization tasks in a variety of settings, including ImageNet classification with ResNet50 and LM1B language modeling with transformers. As part of this work, we have open-sourced code for all tasks, as well as ~29 million training curves for these problems and the corresponding hyperparameters.

hyperparameter, optimization task, optimizer, (13 more...)

2002.11887

Country:

North America > United States > Oregon (0.04)
North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)