AITopics

In high-dimensional data space, semi-supervised feature learning based on Euclidean distance shows instability under a broad set of conditions. Furthermore, the scarcity and high cost of labels prompt us to explore new semi-supervised learning methods with the fewest labels. In this paper, we develop a novel Minor Constraint Disturbances-based Deep Semi-supervised Feature Learning framework (MCD-DSFL) from the perspective of probability distribution for feature representation. There are two fundamental modules in the proposed framework: one is a Minor Constraint Disturbances-based restricted Boltzmann machine with Gaussian visible units (MCDGRBM) for modelling continuous data and the other is a Minor Constraint Disturbances-based restricted Boltzmann machine (MCDRBM) for modelling binary data. The Minor Constraint Disturbances (MCD) consist of less instance-level constraints which are produced by only two randomly selected labels from each class. The Kullback-Leibler (KL) divergences of the MCD are fused into the Contrastive Divergence (CD) learning for training the proposed MCDGRBM and MCDRBM models. Then, the probability distributions of hidden layer features are as similar as possible in the same class and they are as dissimilar as possible in the different classes simultaneously. Despite the weak influence of the MCD for our shallow models (MCDGRBM and MCDRBM), the proposed deep MCD-DSFL framework improves the representation capability significantly under its leverage effect. The semi-supervised strategy based on the KL divergence of the MCD significantly reduces the reliance on the labels and improves the stability of the semi-supervised feature learning in high-dimensional space simultaneously.

algorithm, ieee transaction, mcd-dsfl framework, (12 more...)

2003.06321

Country:

Asia > China > Sichuan Province > Chengdu (0.04)
Asia > Macao (0.04)
Europe > Belgium (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)

Santamaria, Maria, Izquierdo, Ebroul, Blasi, Saverio, Mrak, Marta

Estimation of Rate Control Parameters for Video Coding Using CNN

Rate-control is essential to ensure efficient video delivery. Typical rate-control algorithms rely on bit allocation strategies, to appropriately distribute bits among frames. As reference frames are essential for exploiting temporal redundancies, intra frames are usually assigned a larger portion of the available bits. In this paper, an accurate method to estimate number of bits and quality of intra frames is proposed, which can be used for bit allocation in a rate-control scheme. The algorithm is based on deep learning, where networks are trained using the original frames as inputs, while distortions and sizes of compressed frames after encoding are used as ground truths. Two approaches are proposed where either local or global distortions are predicted.

cnn, distortion, distortion map, (13 more...)

doi: 10.1109/VCIP.2018.8698721

2003.06315

Country:

Europe > United Kingdom > England > Greater London > London (0.05)
North America > United States > Montana (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Tang, Yunhao, Valko, Michal, Munos, Rémi

Taylor Expansion Policy Optimization

In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor expansion policy optimization, a policy optimization formalism that generalizes prior work (e.g., TRPO) as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications which improve the performance of several state-of-the-art distributed algorithms.

expansion, taylor expansion, trajectory, (16 more...)

2003.06259

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Darlow, Luke Nicholas, Storkey, Amos

What Information Does a ResNet Compress?

The information bottleneck principle (Shwartz-Ziv & Tishby, 2017) suggests that SGD-based training of deep neural networks results in optimally compressed hidden layers, from an information theoretic perspective. However, this claim was established on toy data. The goal of the work we present here is to test whether the information bottleneck principle is applicable to a realistic setting using a larger and deeper convolutional architecture, a ResNet model. We trained PixelCNN++ models as inverse representation decoders to measure the mutual information between hidden layers of a ResNet and input image data, when trained for (1) classification and (2) autoencoding. We find that two stages of learning happen for both training regimes, and that compression does occur, even for an autoencoder. Sampling images by conditioning on hidden layers' activations offers an intuitive visualisation to understand what a ResNets learns to forget.

compression, information, neural network, (15 more...)

2003.06254

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Fabbiani, Emanuele, Nahata, Pulkit, De Nicolao, Giuseppe, Ferrari-Trecate, Giancarlo

Identification of AC Networks via Online Learning

With the advent of renewable energy resources, generation in power networks is drifting from the classical centralized paradigm to an increasingly distributed scenario. While offering many advantages, renewable-based generation can compromise grid reliability, due to its intermittent nature and creation of reverse power flows. In order to guarantee the safe operation of power systems and avoid dangerous phenomena like blackouts, innovative and efficient control algorithms are necessary. Nevertheless, advanced algorithms necessitate grid identification, that is, the knowledge of grid topology and line parameters. Most works on the identification of electric networks focus on topology verification, assuming a known initial topology and aiming at detecting sparse changes, such as line trips or switch activations [1, 2]. More recently, attention has shifted to the estimation of network topology and line parameters without any apriori information. Two main branches of research have appeared. On the one hand, works like [3, 4] propose learning algorithms that exploit the statistical properties of nodal measurements to determine the operational structure and the line impedances. These approaches have the major advantage of accounting for buses with no available measurements (hidden nodes) [4], although restrictive assumptions are required, e.g.

admittance matrix, algorithm, matrix, (14 more...)

2003.0621

Country:

North America > United States > Illinois (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Italy (0.04)

Genre: Research Report (0.50)

Industry:

Energy > Renewable (1.00)
Energy > Power Industry (1.00)
Education > Educational Setting > Online (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.41)

Blanco-Mallo, E., Remeseiro, B., Bolón-Canedo, V., Alonso-Betanzos, A.

On the effectiveness of convolutional autoencoders on image-based personalized recommender systems

Recommender systems (RS) are increasingly present in our daily lives, especially since the advent of Big Data, which allows for storing all kinds of information about users' preferences. Personalized RS are successfully applied in platforms such as Netflix, Amazon or YouTube. However, they are missing in gastronomic platforms such as TripAdvisor, where moreover we can find millions of images tagged with users' tastes. This paper explores the potential of using those images as sources of information for modeling users' tastes and proposes an image-based classification system to obtain personalized recommendations, using a convolutional autoencoder as feature extractor. The proposed architecture will be applied to TripAdvisor data, using users' reviews that can be defined as a triad composed by a user, a restaurant, and an image of it taken by the user. Since the dataset is highly unbalanced, the use of data augmentation on the minority class is also considered in the experimentation. Results on data from three cities of different sizes (Santiago de Compostela, Barcelona and New York) demonstrate the effectiveness of using a convolutional autoencoder as feature extractor, instead of the standard deep features computed with convolutional neural networks.

autoencoder, convolutional autoencoder, recommendation, (16 more...)

2003.06205

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.26)
North America > United States > New York (0.26)
Europe > Spain > Galicia > A Coruña Province > Santiago de Compostela (0.24)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology > Services (0.66)
Consumer Products & Services > Restaurants (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study

Dauber, Assaf, Feder, Meir, Koren, Tomer, Livni, Roi

The notion of implicit bias, or implicit regularization, has been suggested as a means to explain the surprising generalization ability of modern-days overparameterized learning algorithms. This notion refers to the tendency of the optimization algorithm towards a certain structured solution that often generalizes well. Recently, several papers have studied implicit regularization and were able to identify this phenomenon in various scenarios. We revisit this paradigm in arguably the simplest non-trivial setup, and study the implicit bias of Stochastic Gradient Descent (SGD) in the context of Stochastic Convex Optimization. As a first step, we provide a simple construction that rules out the existence of a \emph{distribution-independent} implicit regularizer that governs the generalization ability of SGD. We then demonstrate a learning problem that rules out a very general class of \emph{distribution-dependent} implicit regularizers from explaining generalization, which includes strongly convex regularizers as well as non-degenerate norm-based regularizations. Certain aspects of our constructions point out to significant difficulties in providing a comprehensive explanation of an algorithm's generalization performance by solely arguing about its implicit regularization properties.

construction, regularization, regularizer, (15 more...)

2003.06152

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Duan, Tiehang, Chauhan, Mihir, Shaikh, Mohammad Abuzar, Srihari, Sargur

Ultra Efficient Transfer Learning with Meta Update for Cross Subject EEG Classification

Electroencephalogram (EEG) signal is widely used in brain computer interfaces (BCI), the pattern of which differs significantly across different subjects, and poses a major challenge for real world application of EEG classifiers. We found an efficient transfer learning method, named Meta UPdate Strategy (MUPS), boosts cross subject classification performance of EEG signals, and only need a small amount of data from target subject. The model tackles the problem with a two step process: (1) extract versatile features that are effective across all source subjects, and (2) adapt the model to target subject. The proposed model, which originates from meta learning, aims to find feature representation that is broadly suitable for different subjects, and maximizes sensitivity of the loss function on new subject such that one or a small number of gradient steps can lead to effective adaptation. The method can be applied to all deep learning oriented models. We performed extensive experiments on two public datasets, the proposed MUPS model outperforms current state of the arts by a large margin on accuracy and AUC-ROC when only a small amount of target data is used.

classification, dataset, target subject, (13 more...)

2003.06113

Country: North America > United States > New York > Erie County > Buffalo (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Learning Graph Embedding with Limited Labeled Data: An Efficient Sampling Approach

Li, Qirui, Liu, Xiaoming, Shen, Chao, Peng, Xi, Zhou, Yadong, Guan, Xiaohong

Semi-supervised graph embedding methods represented by graph convolutional network has become one of the most popular methods for utilizing deep learning approaches to process the graph-based data for applications. Mostly existing work focus on designing novel algorithm structure to improve the performance, but ignore one common training problem, i.e., could these methods achieve the same performance with limited labelled data? To tackle this research gap, we propose a sampling-based training framework for semi-supervised graph embedding methods to achieve better performance with smaller training data set. The key idea is to integrate the sampling theory and embedding methods by a pipeline form, which has the following advantages: 1) the sampled training data can maintain more accurate graph characteristics than uniformly chosen data, which eliminates the model deviation; 2) the smaller scale of training data is beneficial to reduce the human resource cost to label them; The extensive experiments show that the sampling-based method can achieve the same performance only with 10$\%$-50$\%$ of the scale of training data. It verifies that the framework could extend the existing semi-supervised methods to the scenarios with the extremely small scale of labelled data.

graph, node, training data, (14 more...)

2003.061

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Yang, Liu, Meng, Xuhui, Karniadakis, George Em

B-PINNs: Bayesian Physics-Informed Neural Networks for Forward and Inverse PDE Problems with Noisy Data

We propose a Bayesian physics-informed neural network (B-PINN) to solve both forward and inverse nonlinear problems described by partial differential equations (PDEs) and noisy data. In this Bayesian framework, the Bayesian neural network (BNN) combined with a PINN for PDEs serves as the prior while the Hamiltonian Monte Carlo (HMC) or the variational inference (VI) could serve as an estimator of the posterior. B-PINNs make use of both physical laws and scattered noisy measurements to provide predictions and quantify the aleatoric uncertainty arising from the noisy data in the Bayesian framework. Compared with PINNs, in addition to uncertainty quantification, B-PINNs obtain more accurate predictions in scenarios with large noise due to their capability of avoiding overfitting. We conduct a systematic comparison between the two different approaches for the B-PINN posterior estimation (i.e., HMC or VI), along with dropout used for quantifying uncertainty in deep neural networks. Our experiments show that HMC is more suitable than VI for the B-PINNs posterior estimation, while dropout employed in PINNs can hardly provide accurate predictions with reasonable uncertainty. Finally, we replace the BNN in the prior with a truncated Karhunen-Lo\`eve (KL) expansion combined with HMC or a deep normalizing flow (DNF) model as posterior estimators. The KL is as accurate as BNN and much faster but this framework cannot be easily extended to high-dimensional problems unlike the BNN based framework.

b-pinn-hmc, neural network, standard deviation, (14 more...)

2003.06097

Country:

North America > United States > Washington > Benton County > Richland (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)