AITopics

2007.02056

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks

Mao, Huanru Henry

Deep neural networks are typically trained under a supervised learning framework where a model learns a single task using labeled data. Instead of relying solely on labeled data, practitioners can harness unlabeled or related data to improve model performance, which is often more accessible and ubiquitous. Self-supervised pre-training for transfer learning is becoming an increasingly popular technique to improve state-of-the-art results using unlabeled data. It involves first pre-training a model on a large amount of unlabeled data, then adapting the model to target tasks of interest. In this review, we survey self-supervised learning methods and their applications within the sequential transfer learning framework. We provide an overview of the taxonomy for self-supervised learning and transfer learning, and highlight some prominent methods for designing pre-training tasks across different domains. Finally, we discuss recent trends and suggest areas for future investigation.

artificial intelligence, inductive learning, machine learning, (15 more...)

2007.008

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(9 more...)

Genre: Overview (1.00)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Min, So Yeon, Raghavan, Preethi, Szolovits, Peter

TransINT: Embedding Implication Rules in Knowledge Graphs with Isomorphic Intersections of Linear Subspaces

Knowledge Graphs (KG), composed of entities and relations, provide a structured representation of knowledge. For easy access to statistical approaches on relational data, multiple methods to embed a KG into f(KG) $\in$ R^d have been introduced. We propose TransINT, a novel and interpretable KG embedding method that isomorphically preserves the implication ordering among relations in the embedding space. Given implication rules, TransINT maps set of entities (tied by a relation) to continuous sets of vectors that are inclusion-ordered isomorphically to relation implications. With a novel parameter sharing scheme, TransINT enables automatic training on missing but implied facts without rule grounding. On a benchmark dataset, we outperform the best existing state-of-the-art rule integration embedding methods with significant margins in link Prediction and triple Classification. The angles between the continuous sets embedded by TransINT provide an interpretable way to mine semantic relatedness and implication rules among relations.

machine learning, natural language, relation, (18 more...)

2007.00271

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Colorado > Denver County > Denver (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.61)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Schultheis, Erik, Qaraei, Mohammadreza, Gupta, Priyanshu, Babbar, Rohit

Unbiased Loss Functions for Extreme Classification With Missing Labels

The goal in extreme multi-label classification (XMC) is to tag an instance with a small subset of relevant labels from an extremely large set of possible labels. In addition to the computational burden arising from large number of training instances, features and labels, problems in XMC are faced with two statistical challenges, (i) large number of 'tail-labels' -- those which occur very infrequently, and (ii) missing labels as it is virtually impossible to manually assign every relevant label to an instance. In this work, we derive an unbiased estimator for general formulation of loss functions which decompose over labels, and then infer the forms for commonly used loss functions such as hinge- and squared-hinge-loss and binary cross-entropy loss. We show that the derived unbiased estimators, in the form of appropriate weighting factors, can be easily incorporated in state-of-the-art algorithms for extreme classification, thereby scaling to datasets with hundreds of thousand labels. However, empirically, we find a slightly altered version that gives more relative weight to tail labels to perform even better. We suspect is due to the label imbalance in the dataset, which is not explicitly addressed by our theoretically derived estimator. Minimizing the proposed loss functions leads to significant improvement over existing methods (up to 20% in some cases) on benchmark datasets in XMC.

artificial intelligence, classification, machine learning, (14 more...)

2007.00237

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
Asia > India > Uttar Pradesh > Kanpur (0.04)

Genre: Research Report (0.50)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.67)

Bello, Kevin, Honorio, Jean

Fairness constraints can help exact inference in structured prediction

Many inference problems in structured prediction can be modeled as maximizing a score function on a space of labels, where graphs are a natural representation to decompose the total score into a sum of unary (nodes) and pairwise (edges) scores. Given a generative model with an undirected connected graph $G$ and true vector of binary labels, it has been previously shown that when $G$ has good expansion properties, such as complete graphs or $d$-regular expanders, one can exactly recover the true labels (with high probability and in polynomial time) from a single noisy observation of each edge and node. We analyze the previously studied generative model by Globerson et al. (2015) under a notion of statistical parity. That is, given a fair binary node labeling, we ask the question whether it is possible to recover the fair assignment, with high probability and in polynomial time, from single edge and node observations. We find that, in contrast to the known trade-offs between fairness and model performance, the addition of the fairness constraint improves the probability of exact recovery. We effectively explain this phenomenon and empirically show how graphs with poor expansion properties, such as grids, are now capable to achieve exact recovery with high probability. Finally, as a byproduct of our analysis, we provide a tighter minimum-eigenvalue bound than that of Weyl's inequality.

artificial intelligence, constraint, machine learning, (17 more...)

2007.00218

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

arXiv.org Machine LearningJun-30-2020

Learning De-biased Representations with Biased Representations

Bahng, Hyojin, Chun, Sanghyuk, Yun, Sangdoo, Choo, Jaegul, Oh, Seong Joon

Many machine learning algorithms are trained and evaluated by splitting data from a single source into training and test sets. While such focus on in-distribution learning scenarios has led to interesting advancement, it has not been able to tell if models are relying on dataset biases as shortcuts for successful prediction (e.g., using snow cues for recognising snowmobiles), resulting in biased models that fail to generalise when the bias shifts to a different class. The cross-bias generalisation problem has been addressed by de-biasing training data through augmentation or re-sampling, which are often prohibitive due to the data collection cost (e.g., collecting images of a snowmobile on a desert) and the difficulty of quantifying or expressing biases in the first place. In this work, we propose a novel framework to train a de-biased representation by encouraging it to be different from a set of representations that are biased by design. This tactic is feasible in many scenarios where it is much easier to define a set of biased representations than to define and quantify bias. We demonstrate the efficacy of our method across a variety of synthetic and real-world biases; our experiments show that the method discourages models from taking bias shortcuts, resulting in improved generalisation. Source code is available at https://github.com/clovaai/rebias.

accuracy, recognition, representation, (14 more...)

1910.02806

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningJun-30-2020

Subject-Aware Contrastive Learning for Biosignals

Cheng, Joseph Y., Goh, Hanlin, Dogrusoz, Kaan, Tuzel, Oncel, Azemi, Erdrin

Datasets for biosignals, such as electroencephalogram (EEG) and electrocardiogram (ECG), often have noisy labels and have limited number of subjects (<100). To handle these challenges, we propose a self-supervised approach based on contrastive learning to model biosignals with a reduced reliance on labeled data and with fewer subjects. In this regime of limited labels and subjects, intersubject variability negatively impacts model performance. Thus, we introduce subject-aware learning through (1) a subject-specific contrastive loss, and (2) an adversarial training to promote subject-invariance during the self-supervised learning. We also develop a number of time-series data augmentation techniques to be used with the contrastive loss for biosignals. Our method is evaluated on publicly available datasets of two different biosignals with different tasks: EEG decoding and ECG anomaly detection. The embeddings learned using self-supervision yield competitive classification results compared to entirely supervised methods. We show that subject-invariance improves representation quality for these tasks, and observe that subject-specific loss increases performance when fine-tuning with supervised labels.

artificial intelligence, inductive learning, machine learning, (20 more...)

2007.04871

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Sun, Zhiqing, Yang, Yiming

An EM Approach to Non-autoregressive Conditional Sequence Generation

arXiv.org Machine LearningJun-29-2020

Autoregressive (AR) models have been the dominating approach to conditional sequence generation, but are suffering from the issue of high inference latency. Non-autoregressive (NAR) models have been recently proposed to reduce the latency by generating all output tokens in parallel but could only achieve inferior accuracy compared to their autoregressive counterparts, primarily due to a difficulty in dealing with the multi-modality in sequence generation. This paper proposes a new approach that jointly optimizes both AR and NAR models in a unified Expectation-Maximization (EM) framework. In the E-step, an AR model learns to approximate the regularized posterior of the NAR model. In the M-step, the NAR model is updated on the new posterior and selects the training examples for the next AR model. This iterative process can effectively guide the system to remove the multi-modality in the output sequences. To our knowledge, this is the first EM approach to NAR sequence generation. We evaluate our method on the task of machine translation. Experimental results on benchmark data sets show that the proposed approach achieves competitive, if not better, performance with existing NAR models and significantly reduces the inference latency.

arxiv preprint arxiv, nar model, translation, (10 more...)

2006.16378

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > Austria > Vienna (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.05)
Asia > Thailand > Krabi > Krabi (0.05)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.86)

#artificialintelligenceJun-26-2020, 13:00:54 GMT

Google Brain's SimCLRv2 Achieves New SOTA in Semi-Supervised Learning

Following on the February release of its contrastive learning framework SimCLR, the same team of Google Brain researchers guided by Turing Award honouree Dr. Geoffrey Hinton has presented SimCLRv2, an upgraded approach that boosts the SOTA results by 21.6 percent. The updated framework takes the "unsupervised pretrain, supervised fine-tune" paradigm popular in natural language processing and applies it to image recognition. Unlabelled data is learned in a task-agnostic way in the pretraining phase, which means the model has no prior classification knowledge. The researchers find that using a deep and wide neural network can be more label-efficient and greatly improve accuracy. Unlike SimCLR, whose largest model is ResNet-50, SimCLRv2's largest model is a 152-layer ResNet, which is three times wider in channels and selective kernels.

accuracy, artificial intelligence, machine learning, (9 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.40)

Levine, Alexander, Feizi, Soheil

Deep Partition Aggregation: Provable Defense against General Poisoning Attacks

arXiv.org Machine LearningJun-25-2020

Adversarial poisoning attacks distort training data in order to corrupt the test-time behavior of a classifier. A provable defense provides a certificate for each test sample, which is a lower bound on the magnitude of any adversarial distortion of the training set that can corrupt the test sample's classification. We propose two provable defenses against poisoning attacks: (i) Deep Partition Aggregation (DPA), a certified defense against a general poisoning threat model, defined as the insertion or deletion of a bounded number of samples to the training set -- by implication, this threat model also includes arbitrary distortions to a bounded number of images and/or labels; and (ii) Semi-Supervised DPA (SS-DPA), a certified defense against label-flipping poisoning attacks. DPA is an ensemble method where base models are trained on partitions of the training set determined by a hash function. DPA is related to subset aggregation, a well-studied ensemble method in classical machine learning. DPA can also be viewed as an extension of randomized ablation (Levine & Feizi, 2020a), a certified defense against sparse evasion attacks, to the poisoning domain. Our label-flipping defense, SS-DPA, uses a semi-supervised learning algorithm as its base classifier model: we train each base classifier using the entire unlabeled training set in addition to the labels for a partition. SS-DPA outperforms the existing certified defense for label-flipping attacks (Rosenfeld et al., 2020). SS-DPA certifies >= 50% of test images against 675 label flips (vs. < 200 label flips with the existing defense) on MNIST and 83 label flips on CIFAR-10. Against general poisoning attacks (no prior certified defense), DPA certifies >= 50% of test images against > 500 poison image insertions on MNIST, and nine insertions on CIFAR-10. These results establish new state-of-the-art provable defenses against poison attacks.

artificial intelligence, classifier, machine learning, (19 more...)

2006.14768

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)