AITopics

doi: 10.3390/s21113606

2106.00988

Country:

Europe > Romania > Centru Development Region > Brașov County > Brașov (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.95)

Thorne, James, Yazdani, Majid, Saeidi, Marzieh, Silvestri, Fabrizio, Riedel, Sebastian, Halevy, Alon

Database Reasoning Over Text

arXiv.org Artificial IntelligenceJun-2-2021

Neural models have shown impressive performance gains in answering queries from natural language text. However, existing works are unable to support database queries, such as "List/Count all female athletes who were born in 20th century", which require reasoning over sets of relevant facts with operations such as join, filtering and aggregation. We show that while state-of-the-art transformer models perform very well for small databases, they exhibit limitations in processing noisy data, numerical operations, and queries that aggregate facts. We propose a modular architecture to answer these database-style queries over multiple spans from text and aggregating these at scale. We evaluate the architecture using WikiNLDB, a novel dataset for exploring such queries. Our architecture scales to databases containing thousands of facts whereas contemporary models are limited by how many facts can be encoded. In direct comparison on small databases, our approach increases overall answer accuracy from 85% to 90%. On larger databases, our approach retains its accuracy whereas transformer baselines could not encode the context.

database, operator, query, (15 more...)

2106.01074

Country:

Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)
South America > Uruguay > Montevideo > Montevideo (0.04)
North America > United States > North Carolina (0.04)
(7 more...)

Genre:

Personal (0.46)
Research Report (0.40)

Industry: Leisure & Entertainment > Sports > Tennis (0.46)

Technology:

Information Technology > Databases (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.70)
(2 more...)

arXiv.org Artificial IntelligenceJun-2-2021

Few-Shot Partial-Label Learning

Zhao, Yunfeng, Yu, Guoxian, Liu, Lei, Yan, Zhongmin, Cui, Lizhen, Domeniconi, Carlotta

Partial-label learning (PLL) generally focuses on inducing a noise-tolerant multi-class classifier by training on overly-annotated samples, each of which is annotated with a set of labels, but only one is the valid label. A basic promise of existing PLL solutions is that there are sufficient partial-label (PL) samples for training. However, it is more common than not to have just few PL samples at hand when dealing with new tasks. Furthermore, existing few-shot learning algorithms assume precise labels of the support set; as such, irrelevant labels may seriously mislead the meta-learner and thus lead to a compromised performance. How to enable PLL under a few-shot learning setting is an important problem, but not yet well studied. In this paper, we introduce an approach called FsPLL (Few-shot PLL). FsPLL first performs adaptive distance metric learning by an embedding network and rectifying prototypes on the tasks previously encountered. Next, it calculates the prototype of each class of a new task in the embedding network. An unseen example can then be classified via its distance to each prototype. Experimental results on widely-used few-shot datasets (Omniglot and miniImageNet) demonstrate that our FsPLL can achieve a superior performance than the state-of-the-art methods across different settings, and it needs fewer samples for quickly adapting to new tasks.

fspll, pl sample, prototype, (15 more...)

2106.00984

Country:

North America > United States (0.04)
Asia > China > Shandong Province > Jinan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Veitch, Victor, D'Amour, Alexander, Yadlowsky, Steve, Eisenstein, Jacob

Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests

arXiv.org Machine LearningJun-1-2021

Informally, a `spurious correlation' is the dependence of a model on some aspect of the input data that an analyst thinks shouldn't matter. In machine learning, these have a know-it-when-you-see-it character; e.g., changing the gender of a sentence's subject changes a sentiment predictor's output. To check for spurious correlations, we can `stress test' models by perturbing irrelevant parts of input data and seeing if model predictions change. In this paper, we study stress testing using the tools of causal inference. We introduce \emph{counterfactual invariance} as a formalization of the requirement that changing irrelevant parts of the input shouldn't change model predictions. We connect counterfactual invariance to out-of-domain model performance, and provide practical schemes for learning (approximately) counterfactual invariant predictors (without access to counterfactual examples). It turns out that both the means and implications of counterfactual invariance depend fundamentally on the true underlying causal structure of the data. Distinct causal structures require distinct regularization schemes to induce counterfactual invariance. Similarly, counterfactual invariance implies different domain shift guarantees depending on the underlying causal structure. This theory is supported by empirical results on text classification.

causal structure, counterfactual invariance, predictor, (10 more...)

2106.00545

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment (0.93)
Media > Film (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Zhou, Kaiyang, Loy, Chen Change, Liu, Ziwei

Semi-Supervised Domain Generalization with Stochastic StyleMatch

arXiv.org Artificial IntelligenceJun-1-2021

Most existing research on domain generalization assumes source data gathered from multiple domains are fully annotated. However, in real-world applications, we might have only a few labels available from each source domain due to high annotation cost, along with abundant unlabeled data that are much easier to obtain. In this work, we investigate semi-supervised domain generalization (SSDG), a more realistic and practical setting. Our proposed approach, StyleMatch, is inspired by FixMatch, a state-of-the-art semi-supervised learning method based on pseudo-labeling, with several new ingredients tailored to solve SSDG. Specifically, 1) to mitigate overfitting in the scarce labeled source data while improving robustness against noisy pseudo labels, we introduce stochastic modeling to the classifier's weights, seen as class prototypes, with Gaussian distributions. 2) To enhance generalization under domain shift, we upgrade FixMatch's two-view consistency learning paradigm based on weak and strong augmentations to a multi-view version with style augmentation as the third complementary view. To provide a comprehensive study and evaluation, we establish two SSDG benchmarks, which cover a wide range of strong baseline methods developed in relevant areas including domain generalization and semi-supervised learning. Extensive experiments demonstrate that StyleMatch achieves the best out-of-distribution generalization performance in the low-data regime. We hope our approach and benchmarks can pave the way for future research on data-efficient and generalizable learning systems.

domain generalization, generalization, latexit sha1, (17 more...)

2106.00592

Country: Asia > Singapore (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Machine LearningMay-29-2021

Graph Self Supervised Learning: the BT, the HSIC, and the VICReg

Nag, Sayan

Self-supervised learning and pre-training strategies have developed over the last few years especially for Convolutional Neural Networks (CNNs). Recently application of such methods can also be noticed for Graph Neural Networks (GNNs). In this paper, we have used a graph based self-supervised learning strategy with different loss functions (Barlow Twins[ 7], HSIC[ 4], VICReg[ 1]) which have shown promising results when applied with CNNs previously. We have also proposed a hybrid loss function combining the advantages of VICReg and HSIC and called it as VICRegHSIC. The performance of these aforementioned methods have been compared when applied to two different datasets namely MUTAG and PROTEINS. Moreover, the impact of different batch sizes, projector dimensions and data augmentation strategies have also been explored. The results are preliminary and we will be continuing to explore with other datasets.

augmentation, learning, loss function, (11 more...)

2105.12247

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Kalai, Adam Tauman, Kanade, Varun

Towards optimally abstaining from prediction

arXiv.org Machine LearningMay-28-2021

A common challenge across all areas of machine learning is that training data is not distributed like test data, due to natural shifts, "blind spots," or adversarial examples. We consider a model where one may abstain from predicting, at a fixed cost. In particular, our transductive abstention algorithm takes labeled training examples and unlabeled test examples as input, and provides predictions with optimal prediction loss guarantees. The loss bounds match standard generalization bounds when test examples are i.i.d. from the training distribution, but add an additional term that is the cost of abstaining times the statistical distance between the train and test distribution (or the fraction of adversarial examples). For linear regression, we give a polynomial-time algorithm based on Celis-Dennis-Tapia optimization algorithms. For binary classification, we show how to efficiently implement it using a proper agnostic learner (i.e., an Empirical Risk Minimizer) for the class of interest. Our work builds on a recent abstention algorithm of Goldwasser, Kalais, and Montasser (2020) for transductive binary classification.

algorithm, classification, probability, (17 more...)

2105.14119

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

arXiv.org Machine LearningMay-28-2021

Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot Meta-Learning

Ding, Nan, Chen, Xi, Levinboim, Tomer, Goodman, Sebastian, Soricut, Radu

Despite recent advances in its theoretical understanding, there still remains a significant gap in the ability of existing PAC-Bayesian theories on meta-learning to explain performance improvements in the few-shot learning setting, where the number of training examples in the target tasks is severely limited. This gap originates from an assumption in the existing theories which supposes that the number of training examples in the observed tasks and the number of training examples in the target tasks follow the same distribution, an assumption that rarely holds in practice. By relaxing this assumption, we develop two PAC-Bayesian bounds tailored for the few-shot learning setting and show that two existing meta-learning algorithms (MAML and Reptile) can be derived from our bounds, thereby bridging the gap between practice and PAC-Bayesian theories. Furthermore, we derive a new computationally-efficient PACMAML algorithm, and show it outperforms existing meta-learning algorithms on several few-shot benchmark datasets.

algorithm, pacmaml, target task, (12 more...)

2105.14099

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

#artificialintelligenceMay-27-2021, 03:27:24 GMT

Semi-Supervised Learning

Semi-supervised learning allows neural networks to mimic human inductive logic and sort unknown information fast and accurately without human intervention. Any problem where you have a large amount of input data but only a few reference points available is a good candidate semi-supervised learning. A classic example is a photo archive with millions of random images. Instead of manually labeling each picture, a human searching for images of people can just tag a few relevant samples from the database. Then the neural network can scour the databank and find every image it believes represents a human.

neural network, semi-supervised learning

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)

Parmar, Jitendra, Chouhan, Satyendra Singh, Rathore, Santosh Singh

Open-world Machine Learning: Applications, Challenges, and Opportunities

arXiv.org Artificial IntelligenceMay-27-2021

Traditional machine learning especially supervised learning follows the assumptions of closed-world learning i.e., for each testing class a training class is available. However, such machine learning models fail to identify the classes which were not available during training time. These classes can be referred to as unseen classes. Whereas, open-world machine learning deals with arbitrary inputs (data with unseen classes) to machine learning systems. Moreover, traditional machine learning is static learning which is not appropriate for an active environment where the perspective and sources, and/or volume of data are changing rapidly. In this paper, first, we present an overview of open-world learning with importance to the real-world context. Next, different dimensions of open-world learning are explored and discussed. The area of open-world learning gained the attention of the research community in the last decade only. We have searched through different online digital libraries and scrutinized the work done in the last decade. This paper presents a systematic review of various techniques for open-world machine learning. It also presents the research gaps, challenges, and future directions in open-world learning. This paper will help researchers to understand the comprehensive developments of open-world learning and the likelihoods to extend the research in suitable areas. It will also help to select applicable methodologies and datasets to explore this further.

classification, learning, open-world learning, (13 more...)

2105.13448

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Oceania > Australia > New South Wales (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(4 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (0.68)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)