AITopics

2010.09225

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Hokkaidō (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)

arXiv.org Machine LearningOct-14-2020

Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College

Li, Chen, Peng, Xutan, Peng, Hao, Li, Jianxin, Wang, Lihong, Yu, Philip S., He, Lifang

Recently, graph-based algorithms have drawn much attention because of their impressive success in semi-supervised setups. For better model performance, previous studies learn to transform the topology of the input graph. However, these works only focus on optimizing the original nodes and edges, leaving the direction of augmenting existing data unexplored. In this paper, by simulating the generation process of graph signals, we propose a novel heuristic pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph. Substantially enlarging the original training set with high-quality generated labeled data, our framework can effectively benefit downstream models. To justify the generality and practicality of ELCO, we couple it with the popular Graph Convolution Network and Graph Attention Network to perform extensive evaluations on three standard datasets. In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points, as well as consistently outperforms the state-of-the-art. We release our code and data on https://github.com/RingBDStack/ELCO to guarantee reproducibility.

artificial intelligence, machine learning, node, (17 more...)

2006.06469

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.70)
Government > Voting & Elections (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.51)

arXiv.org Artificial IntelligenceOct-14-2020

MS$^2$L: Multi-Task Self-Supervised Learning for Skeleton Based Action Recognition

Lin, Lilang, Song, Sijie, Yan, Wenhan, Liu, Jiaying

In this paper, we address self-supervised representation learning from human skeletons for action recognition. Previous methods, which usually learn feature presentations from a single reconstruction task, may come across the overfitting problem, and the features are not generalizable for action recognition. Instead, we propose to integrate multiple tasks to learn more general representations in a self-supervised manner. To realize this goal, we integrate motion prediction, jigsaw puzzle recognition, and contrastive learning to learn skeleton features from different aspects. Skeleton dynamics can be modeled through motion prediction by predicting the future sequence. And temporal patterns, which are critical for action recognition, are learned through solving jigsaw puzzles. We further regularize the feature space by contrastive learning. Besides, we explore different training strategies to utilize the knowledge from self-supervised tasks for action recognition. We evaluate our multi-task self-supervised learning approach with action classifiers trained under different configurations, including unsupervised, semi-supervised and fully-supervised settings. Our experiments on the NW-UCLA, NTU RGB+D, and PKUMMD datasets show remarkable performance for action recognition, demonstrating the superiority of our method in learning more discriminative and general features. Our project website is available at https://langlandslin.github.io/projects/MSL/.

artificial intelligence, machine learning, recognition, (16 more...)

doi: 10.1145/3394171.3413548

2010.05599

Country:

North America > United States > Washington > King County > Seattle (0.04)
Asia > China > Guangxi Province > Nanning (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Reich, Steven, Mueller, David, Andrews, Nicholas

Ensemble Distillation for Structured Prediction: Calibrated, Accurate, Fast---Choose Three

arXiv.org Machine LearningOct-13-2020

Modern neural networks do not always produce well-calibrated predictions, even when trained with a proper scoring function such as cross-entropy. In classification settings, simple methods such as isotonic regression or temperature scaling may be used in conjunction with a held-out dataset to calibrate model outputs. However, extending these methods to structured prediction is not always straightforward or effective; furthermore, a held-out calibration set may not always be available. In this paper, we study ensemble distillation as a general framework for producing well-calibrated structured prediction models while avoiding the prohibitive inference-time cost of ensembles. We validate this framework on two tasks: named-entity recognition and machine translation. We find that, across both tasks, ensemble distillation produces models which retain much of, and occasionally improve upon, the performance and calibration benefits of ensembles, while only requiring a single model during test-time.

artificial intelligence, machine learning, natural language, (20 more...)

2010.06721

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Texas (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (0.93)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.91)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

arXiv.org Machine LearningOct-13-2020

Theory and Algorithms for Shapelet-based Multiple-Instance Learning

Suehiro, Daiki, Hatano, Kohei, Takimoto, Eiji, Yamamoto, Shuji, Bannai, Kenichi, Takeda, Akiko

We propose a new formulation of Multiple-Instance Learning (MIL), in which a unit of data consists of a set of instances called a bag. The goal is to find a good classifier of bags based on the similarity with a "shapelet" (or pattern), where the similarity of a bag with a shapelet is the maximum similarity of instances in the bag. In previous work, some of the training instances are chosen as shapelets with no theoretical justification. In our formulation, we use all possible, and thus infinitely many shapelets, resulting in a richer class of classifiers. We show that the formulation is tractable, that is, it can be reduced through Linear Programming Boosting (LPBoost) to Difference of Convex (DC) programs of finite (actually polynomial) size. Our theoretical result also gives justification to the heuristics of some of the previous work. The time complexity of the proposed algorithm highly depends on the size of the set of all instances in the training sample. To apply to the data containing a large number of instances, we also propose a heuristic option of the algorithm without the loss of the theoretical guarantee. Our empirical study demonstrates that our algorithm uniformly works for Shapelet Learning tasks on time-series classification and various MIL tasks with comparable accuracy to the existing methods. Moreover, we show that the proposed heuristics allow us to achieve the result with reasonable computational time.

artificial intelligence, machine learning, shapelet, (17 more...)

doi: 10.1162/neco_a_01297

2006.0113

Country:

Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.05)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Li, Qifei, Zhou, Wangchunshu

Connecting the Dots Between Fact Verification and Fake News Detection

arXiv.org Artificial IntelligenceOct-11-2020

Fact verification models have enjoyed a fast advancement in the last two years with the development of pre-trained language models like BERT and the release of large scale datasets such as FEVER. However, the challenging problem of fake news detection has not benefited from the improvement of fact verification models, which is closely related to fake news detection. In this paper, we propose a simple yet effective approach to connect the dots between fact verification and fake news detection. Our approach first employs a text summarization model pre-trained on news corpora to summarize the long news article into a short claim. Then we use a fact verification model pre-trained on the FEVER dataset to detect whether the input news article is real or fake. Our approach makes use of the recent success of fact verification models and enables zero-shot fake news detection, alleviating the need of large-scale training data to train fake news detection models. Experimental results on FakenewsNet, a benchmark dataset for fake news detection, demonstrate the effectiveness of our proposed approach.

artificial intelligence, machine learning, natural language, (16 more...)

2010.05202

Genre: Research Report (0.82)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.33)

Karamcheti, Siddharth, Sadigh, Dorsa, Liang, Percy

Learning Adaptive Language Interfaces through Decomposition

arXiv.org Artificial IntelligenceOct-11-2020

Our goal is to create an interactive natural language interface that efficiently and reliably learns from users to complete tasks in simulated robotics settings. We introduce a neural semantic parsing system that learns new high-level abstractions through decomposition: users interactively teach the system by breaking down high-level utterances describing novel behavior into low-level steps that it can understand. Unfortunately, existing methods either rely on grammars which parse sentences with limited flexibility, or neural sequence-to-sequence models that do not learn efficiently or reliably from individual examples. Our approach bridges this gap, demonstrating the flexibility of modern neural systems, as well as the one-shot reliable generalization of grammar-based methods. Our crowdsourced interactive experiments suggest that over time, users complete complex tasks more efficiently while using our system by leveraging what they just taught. At the same time, getting users to trust the system enough to be incentivized to teach high-level utterances is still an ongoing challenge. We end with a discussion of some of the obstacles we need to overcome to fully realize the potential of the interactive paradigm.

machine learning, natural language, utterance, (16 more...)

2010.0519

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Sweden > Skåne County > Malmö (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

arXiv.org Artificial IntelligenceOct-10-2020

Automated Concatenation of Embeddings for Structured Prediction

Wang, Xinyu, Jiang, Yong, Bach, Nguyen, Wang, Tao, Huang, Zhongqiang, Huang, Fei, Tu, Kewei

Pretrained contextualized embeddings are powerful word representations for structured prediction tasks. Recent work found that better word representations can be obtained by concatenating different types of embeddings. However, the selection of embeddings to form the best concatenated representation usually varies depending on the task and the collection of candidate embeddings, and the ever-increasing number of embedding types makes it a more difficult problem. In this paper, we propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks, based on a formulation inspired by recent progress on neural architecture search. Specifically, a controller alternately samples a concatenation of embeddings, according to its current belief of the effectiveness of individual embedding types in consideration for a task, and updates the belief based on a reward. We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model, which is fed with the sampled concatenation as input and trained on a task dataset. Empirical results on 6 tasks and 23 datasets show that our approach outperforms strong baselines and achieves state-of-the-art performance with fine-tuned embeddings in the vast majority of evaluations.

computational linguistic, machine learning, natural language, (22 more...)

2010.05006

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China > Hong Kong (0.04)
(17 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.82)

Robinson, Joshua, Chuang, Ching-Yao, Sra, Suvrit, Jegelka, Stefanie

Contrastive Learning with Hard Negative Samples

arXiv.org Machine LearningOct-9-2020

We consider the question: how can you sample good negative examples for contrastive learning? We argue that, as with metric learning, learning contrastive representations benefits from hard negative samples (i.e., points that are difficult to distinguish from an anchor point). The key challenge toward using hard negatives is that contrastive methods must remain unsupervised, making it infeasible to adopt existing negative sampling strategies that use label information. In response, we develop a new class of unsupervised methods for selecting hard negative samples where the user can control the amount of hardness. A limiting case of this sampling results in a representation that tightly clusters each class, and pushes different classes as far apart as possible. The proposed method improves downstream performance across multiple modalities, requires only few additional lines of code to implement, and introduces no computational overhead.

learning, objective, representation, (13 more...)

2010.04592

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.35)

arXiv.org Machine LearningOct-9-2020

xOrder: A Model Agnostic Post-Processing Framework for Achieving Ranking Fairness While Maintaining Algorithm Utility

Cui, Sen, Pan, Weishen, Zhang, Changshui, Wang, Fei

Algorithmic fairness has received lots of interests in machine learning recently. In this paper, we focus on the bipartite ranking scenario, where the instances come from either the positive or negative class and the goal is to learn a ranking function that ranks positive instances higher than negative ones. In an unfair setting, the probabilities of ranking the positives higher than negatives are different across different protected groups. We propose a general post-processing framework, xOrder, for achieving fairness in bipartite ranking while maintaining the algorithm classification performance. In particular, we optimize a weighted sum of the utility and fairness by directly adjusting the relative ordering across groups. We formulate this problem as identifying an optimal warping path across {different} protected groups and solve it through a dynamic programming process. xOrder is compatible with various classification models and applicable to a variety of ranking fairness metrics. We evaluate our proposed algorithm on four benchmark data sets and two real world patient electronic health record repository. The experimental results show that our approach can achieve great balance between the algorithm utility and ranking fairness. Our algorithm can also achieve robust performance when training and testing ranking score distributions are significantly different.

artificial intelligence, inductive learning, machine learning, (20 more...)

2006.08267

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Oregon > Lane County > Eugene (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)