AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

arXiv.org Artificial Intelligence, May-10-2021

Improving Factual Consistency of Abstractive Summarization via Question Answering

Nan, Feng, Santos, Cicero Nogueira dos, Zhu, Henghui, Ng, Patrick, McKeown, Kathleen, Nallapati, Ramesh, Zhang, Dejiao, Wang, Zhiguo, Arnold, Andrew O., Xiang, Bing

A commonly observed problem with state-of-the-art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents. The fact that automatic summarization may produce plausible-sounding yet inaccurate summaries is a major concern that limits its wide application. In this paper we present an approach to address factual consistency in summarization. We first propose an efficient automatic evaluation metric to measure factual consistency; next, we propose a novel learning algorithm that maximizes the proposed metric during model training. Through extensive experiments, we confirm that our method is effective in improving factual consistency and even overall quality of the summaries, as judged by both automatic metrics and human evaluation.

factual consistency, input document, summarization, (11 more...)

2105.04623

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Italy (0.05)
Africa > South Africa (0.05)
(11 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)
Leisure & Entertainment > Sports > Boxing (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

#artificialintelligence, May-6-2021, 14:15:27 GMT

10 Deadly Sins of ML Model Training

During model training, the loss-epoch curve sometimes keeps bouncing around and does not converge, regardless of the number of epochs. There is no silver bullet, as there are multiple root causes to investigate: bad training examples, missing truths, changing data distributions, or too high a learning rate. The most common one I have seen is bad training examples, caused by a combination of anomalous data and incorrect labels. In other cases the model seems to be converging, but then the loss suddenly increases significantly, i.e., the loss decreases and then rises sharply over the epochs. There are multiple reasons for this kind of exploding loss.

accuracy, deadly sin, ml model training, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

arXiv.org Artificial Intelligence, Apr-30-2021

Efficient Non-Sampling Knowledge Graph Embedding

Li, Zelong, Ji, Jianchao, Fu, Zuohui, Ge, Yingqiang, Xu, Shuyuan, Chen, Chong, Zhang, Yongfeng

A knowledge graph (KG) is a flexible structure that is able to describe the complex relationships between data entities. Currently, most KG embedding models are trained based on negative sampling, i.e., the model aims to maximize some similarity of the connected entities in the KG, while minimizing the similarity of the sampled disconnected entities. Negative sampling helps to reduce the time complexity of model learning by only considering a subset of negative instances, but it may fail to deliver stable model performance due to the uncertainty in the sampling procedure. To avoid such deficiency, we propose a new framework for KG embedding -- Efficient Non-Sampling Knowledge Graph Embedding (NS-KGE). The basic idea is to consider all of the negative instances in the KG for model learning, and thus to avoid negative sampling. The framework can be applied to square-loss based knowledge graph embedding models or models whose loss can be converted to a square loss. A natural side-effect of this non-sampling strategy is the increased computational complexity of model learning. To solve the problem, we leverage mathematical derivations to reduce the complexity of the non-sampling loss function, which eventually provides us both better efficiency and better accuracy in KG embedding compared with existing models. Experiments on benchmark datasets show that our NS-KGE framework can achieve better efficiency and accuracy than traditional negative sampling based models, and that the framework is applicable to a large class of knowledge graph embedding models.
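The abstract does not spell out its derivation, so the following is only a sketch of one standard way an all-pairs square-loss term can be made cheap. It assumes a DistMult-style score s(h, r, t) = sum_d h_d * r_d * t_d, which is an assumption for illustration and not necessarily the paper's formulation. Under that assumption, the sum of squared scores over every (head, tail) pair can be computed from two small Gram matrices instead of an explicit loop over all pairs. The function names are hypothetical.

```python
def naive_sq_sum(heads, tails, rel):
    """Sum of squared scores over ALL (head, tail) pairs: O(|H| * |T| * d)."""
    total = 0.0
    for h in heads:
        for t in tails:
            score = sum(hd * rd * td for hd, rd, td in zip(h, rel, t))
            total += score * score
    return total


def gram(vectors):
    """d x d matrix G[i][j] = sum over vectors x of x[i] * x[j]."""
    d = len(vectors[0])
    return [[sum(x[i] * x[j] for x in vectors) for j in range(d)]
            for i in range(d)]


def factored_sq_sum(heads, tails, rel):
    """Same quantity via Gram matrices: O((|H| + |T|) * d^2), no pair loop."""
    gh, gt = gram(heads), gram(tails)
    d = len(rel)
    return sum(rel[i] * rel[j] * gh[i][j] * gt[i][j]
               for i in range(d) for j in range(d))
```

Both functions return the same value; the second one never enumerates entity pairs, which is what makes training against all negatives affordable when the embedding dimension is much smaller than the number of entities.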

artificial intelligence, inductive learning, machine learning, (19 more...)

doi: 10.1145/3442381.3449859

2104.10796

Country:

North America > United States (0.46)
Europe (0.30)

Genre: Research Report (0.82)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)

#artificialintelligence, Apr-29-2021, 07:28:48 GMT

Machine learning: What are membership inference attacks?

This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI. One of the wonders of machine learning is that it turns any kind of data into mathematical equations. Once you train a machine learning model on training examples--whether it's on images, audio, raw text, or tabular data--what you get is a set of numerical parameters. In most cases, the model no longer needs the training dataset and uses the tuned parameters to map new and unseen examples to categories or value predictions. You can then discard the training data and publish the model on GitHub or run it on your own servers without worrying about storing or distributing sensitive information contained in the training dataset.

inference attack, membership inference attack, training data, (13 more...)

Industry: Information Technology > Security & Privacy (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

#artificialintelligence, Apr-29-2021, 00:35:11 GMT

Membership inference attacks detect data used to train machine learning models

One of the wonders of machine learning is that it turns any kind of data into mathematical equations. Once you train a machine learning model on training examples--whether it's on images, audio, raw text, or tabular data--what you get is a set of numerical parameters. In most cases, the model no longer needs the training dataset and uses the tuned parameters to map new and unseen examples to categories or value predictions. You can then discard the training data and publish the model on GitHub or run it on your own servers without worrying about storing or distributing sensitive information contained in the training dataset. But a type of attack called "membership inference" makes it possible to detect the data used to train a machine learning model.
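The excerpt does not describe a specific attack, so the sketch below shows only the simplest well-known baseline: flag a record as a likely training member when the model's top predicted probability on it is unusually high, since models tend to be more confident on data they were trained on. The function name and the 0.9 threshold are assumptions chosen for illustration.

```python
def infer_membership(confidences, threshold=0.9):
    """Guess training-set membership from model confidence.

    confidences: one list of class probabilities per candidate record,
                 as returned by the target model.
    Returns one boolean per record: True means "probably seen in training".
    """
    return [max(probs) >= threshold for probs in confidences]
```

In practice the threshold would be calibrated, for example on records whose membership status is already known.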

inference attack, training data, training example, (12 more...)

Industry: Information Technology > Security & Privacy (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.64)

arXiv.org Artificial Intelligence, Apr-29-2021

Scalable Semi-supervised Landmark Localization for X-ray Images using Few-shot Deep Adaptive Graph

Zhou, Xiao-Yun, Lai, Bolin, Li, Weijian, Wang, Yirui, Zheng, Kang, Wang, Fakai, Lin, Chihung, Lu, Le, Huang, Lingyun, Han, Mei, Xie, Guotong, Xiao, Jing, Chang-Fu, Kuo, Harrison, Adam, Miao, Shun

Landmark localization plays an important role in medical image analysis. Learning-based methods, including CNN and GCN, have demonstrated state-of-the-art performance. However, most of these methods are fully supervised and rely heavily on manual labeling of a large training dataset. In this paper, based on a fully supervised graph-based method, DAG, we propose a semi-supervised extension of it, termed few-shot DAG, i.e., five-shot DAG. It first trains a DAG model on the labeled data and then fine-tunes the pre-trained model on the unlabeled data with a teacher-student SSL mechanism. In addition to the semi-supervised loss, we propose another loss using JS divergence to regulate the consistency of the intermediate feature maps. We extensively evaluated our method on pelvis, hand and chest landmark detection tasks. Our experimental results demonstrate consistent and significant improvements over previous methods.
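For reference, the Jensen-Shannon (JS) divergence mentioned above can be sketched for two discrete distributions as follows. How the paper turns intermediate feature maps into distributions is not stated in the abstract, so this shows only the divergence itself; the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Symmetric JS divergence: average KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Unlike KL divergence, this quantity is symmetric and bounded (by ln 2 with natural logarithms), which makes it a convenient consistency penalty between two predictions.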

dag, few-shot dag, student model, (12 more...)

2104.14629

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > New York > Monroe County > Rochester (0.04)
North America > United States > Maryland > Montgomery County > Bethesda (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.52)

#artificialintelligence, Apr-28-2021, 12:30:17 GMT

Introduction to Inductive Learning in Artificial Intelligence

Machine learning is one of the most important subfields of artificial intelligence. It has been viewed as a viable way of avoiding the knowledge bottleneck problem in developing knowledge-based systems. Inductive Learning, also known as Concept Learning, is how AI systems attempt to derive a generalized rule from observations. To generate a set of classification rules, Inductive Learning Algorithms (ILAs) are used. These generated rules are in the "If this then that" format.
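As a rough illustration of inducing "if this then that" rules from examples, the sketch below uses the classic OneR heuristic: for each attribute, map every value to its most frequent class, then keep the attribute whose rules make the fewest training errors. This is a simpler stand-in, not the algorithm the article goes on to describe, and the function name and sample data are made up for the example.

```python
from collections import Counter, defaultdict


def one_r(rows, target):
    """Return (attribute, {value: predicted_class}, training_errors)."""
    best = None
    for attr in rows[0]:
        if attr == target:
            continue
        # Count class frequencies for each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # One rule per value: "if attr == value then majority class".
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


weather = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
attribute, rules, errors = one_r(weather, target="play")
```

On this toy data the learner picks `outlook`, which reads as two rules: if outlook is sunny then play, and if outlook is rainy then do not play.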

application, inductive learning, learning, (12 more...)

Industry:

Banking & Finance (0.52)
Education (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

arXiv.org Artificial Intelligence, Apr-28-2021

Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples

Assran, Mahmoud, Caron, Mathilde, Misra, Ishan, Bojanowski, Piotr, Joulin, Armand, Ballas, Nicolas, Rabbat, Michael

This paper proposes a novel method of learning by predicting view assignments with support samples (PAWS). The method trains a model to minimize a consistency loss, which ensures that different views of the same unlabeled instance are assigned similar pseudo-labels. The pseudo-labels are generated non-parametrically, by comparing the representations of the image views to those of a set of randomly sampled labeled images. The distance between the view representations and labeled representations is used to provide a weighting over class labels, which we interpret as a soft pseudo-label. By non-parametrically incorporating labeled samples in this way, PAWS extends the distance-metric loss used in self-supervised methods such as BYOL and SwAV to the semi-supervised setting. Despite the simplicity of the approach, PAWS outperforms other semi-supervised methods across architectures, setting a new state-of-the-art for a ResNet-50 on ImageNet trained with either 10% or 1% of the labels, reaching 75.5% and 66.5% top-1 respectively. PAWS requires 4x to 12x less training than the previous best methods.
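The non-parametric pseudo-labeling step described above can be sketched roughly as follows: compare a view's representation with each labeled support representation, turn the similarities into weights with a softmax, and sum the weights per class. Cosine similarity, the temperature `tau`, and the function names are assumptions made for this illustration; the abstract does not specify them.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def soft_pseudo_label(view, support, support_labels, num_classes, tau=0.1):
    """Similarity-weighted average of the support samples' class labels."""
    sims = [cosine(view, s) / tau for s in support]
    peak = max(sims)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in sims]
    total = sum(weights)
    label = [0.0] * num_classes
    for w, y in zip(weights, support_labels):
        label[y] += w / total
    return label
```

A consistency loss would then push the soft pseudo-labels of two views of the same unlabeled image toward each other.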

arxiv preprint arxiv, representation, resnet-50, (15 more...)

2104.13963

Country:

North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)

Genre: Research Report (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.65)

arXiv.org Artificial Intelligence, Apr-27-2021

Contrastive Spatial Reasoning on Multi-View Line Drawings

Xiang, Siyuan, Yang, Anbang, Xue, Yanfei, Yang, Yaoqing, Feng, Chen

State-of-the-art supervised deep networks were recently shown to perform puzzlingly poorly at spatial reasoning on multi-view line drawings in the SPARE3D dataset. To study the reason behind the low performance and to further our understanding of these tasks, we design controlled experiments on both input data and network designs. Guided by hindsight from these experimental results, we propose a simple contrastive learning approach along with other network modifications to improve the baseline performance. Our approach uses a self-supervised binary classification network to compare the line drawing differences between various views of any two similar 3D objects. It enables deep networks to effectively learn detail-sensitive yet view-invariant line drawing representations of 3D objects. Experiments show that our method could significantly increase the baseline performance in SPARE3D, while some popular self-supervised learning methods cannot.

artificial intelligence, inductive learning, machine learning, (18 more...)

2104.13433

Country:

North America > United States > New York (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)