AITopics

2305.11442

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Russia (0.14)
Asia > Russia (0.14)
(14 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Media > Film (0.68)
Government (0.68)
Automobiles & Trucks > Manufacturer (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Wang, Yau-Shian, Chi, Ta-Chung, Zhang, Ruohong, Yang, Yiming

PESCO: Prompt-enhanced Self Contrastive Learning for Zero-shot Text Classification

arXiv.org Artificial Intelligence May-24-2023

We present PESCO, a novel contrastive learning framework that substantially improves the performance of zero-shot text classification. We formulate text classification as a neural text matching problem where each document is treated as a query, and the system learns the mapping from each query to the relevant class labels by (1) adding prompts to enhance label matching, and (2) using retrieved labels to enrich the training set in a self-training loop of contrastive learning. PESCO achieves state-of-the-art performance on four benchmark text classification datasets. On DBpedia, we achieve 98.5% accuracy without any labeled data, which is close to the fully-supervised result. Extensive experiments and analyses show all the components of PESCO are necessary for improving the performance of zero-shot text classification.
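The matching formulation in this abstract can be illustrated with a minimal sketch: each document is scored against a prompt-wrapped label description and assigned the best-matching label. This is not the PESCO implementation. A bag-of-words cosine similarity stands in for the neural encoder, there is no contrastive self-training, and the prompt wording, `LABEL_PROMPTS`, `bow`, `cosine`, and `classify` are all hypothetical names chosen for illustration.

```python
import math
from collections import Counter

def bow(text):
    """Lowercased bag-of-words vector (a toy stand-in for a neural text encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical label prompts; the wording is an assumption, not taken from the paper.
LABEL_PROMPTS = {
    "sports": "this text is about sports games teams players",
    "business": "this text is about business markets companies stocks",
}

def classify(document):
    """Treat the document as a query and return the best-matching label with all scores."""
    doc_vec = bow(document)
    scores = {label: cosine(doc_vec, bow(prompt)) for label, prompt in LABEL_PROMPTS.items()}
    return max(scores, key=scores.get), scores

label, scores = classify("The teams traded players before the games")
print(label, scores)
```

In a self-training setup of the kind the abstract describes, labels predicted this way would presumably be fed back as training signal; that loop is omitted here.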

classification, machine learning, natural language, (17 more...)

2305.14963

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > District of Columbia > Washington (0.04)
(12 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence May-24-2023

Perturbation-based Self-supervised Attention for Attention Bias in Text Classification

Feng, Huawen, Lin, Zhenxi, Ma, Qianli

In text classification, the traditional attention mechanisms usually focus too much on frequent words, and need extensive labeled data in order to learn. This paper proposes a perturbation-based self-supervised attention approach to guide attention learning without any annotation overhead. Specifically, we add as much noise as possible to all the words in the sentence without changing their semantics and predictions. We hypothesize that words that tolerate more noise are less significant, and we can use this information to refine the attention distribution. Experimental results on three text classification tasks show that our approach can significantly improve the performance of current attention-based models, and is more effective than existing self-supervised methods. We also provide a visualization analysis to verify the effectiveness of our approach.
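The hypothesis that words tolerating more noise are less significant can be sketched as a simple conversion from per-word noise tolerance to a target attention distribution. The mapping below (a softmax over negated tolerances) is one possible choice assumed for illustration, not necessarily the one used in the paper; `attention_targets` and `temperature` are hypothetical names, and the tolerance values are made up.

```python
import math

def attention_targets(noise_tolerance, temperature=1.0):
    """Map per-word noise tolerance to a distribution: higher tolerance gives lower weight."""
    if not noise_tolerance:
        return []
    # Negate so that words tolerating little noise receive large logits.
    logits = [-s / temperature for s in noise_tolerance]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up tolerances for a three-word sentence: the second word tolerates the least noise.
weights = attention_targets([0.9, 0.1, 0.5])
print(weights)
```

A distribution like this could then serve as a soft target when refining a model's learned attention weights.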

attention bias, perturbation-based self-supervised attention, text classification

doi: 10.1109/TASLP.2023.3302230

2305.15684

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.80)
Information Technology > Artificial Intelligence > Machine Learning (0.53)

Shome, Debaditya, Yadav, Kuldeep

EXnet: Efficient In-context Learning for Data-less Text classification

arXiv.org Artificial Intelligence May-23-2023

Large pre-trained language models (PLMs) have made significant progress in encoding world knowledge and spawned a new set of learning paradigms including zero-shot, few-shot, and in-context learning. Many language tasks can be modeled as a set of prompts (for example, is this text about geography?) and language models can provide binary answers, i.e., Yes or No. There is evidence to suggest that the next-word prediction used by many PLMs does not align well with zero-shot paradigms. Therefore, PLMs are fine-tuned as a question-answering system. In-context learning extends zero-shot learning by incorporating prompts and examples, resulting in increased task accuracy. Our paper presents EXnet, a model specifically designed to perform in-context learning without any limitations on the number of examples. We argue that in-context learning is an effective method to increase task accuracy, and providing examples facilitates cross-task generalization, especially when it comes to text classification tasks. With extensive experiments, we show that even our smallest model (15M parameters) generalizes to several unseen classification tasks and domains.

large language model, natural language, text classification, (16 more...)

2305.14622

Country: Asia > India (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.86)

Bugueño, Margarita, de Melo, Gerard

Connecting the Dots: What Graph-Based Text Representations Work Best for Text Classification using Graph Neural Networks?

arXiv.org Artificial Intelligence May-23-2023

Given the success of Graph Neural Networks (GNNs) for structure-aware machine learning, numerous studies have explored their application to text classification, as an alternative to traditional feature representation models. However, most studies considered just a specific domain and validated on data with particular characteristics. This work presents an extensive empirical investigation of graph-based text representation methods proposed for text classification, identifying practical implications and open challenges in the field. We compare several GNN architectures as well as BERT across five datasets, encompassing short and also long documents. The results show that: i) graph performance is highly related to the textual input features and domain, ii) despite its outstanding performance, BERT has difficulties converging when dealing with short texts, iii) graph methods are particularly beneficial for longer documents.

machine learning, natural language, text classification, (17 more...)

2305.14578

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
North America > United States > Oregon (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial Intelligence May-23-2023

Out-of-Distribution Generalization in Text Classification: Past, Present, and Future

Yang, Linyi, Song, Yaoxiao, Ren, Xuan, Lyu, Chenyang, Wang, Yidong, Liu, Lingqiao, Wang, Jindong, Foster, Jennifer, Zhang, Yue

Machine learning (ML) systems in natural language processing (NLP) face significant challenges in generalizing to out-of-distribution (OOD) data, where the test distribution differs from the training data distribution. This poses important questions about the robustness of NLP models and their high accuracy, which may be artificially inflated due to their underlying sensitivity to systematic biases. Despite these challenges, there is a lack of comprehensive surveys on the generalization challenge from an OOD perspective in text classification. Therefore, this paper aims to fill this gap by presenting the first comprehensive review of recent progress, methods, and evaluations on this topic. We further discuss the challenges involved and potential future research directions. By providing quick access to existing work, we hope this survey will encourage future research in this area.

large language model, machine learning, natural language, (18 more...)

2305.14104

Country:

Europe > Italy > Tuscany > Florence (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(8 more...)

Genre: Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(3 more...)

Chalkidis, Ilias, Kementchedjhieva, Yova

Retrieval-augmented Multi-label Text Classification

Multi-label text classification (MLC) is a challenging task in settings of large label sets, where label support follows a Zipfian distribution. In this paper, we address this problem through retrieval augmentation, aiming to improve the sample efficiency of classification models. Our approach closely follows the standard MLC architecture of a Transformer-based encoder paired with a set of classification heads. In our case, however, the input document representation is augmented through cross-attention to similar documents retrieved from the training set and represented in a task-specific manner. We evaluate this approach on four datasets from the legal and biomedical domains, all of which feature highly skewed label distributions. Our experiments show that retrieval augmentation substantially improves model performance on the long tail of infrequent labels, especially in lower-resource training scenarios and more challenging long-document data scenarios.

machine learning, natural language, text classification, (16 more...)

2305.13058

Country:

Europe > Italy > Tuscany > Florence (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Dominican Republic (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry:

Law (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Santi, Dr. Prabhat, Mishra, Kamakhya, Mohanty, Sibabrata

Quantum Text Classifier -- A Synchronistic Approach Towards Classical and Quantum Machine Learning

Although it will be a while before a practical quantum computer is available, there is no need to hold off. Methods and algorithms are being developed to demonstrate the feasibility of running machine learning (ML) pipelines in QC (Quantum Computing). There is a lot of ongoing work on general QML (Quantum Machine Learning) algorithms and applications. However, a working model or pipeline for a text classifier using quantum algorithms isn't available. This paper introduces quantum machine learning with respect to text classification to readers of classical machine learning. It begins with a brief description of quantum computing and basic quantum algorithms, with an emphasis on building text classification pipelines. A new approach is introduced to implement an end-to-end text classification framework (Quantum Text Classifier - QTC), where pre- and post-processing of data is performed on a classical computer, and text classification is performed using the QML algorithm. This paper also presents an implementation of the QTC framework and available quantum ML algorithms for text classification using the IBM Qiskit library and IBM backends.

classifier, machine learning, natural language, (16 more...)

2305.12783

Genre: Research Report (0.64)

Industry: Information Technology (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)

Improving Classifier Robustness through Active Generation of Pairwise Counterfactuals

Balashankar, Ananth, Wang, Xuezhi, Qin, Yao, Packer, Ben, Thain, Nithum, Chen, Jilin, Chi, Ed H., Beutel, Alex

Counterfactual Data Augmentation (CDA) is a commonly used technique for improving robustness in natural language classifiers. However, one fundamental challenge is how to discover meaningful counterfactuals and efficiently label them, with minimal human labeling cost. Most existing methods either completely rely on human-annotated labels, an expensive process which limits the scale of counterfactual data, or implicitly assume label invariance, which may mislead the model with incorrect labels. In this paper, we present a novel framework that utilizes counterfactual generative models to generate a large number of diverse counterfactuals by actively sampling from regions of uncertainty, and then automatically label them with a learned pairwise classifier. Our key insight is that we can more correctly label the generated counterfactuals by training a pairwise classifier that interpolates the relationship between the original example and the counterfactual. We demonstrate that with a small amount of human-annotated counterfactual data (10%), we can generate a counterfactual augmentation dataset with learned labels, that provides an 18-20% improvement in robustness and a 14-21% reduction in errors on 6 out-of-domain datasets, comparable to that of a fully human-annotated counterfactual dataset for both sentiment classification and question paraphrase tasks.

machine learning, natural language, text classification, (18 more...)

2305.13535

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.34)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.34)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.34)

A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches

Wang, Zihan, Wang, Tianle, Mekala, Dheeraj, Shang, Jingbo

Extremely Weakly Supervised Text Classification (XWS-TC) refers to text classification based on minimal high-level human guidance, such as a few label-indicative seed words or classification instructions. There are two mainstream approaches for XWS-TC, which have, however, never been rigorously compared: (1) training classifiers based on pseudo-labels generated by (softly) matching seed words (SEED) and (2) prompting (and calibrating) language models using classification instruction (and raw texts) to decode label words (PROMPT). This paper presents the first XWS-TC benchmark to compare the two approaches on fair grounds, where the datasets, supervisions, and hyperparameter choices are standardized across methods. Our benchmarking results suggest that (1) both SEED and PROMPT approaches are competitive and there is no clear winner; (2) SEED is empirically more tolerant than PROMPT to human guidance (e.g., seed words, classification instructions, and label words) changes; (3) SEED is empirically more selective than PROMPT to the pre-trained language models; (4) recent SEED and PROMPT methods have close connections and a clustering post-processing step based on raw in-domain texts is a strong performance booster to both. We hope this benchmark serves as a guideline in selecting XWS-TC methods in different scenarios and stimulates interest in developing guidance- and model-robust XWS-TC methods. We release the repo at https://github.com/ZihanWangKi/x-TC.
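The SEED family described in this abstract starts from pseudo-labels produced by matching seed words. A minimal sketch of that first step, assuming exact (hard) token matching rather than the soft matching the abstract mentions, might look as follows. `SEED_WORDS` and `pseudo_label` are hypothetical names, the seed lists are invented for illustration, and nothing here is taken from the released repository.

```python
# Hypothetical seed words per class; a real setup would supply these as human guidance.
SEED_WORDS = {
    "sports": {"game", "team", "score"},
    "politics": {"election", "senate", "vote"},
}

def pseudo_label(text):
    """Return the class whose seed words overlap the text most, or None if nothing matches."""
    tokens = set(text.lower().split())
    counts = {label: len(tokens & seeds) for label, seeds in SEED_WORDS.items()}
    # On ties, max() keeps the first label in dictionary order.
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None

print(pseudo_label("The team won the game"))
print(pseudo_label("Nothing relevant here"))
```

Documents that receive a pseudo-label would then be used to train a classifier, while unmatched documents (returned as `None`) stay unlabeled.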

large language model, machine learning, natural language, (19 more...)

2305.12749

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > Dominican Republic (0.04)
(15 more...)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)