AITopics | Text Classification

Text Classification

"A text classifier is an automated means of determining some metadata about a document. Text classifiers are used for such diverse needs as spam filtering, suggesting categories for indexing a document created in a content management system, or automatically sorting help desk requests."
– John Graham-Cumming, Naive Bayesian Text Classification. Dr. Dobb's. May 1 2005.
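
As an illustration of the kind of classifier the quote describes, here is a minimal sketch of a multinomial naive Bayes spam filter with add-one smoothing. It is not taken from the cited article; the class name `NaiveBayesClassifier`, the whitespace tokenizer, and the toy training sentences are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # label -> number of training docs
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocabulary = set()

    def train(self, text, label):
        self.doc_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocabulary.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # log P(label) + sum over words of log P(word | label)
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
classifier = NaiveBayesClassifier()
classifier.train("win free money now", "spam")
classifier.train("free prize claim now", "spam")
classifier.train("meeting agenda for monday", "ham")
classifier.train("please review the project report", "ham")

print(classifier.classify("claim your free money"))
print(classifier.classify("agenda for the project meeting"))
```

Working in log space avoids numerical underflow on longer documents, and the add-one smoothing keeps unseen words from driving a class probability to zero.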

Metadata-Induced Contrastive Learning for Zero-Shot Multi-Label Text Classification

Zhang, Yu, Shen, Zhihong, Wu, Chieh-Han, Xie, Boya, Hao, Junheng, Wang, Ye-Yi, Wang, Kuansan, Han, Jiawei

arXiv.org Artificial Intelligence | Mar-24-2022

Large-scale multi-label text classification (LMTC) aims to associate a document with its relevant labels from a large candidate set. Most existing LMTC approaches rely on massive human-annotated training data, which are often costly to obtain and suffer from a long-tailed label distribution (i.e., many labels occur only a few times in the training set). In this paper, we study LMTC under the zero-shot setting, which does not require any annotated documents with labels and relies only on label surface names and descriptions. To train a classifier that calculates the similarity score between a document and a label, we propose a novel metadata-induced contrastive learning (MICoL) method. Unlike previous text-based contrastive learning techniques, MICoL exploits document metadata (e.g., authors, venues, and references of research papers), which are widely available on the Web, to derive similar document-document pairs. Experimental results on two large-scale datasets show that: (1) MICoL significantly outperforms strong zero-shot text classification and contrastive learning baselines; (2) MICoL is on par with the state-of-the-art supervised metadata-aware LMTC method trained on 10K-200K labeled documents; and (3) MICoL tends to predict more infrequent labels than supervised methods, thus alleviating the deteriorated performance on long-tailed labels.
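
The idea of deriving similar document pairs from metadata can be sketched as follows. This is not the authors' implementation: the rule "two documents are similar if they share a reference", the InfoNCE-style loss, the toy corpus, and the function names `metadata_pairs` and `info_nce` are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import math
from itertools import combinations

# Hypothetical toy corpus: each document has a set of cited references
# and a fixed 2-d embedding (in practice an encoder would produce it).
documents = {
    "d1": {"references": {"r1", "r2"}, "embedding": [1.0, 0.0]},
    "d2": {"references": {"r2", "r3"}, "embedding": [0.9, 0.1]},
    "d3": {"references": {"r4"}, "embedding": [0.0, 1.0]},
}


def metadata_pairs(docs):
    """Return document pairs that share at least one reference (assumed rule)."""
    pairs = []
    for a, b in combinations(sorted(docs), 2):
        if docs[a]["references"] & docs[b]["references"]:
            pairs.append((a, b))
    return pairs


def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: negative log-probability of the positive candidate."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


pairs = metadata_pairs(documents)
emb = {name: doc["embedding"] for name, doc in documents.items()}
loss = info_nce(emb["d1"], emb["d2"], [emb["d3"]])
print(pairs, loss)
```

In this toy setup the loss is small because the metadata-linked pair already has similar embeddings; training an encoder would mean minimizing this quantity over many such pairs.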

large language model, machine learning, natural language, (3 more...)

arXiv: 2202.05932

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.80)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.80)

On Robust Prefix-Tuning for Text Classification

Yang, Zonghan, Liu, Yang

arXiv.org Machine Learning | Mar-19-2022

Recently, prefix-tuning has gained increasing attention as a parameter-efficient finetuning method for large-scale pretrained language models. The method keeps the pretrained models fixed and only updates the prefix token parameters for each downstream task. Despite being lightweight and modular, prefix-tuning still lacks robustness to textual adversarial attacks. However, most currently developed defense techniques require auxiliary model updates and storage, which inevitably undermines the modularity and low storage cost of prefix-tuning. In this work, we propose a robust prefix-tuning framework that preserves the efficiency and modularity of prefix-tuning. The core idea of our framework is to use the layerwise activations of the language model produced by correctly classified training data as the standard for additional prefix finetuning. During the test phase, an extra batch-level prefix is tuned for each batch and added to the original prefix for robustness enhancement. Extensive experiments on three text classification benchmarks show that our framework substantially improves robustness over several strong baselines against five textual attacks of different types while maintaining comparable accuracy on clean texts. We also interpret our robust prefix-tuning framework from the optimal control perspective and pose several directions for future research.

computational linguistic, conference paper, proceedings, (12 more...)

arXiv: 2203.10378

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
Asia > China > Hong Kong (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(18 more...)

Genre: Research Report (1.00)

Industry:

Government > Military (0.34)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Deep Learning for Technical Document Classification

Jiang, Shuo, Hu, Jie, Magee, Christopher L., Luo, Jianxi

arXiv.org Artificial Intelligence | Feb-19-2022

In large technology companies, the requirements for managing and organizing technical documents created by engineers and managers have increased dramatically in recent years, which has led to a higher demand for more scalable, accurate, and automated document classification. Prior studies have focused only on processing text for classification, whereas technical documents often contain multimodal information. To leverage multimodal information and improve classification performance, this paper presents a novel multimodal deep learning architecture, TechDoc, which utilizes three types of information: natural language texts, descriptive images within documents, and the associations among the documents. The architecture combines a convolutional neural network, a recurrent neural network, and a graph neural network through an integrated training process. We applied the architecture to a large multimodal technical document database and trained the model to classify documents based on the hierarchical International Patent Classification system. Our results show that TechDoc achieves greater classification accuracy than the unimodal methods and other state-of-the-art benchmarks. The trained model can potentially be scaled to millions of real-world multimodal technical documents, which is useful for data and knowledge management in large technology companies and organizations.

classification, machine learning, natural language, (20 more...)

doi: 10.1109/TEM.2022.3152216

arXiv: 2106.14269

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Singapore (0.05)
Asia > China > Shanghai > Shanghai (0.05)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (1.00)
Law > Intellectual Property & Technology Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Counterfactual Multi-Token Fairness in Text Classification

Lohia, Pranay

arXiv.org Artificial Intelligence | Feb-8-2022

Counterfactual token generation has been limited to perturbing only a single token in texts that are generally short, single sentences. These tokens are often associated with one of many sensitive attributes. With only limited counterfactuals generated, the goal of making machine learning classification models invariant to any sensitive attribute is constrained, and the formulation of Counterfactual Fairness is narrowed. In this paper, we overcome these limitations by solving root problems and opening bigger domains for understanding. We have curated a resource of sensitive tokens and their corresponding perturbation tokens, extending support beyond traditionally used sensitive attributes such as Age, Gender, and Race to Nationality, Disability, and Religion. The concept of Counterfactual Generation has been extended to multi-token support valid over all forms of texts and documents. We define the method of generating counterfactuals by perturbing multiple sensitive tokens as Counterfactual Multi-token Generation. The method has been conceptualized to showcase significant performance improvement over single-token methods and validated over multiple benchmark datasets. This improvement in counterfactual generation carries over to improved Counterfactual Multi-token Fairness.
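
A minimal sketch of multi-token counterfactual generation, under stated assumptions: the `PERTURBATIONS` table is a tiny hypothetical stand-in for the curated resource the abstract mentions, and the functions `multi_token_counterfactuals` and `is_counterfactually_invariant` are names invented here, not the paper's API. Every sensitive token in the text is substituted, and a classifier is then checked for identical predictions across the variants.

```python
from itertools import product

# Hypothetical perturbation table: sensitive token -> substitute tokens.
PERTURBATIONS = {
    "he": ["she"],
    "she": ["he"],
    "christian": ["muslim", "hindu"],
    "muslim": ["christian", "hindu"],
}


def multi_token_counterfactuals(text):
    """Return variants of `text` in which every sensitive token is substituted."""
    tokens = text.lower().split()
    if not any(tok in PERTURBATIONS for tok in tokens):
        return []  # nothing to perturb, so no counterfactual exists
    # Sensitive tokens offer their substitutes; other tokens stay unchanged.
    options = [PERTURBATIONS.get(tok, [tok]) for tok in tokens]
    return [" ".join(choice) for choice in product(*options)]


def is_counterfactually_invariant(predict, text):
    """True if `predict` gives the same label for the text and all its variants."""
    original = predict(text.lower())
    return all(predict(cf) == original for cf in multi_token_counterfactuals(text))


variants = multi_token_counterfactuals("He is a Christian doctor")
print(variants)
```

Here `predict` is any callable mapping a string to a label; the invariance check is one simple way to operationalize counterfactual fairness for a single input.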

corresponding perturbation, counterfactual, fairness, (9 more...)

arXiv: 2202.03792

Country:

Asia > South Korea > Seoul > Seoul (0.06)
North America > United States > Texas (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(6 more...)

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.42)

Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation

Wang, Jixuan, Wang, Kuan-Chieh, Rudzicz, Frank, Brudno, Michael

arXiv.org Artificial Intelligence | Jan-27-2022

Large pretrained language models (LMs) like BERT have improved performance in many disparate natural language processing (NLP) tasks. However, fine-tuning such models requires a large number of training examples for each target task. At the same time, many realistic NLP problems are "few shot", without a sufficiently large training set. In this work, we propose a novel conditional neural process-based approach for few-shot text classification that learns to transfer from other diverse tasks with rich annotation. Our key idea is to represent each task using gradient information from a base model and to train an adaptation network that modulates a text classifier conditioned on the task representation. While previous task-aware few-shot learners represent tasks by input encoding, our novel task representation is more powerful, as the gradient captures input-output relationships of a task. Experimental results show that our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta learning approaches on a collection of diverse few-shot tasks. We further conduct analyses and ablations to justify our design choices.
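
To make the notion of a gradient-based task representation concrete, here is a small sketch that uses a logistic-regression base model in place of a large language model. This substitution, the function name `task_representation`, and the toy support sets are assumptions for illustration; the adaptation network described in the abstract is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def task_representation(weights, support_set):
    """Mean gradient of the logistic loss with respect to the base model's weights.

    `support_set` is a list of (features, label) pairs with labels in {0, 1}.
    """
    grad = [0.0] * len(weights)
    for features, label in support_set:
        prediction = sigmoid(sum(w * x for w, x in zip(weights, features)))
        for i, x in enumerate(features):
            # d(loss)/d(w_i) for the logistic loss is (prediction - label) * x_i
            grad[i] += (prediction - label) * x
    return [g / len(support_set) for g in grad]


# Two hypothetical few-shot tasks over the same inputs with opposite labels.
base_weights = [0.0, 0.0]
task_a = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
task_b = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]

rep_a = task_representation(base_weights, task_a)
rep_b = task_representation(base_weights, task_b)
print(rep_a, rep_b)
```

The two tasks share identical inputs, so an input-only encoding could not tell them apart, while their gradient vectors point in opposite directions because the labels differ.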

base model, dataset, representation, (12 more...)

arXiv: 2201.11576

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Semantic Code Classification for Automated Machine Learning

Guseva, Polina, Drozdova, Anastasia, Denisenko, Natalia, Sapozhnikova, Daria, Pyaternev, Ivan, Scherbakova, Anna, Ustuzhanin, Andrey

arXiv.org Artificial Intelligence | Jan-25-2022

A range of applications for automatic machine learning need the generation process to be controllable. In this work, we propose a way to control the output via a sequence of simple actions called semantic code classes. Finally, we present a semantic code classification task and discuss methods for solving this problem on the Natural Language to Machine Learning (NL2ML) dataset.

kernel type, tf-idf integer 2, tf-idf numeric 0, (15 more...)

arXiv: 2201.11252

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Asia > South Korea (0.04)

Genre: Research Report (0.42)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.69)

Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation

Chen, Qianben, Zhang, Richong, Zheng, Yaowei, Mao, Yongyi

arXiv.org Artificial Intelligence | Jan-21-2022

Contrastive learning has achieved remarkable success in representation learning via self-supervision in unsupervised settings. However, effectively adapting contrastive learning to supervised learning tasks remains a challenge in practice. In this work, we introduce a dual contrastive learning (DualCL) framework that simultaneously learns the features of input samples and the parameters of classifiers in the same space. Specifically, DualCL regards the parameters of the classifiers as augmented samples associated with different labels and then exploits contrastive learning between the input samples and the augmented samples. Empirical studies on five benchmark text classification datasets and their low-resource versions demonstrate the improvement in classification accuracy and confirm DualCL's capability of learning discriminative representations.

contrastive learning, learning, representation, (12 more...)

arXiv: 2201.08702

Country:

Asia > China > Beijing > Beijing (0.04)
North America > Canada > Ontario > National Capital Region > Ottawa (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

TourBERT: A pretrained language model for the tourism industry

Arefieva, Veronika, Egger, Roman

arXiv.org Artificial Intelligence | Jan-19-2022

Tourism is one of the most important economic sectors in the world (Hollenhorst et al., 2014), and its services have many characteristics that distinguish them from other products. Services are not tangible and cannot be tested in advance, which is why the customer assumes an increased risk before starting the trip. The service is co-created together with the customer, so the customer is an active co-creator of the service. Services are subject to the uno-actu principle, which means they are produced at the same time as they are consumed, and they are considered bilateral, i.e. a reciprocal relationship between persons (Chehimi, 2014). In addition, tourism services are relatively expensive compared to everyday products and have an intercultural dimension.

The Bidirectional Encoder Representations from Transformers (BERT) is currently the most important and state-of-the-art natural language model (Tenney et al., 2019) since its launch in 2018 by Google. BERT Large, which is based on a Transformer architecture, is considered one of the most powerful language models with 24 layers, 16 attention heads, and 340 million parameters (Lan et al. 2019). BERT is a pretrained model and can be fine-tuned to perform numerous downstream tasks such as text classification, question answering, sentiment analysis, extractive summarization, named entity recognition, or sentence similarity (Egger, 2022). The model was pretrained on a huge English ...

evaluation, language model, tourbert, (14 more...)

doi: 10.13140/RG.2.2.18322.99525

arXiv: 2201.07449

Country:

Europe > Austria > Salzburg > Salzburg (0.05)
Oceania > New Zealand (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Germany > Hesse > Darmstadt Region > Wiesbaden (0.04)

Genre: Research Report (1.00)

Industry: Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.54)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
(2 more...)

Polarity and Subjectivity Detection with Multitask Learning and BERT Embedding

Satapathy, Ranjan, Pardeshi, Shweta, Cambria, Erik

arXiv.org Artificial Intelligence | Jan-14-2022

Multitask learning often helps improve the performance of related tasks, as these are often interdependent and perform better when solved in a joint framework. In this paper, we present a deep multitask learning framework that jointly performs polarity and subjectivity detection. We propose an attention-based multitask model for predicting polarity and subjectivity. The input sentences are transformed into vectors using pre-trained BERT and GloVe embeddings, and the results show that the BERT-embedding-based model works better than the GloVe-based model. We compare our approach with state-of-the-art models for both subjectivity and polarity classification in single-task and multitask frameworks. The proposed approach reports baseline performances for both polarity detection and subjectivity detection.

detection, proceedings, representation, (11 more...)

arXiv: 2201.05363

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.69)

MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information

Zhang, Yu, Garg, Shweta, Meng, Yu, Chen, Xiusi, Han, Jiawei

arXiv.org Artificial Intelligence | Jan-11-2022

We study the problem of weakly supervised text classification, which aims to classify text documents into a set of pre-defined categories given only category surface names and no annotated training documents. Most existing classifiers leverage textual information in each document. However, in many domains, documents are accompanied by various types of metadata (e.g., authors, venue, and year of a research paper). These metadata and their combinations may serve as strong category indicators in addition to textual contents. In this paper, we explore the potential of using metadata to help weakly supervised text classification. To be specific, we model the relationships between documents and metadata via a heterogeneous information network. To effectively capture higher-order structures in the network, we use motifs to describe metadata combinations. We propose a novel framework, named MotifClass, which (1) selects category-indicative motif instances, (2) retrieves and generates pseudo-labeled training samples based on category names and indicative motif instances, and (3) trains a text classifier using the pseudo training data. Extensive experiments on real-world datasets demonstrate that MotifClass outperforms existing weakly supervised text classification approaches. Further analysis shows the benefit of considering higher-order metadata information in our framework.

artificial intelligence, natural language, weakly supervised text classification, (2 more...)

arXiv: 2111.04022

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
