AITopics

2410.1813

Country:

Asia > China > Hunan Province > Changsha (0.05)
North America > United States > New York (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Mylonas, Nikolaos, Stylianou, Nikolaos, Tsikrika, Theodora, Vrochidis, Stefanos, Kompatsiaris, Ioannis

A Multi-Task Text Classification Pipeline with Natural Language Explanations: A User-Centric Evaluation in Sentiment Analysis and Offensive Language Identification in Greek Tweets

arXiv.org Artificial IntelligenceOct-14-2024

Interpretability is a topic that has been in the spotlight for the past few years. Most existing interpretability techniques produce interpretations in the form of rules or feature importance. These interpretations, while informative, may be harder to understand for non-expert users and therefore, cannot always be considered as adequate explanations. To that end, explanations in natural language are often preferred, as they are easier to comprehend and also more presentable to end-users. This work introduces an early concept for a novel pipeline that can be used in text classification tasks, offering predictions and explanations in natural language. It comprises of two models: a classifier for labelling the text and an explanation generator which provides the explanation. The proposed pipeline can be adopted by any text classification task, given that ground truth rationales are available to train the explanation generator. Our experiments are centred around the tasks of sentiment analysis and offensive language identification in Greek tweets, using a Greek Large Language Model (LLM) to obtain the necessary explanations that can act as rationales. The experimental evaluation was performed through a user study based on three different metrics and achieved promising results for both datasets.

dataset, explanation, sentiment, (15 more...)

2410.1029

Country:

Europe > Greece (0.05)
North America > United States (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.83)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(2 more...)

Gan, Chengguang, Mori, Tatsunori

Empirical Study of Mutual Reinforcement Effect and Application in Few-shot Text Classification Tasks via Prompt

arXiv.org Artificial IntelligenceOct-13-2024

The Mutual Reinforcement Effect (MRE) investigates the synergistic relationship between word-level and text-level classifications in text classification tasks. It posits that the performance of both classification levels can be mutually enhanced. However, this mechanism has not been adequately demonstrated or explained in prior research. To address this gap, we employ empirical experiment to observe and substantiate the MRE theory. Our experiments on 21 MRE mix datasets revealed the presence of MRE in the model and its impact. Specifically, we conducted compare experiments use fine-tune. The results of findings from comparison experiments corroborates the existence of MRE. Furthermore, we extended the application of MRE to prompt learning, utilizing word-level information as a verbalizer to bolster the model's prediction of text-level classification labels. In our final experiment, the F1-score significantly surpassed the baseline in 18 out of 21 MRE Mix datasets, further validating the notion that word-level information enhances the language model's comprehension of the text as a whole.

large language model, machine learning, natural language, (19 more...)

2410.09745

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.89)

Rizvi, Syed Mustafa Haider, Imran, Ramsha, Mahmood, Arif

Text Classification using Graph Convolutional Networks: A Comprehensive Survey

arXiv.org Artificial IntelligenceOct-12-2024

Text classification is a quintessential and practical problem in natural language processing with applications in diverse domains such as sentiment analysis, fake news detection, medical diagnosis, and document classification. A sizable body of recent works exists where researchers have studied and tackled text classification from different angles with varying degrees of success. Graph convolution network (GCN)-based approaches have gained a lot of traction in this domain over the last decade with many implementations achieving state-of-the-art performance in more recent literature and thus, warranting the need for an updated survey. This work aims to summarize and categorize various GCN-based Text Classification approaches with regard to the architecture and mode of supervision. It identifies their strengths and limitations and compares their performance on various benchmark datasets. We also discuss future research directions and the challenges that exist in this domain.

machine learning, natural language, text classification, (13 more...)

2410.09399

Country:

Asia > China (0.14)
Asia > Pakistan > Punjab > Lahore Division > Lahore (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)
Research Report > Promising Solution (0.45)

Industry:

Information Technology > Security & Privacy (1.00)
Information Technology > Services (0.92)
Health & Medicine > Therapeutic Area (0.67)
Health & Medicine > Diagnostic Medicine (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Neural Information Processing SystemsOct-11-2024, 10:29:21 GMT

Character-level Convolutional Networks for Text Classification

This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.

character-level convolutional network, text classification

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Plaud, Roman, Labeau, Matthieu, Saillenfest, Antoine, Bonald, Thomas

Revisiting Hierarchical Text Classification: Inference and Metrics

arXiv.org Artificial IntelligenceOct-11-2024

Hierarchical text classification (HTC) is the task of assigning labels to a text within a structured space organized as a hierarchy. Recent works treat HTC as a conventional multilabel classification problem, therefore evaluating it as such. We instead propose to evaluate models based on specifically designed hierarchical metrics and we demonstrate the intricacy of metric choice and prediction inference method. We introduce a new challenging dataset and we evaluate fairly, recent sophisticated models, comparing them with a range of simple but strong baselines, including a new theoretically motivated loss. Finally, we show that those baselines are very often competitive with the latest models. This highlights the importance of carefully considering the evaluation methodology when proposing new methods for HTC. Code implementation and dataset are available at \url{https://github.com/RomanPlaud/revisitingHTC}.

machine learning, natural language, text classification, (17 more...)

2410.01305

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
Europe > Austria > Vienna (0.04)
(15 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)
Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Neural Information Processing SystemsOct-10-2024, 18:15:12 GMT

Public Wisdom Matters! Discourse-Aware Hyperbolic Fourier Co-Attention for Social Text Classification

Social media has become the fulcrum of all forms of communication. Classifying social texts such as fake news, rumour, sarcasm, etc. has gained significant attention. The surface-level signals expressed by a social-text itself may not be adequate for such tasks; therefore, recent methods attempted to incorporate other intrinsic signals such as user behavior and the underlying graph structure. Oftentimes, the public wisdom expressed through the comments/replies to a social-text acts as a surrogate of crowd-sourced view and may provide us with complementary signals. State-of-the-art methods on social-text classification tend to ignore such a rich hierarchical signal.

discourse-aware hyperbolic fourier co-attention, public wisdom matter, social text classification, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.71)
Information Technology > Communications > Social Media (0.61)

Neural Information Processing SystemsOct-10-2024, 14:59:23 GMT

AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification

Extreme multi-label text classification (XMTC) is an important problem in the era of {\it big data}, for tagging a given text with the most relevant multiple labels from an extremely large-scale label set. XMTC can be found in many applications, such as item categorization, web page tagging, and news annotation. Traditionally most methods used bag-of-words (BOW) as inputs, ignoring word context as well as deep semantic information. Recent attempts to overcome the problems of BOW by deep learning still suffer from 1) failing to capture the important subtext for each label and 2) lack of scalability against the huge number of labels. We propose a new label tree-based deep learning model for XMTC, called AttentionXML, with two unique features: 1) a multi-label attention mechanism with raw text as input, which allows to capture the most relevant part of text to each label; and 2) a shallow and wide probabilistic label tree (PLT), which allows to handle millions of labels, especially for "tail labels".

attentionxml, high-performance extreme multi-label text classification, label tree-based attention-aware deep model, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.64)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.61)

Neural Information Processing SystemsOct-10-2024, 03:20:31 GMT

SADGA: Structure-Aware Dual Graph Aggregation Network for Text-to-SQL

The Text-to-SQL task, aiming to translate the natural language of the questions into SQL queries, has drawn much attention recently. One of the most challenging problems of Text-to-SQL is how to generalize the trained model to the unseen database schemas, also known as the cross-domain Text-to-SQL task. The key lies in the generalizability of (i) the encoding method to model the question and the database schema and (ii) the question-schema linking method to learn the mapping between words in the question and tables/columns in the database schema. Focusing on the above two key issues, we propose a \emph{Structure-Aware Dual Graph Aggregation Network} (SADGA) for cross-domain Text-to-SQL. In SADGA, we adopt the graph structure to provide a unified encoding model for both the natural language question and database schema.

database schema, structure-aware dual graph aggregation network, text-to-sql, (5 more...)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.40)

Neural Information Processing SystemsOct-10-2024, 02:27:22 GMT

Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

Extreme multi-label text classification (XMC) seeks to find relevant labels from an extreme large label collection for a given text input. Many real-world applications can be formulated as XMC problems, such as recommendation systems, document tagging and semantic search. Recently, transformer based XMC methods, such as X-Transformer and LightXML, have shown significant improvement over other XMC methods. Despite leveraging pre-trained transformer models for text representation, the fine-tuning procedure of transformer models on large label space still has lengthy computational time even with powerful GPUs. In this paper, we propose a novel recursive approach, XR-Transformer to accelerate the procedure through recursively fine-tuning transformer models on a series of multi-resolution objectives related to the original XMC objective function.

extreme multi-label text classification, fast multi-resolution transformer fine-tuning, transformer model, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.65)