 Text Classification


STT: Soft Template Tuning for Few-Shot Adaptation

arXiv.org Artificial Intelligence

Prompt tuning has been an extremely effective tool to adapt a pre-trained model to downstream tasks. However, standard prompt-based methods mainly consider the case where sufficient data are available for the downstream tasks. It is still unclear whether the advantage can be transferred to the few-shot regime, where only limited data are available for each downstream task. Although some works have demonstrated the potential of prompt tuning under the few-shot setting, the mainstream approaches of searching for discrete prompts or tuning soft prompts with limited data remain very challenging. Through extensive empirical studies, we find that there is still a gap between prompt tuning and full fine-tuning for few-shot learning. To bridge the gap, we propose a new prompt-tuning framework, called Soft Template Tuning (STT). STT combines manual and auto prompts, and treats downstream classification tasks as a masked language modeling task. Comprehensive evaluation on different settings suggests STT can close the gap between fine-tuning and prompt-based methods without introducing additional parameters. Significantly, it can even outperform the time- and resource-consuming fine-tuning method on sentiment classification tasks.
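As a rough illustration of what "treating classification as a masked language modeling task" can look like, here is a minimal sketch. It is not the STT implementation: the template string, the label words, and `toy_scorer` are all hypothetical, and the learned soft-prompt vectors that STT combines with the manual template are omitted entirely. A real setup would replace `toy_scorer` with a masked language model that scores candidate words at the `[MASK]` position.

```python
from typing import Callable, Dict

# Hypothetical manual template and verbalizer (label words); not from the paper.
TEMPLATE = "{text} It was [MASK]."
VERBALIZER: Dict[str, str] = {"great": "positive", "terrible": "negative"}


def build_prompt(text: str) -> str:
    """Wrap the input in a cloze-style template containing one [MASK] slot."""
    return TEMPLATE.format(text=text)


def classify(text: str, mask_scorer: Callable[[str], Dict[str, float]]) -> str:
    """Pick the class whose label word scores highest at the [MASK] slot."""
    scores = mask_scorer(build_prompt(text))
    best_word = max(VERBALIZER, key=lambda w: scores.get(w, float("-inf")))
    return VERBALIZER[best_word]


def toy_scorer(prompt: str) -> Dict[str, float]:
    """Stand-in for a masked language model (assumption, for illustration only)."""
    liked = "loved" in prompt.lower()
    return {"great": 0.9 if liked else 0.1, "terrible": 0.1 if liked else 0.9}


if __name__ == "__main__":
    print(build_prompt("I loved this film."))
    print(classify("I loved this film.", toy_scorer))
```

The point of the sketch is only the framing: the classifier head is replaced by a comparison of label-word scores at the masked position, so no new output layer is introduced.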


Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation

arXiv.org Artificial Intelligence

Building models of natural language processing (NLP) is challenging in low-resource scenarios where only limited data are available. Optimization-based meta-learning algorithms achieve promising results in low-resource scenarios by adapting a well-generalized model initialization to handle new tasks. Nonetheless, these approaches suffer from the memorization overfitting issue, where the model tends to memorize the meta-training tasks while ignoring support sets when adapting to new tasks. To address this issue, we propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation. Specifically, we introduce a task-specific memory module to store support set information and construct an imitation module to force query sets to imitate the behaviors of some representative support-set samples stored in the memory. A theoretical analysis is provided to prove the effectiveness of our method, and empirical results also demonstrate that our method outperforms competitive baselines on both text classification and generation tasks.


Putting the Con in Context: Identifying Deceptive Actors in the Game of Mafia

arXiv.org Artificial Intelligence

While neural networks demonstrate a remarkable ability to model linguistic content, capturing contextual information related to a speaker's conversational role is an open area of research. In this work, we analyze the effect of speaker role on language use through the game of Mafia, in which participants are assigned either an honest or a deceptive role. In addition to building a framework to collect a dataset of Mafia game records, we demonstrate that there are differences in the language produced by players with different roles. We confirm that classification models are able to rank deceptive players as more suspicious than honest ones based only on their use of language. Furthermore, we show that training models on two auxiliary tasks outperforms a standard BERT-based text classification approach. We also present methods for using our trained models to identify features that distinguish between player roles, which could be used to assist players during the Mafia game.


Increasing Accuracy of Sentiment Classification Using Negation Handling

#artificialintelligence

The function for the negation handler is available at my GitHub repo. An example of the function output is shown below. 'Negation' is the main function being called on the tokenized sentence as shown. In the function, whenever a negation word (like 'not', "n't", 'non-', 'un-', etc.) is encountered, a set of cognitive synonyms, called a synset, is generated for the word next to the negation. These synsets are interlinked by conceptual-semantic and lexical relations in a lexical database called WordNet.
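The snippet does not show the function itself, so the following is only a sketch of the general idea under stated assumptions. It assumes the handler replaces a negation word plus the word after it with an antonym, and it uses a small hand-written table (`TOY_ANTONYMS`, hypothetical) in place of a real WordNet lookup. The names `handle_negation` and `NEGATION_WORDS` are also made up for this example and are not the author's.

```python
from typing import Dict, List

# Hypothetical stand-ins: a real handler would query WordNet synsets instead.
NEGATION_WORDS = {"not", "n't", "never", "no"}
TOY_ANTONYMS: Dict[str, str] = {
    "good": "bad",
    "happy": "unhappy",
    "like": "dislike",
}


def handle_negation(tokens: List[str]) -> List[str]:
    """Replace 'negation + word' with an antonym when the toy table has one."""
    result: List[str] = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        following = tokens[i + 1].lower() if i + 1 < len(tokens) else None
        if token.lower() in NEGATION_WORDS and following in TOY_ANTONYMS:
            # Collapse the two tokens into the single antonym.
            result.append(TOY_ANTONYMS[following])
            i += 2
        else:
            # No known antonym: keep the token unchanged.
            result.append(token)
            i += 1
    return result


if __name__ == "__main__":
    print(handle_negation(["I", "am", "not", "happy", "today"]))
```

When no antonym is known, the negation word is left in place, so downstream features still see it.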


SBERT vs. Data2vec on Text Classification

#artificialintelligence

I personally believe that all the fancy ML research and advanced AI algorithms have minimal value, if any, until they can be applied to real-life projects without asking users for an insane amount of resources and excessive domain knowledge. And Hugging Face builds the bridge. Hugging Face is home to thousands of pre-trained models, which have made great contributions to democratizing artificial intelligence through open source and open science. Today, I want to give you an end-to-end code demo that compares two of the most popular pre-trained models by conducting a multi-label text classification analysis. The first model is SentenceTransformers (SBERT).


Text Classification with Movie Reviews

#artificialintelligence

This notebook classifies movie reviews as positive or negative using the text of the review. This is an example of binary--or two-class--classification, an important and widely applicable kind of machine learning problem. We'll use the IMDB dataset that contains the text of 50,000 movie reviews from the Internet Movie Database. These are split into 25,000 reviews for training and 25,000 reviews for testing. The training and testing sets are balanced, meaning they contain an equal number of positive and negative reviews.


Multi Class Text Classification using Python and GridDB

#artificialintelligence

On the Internet, there are a lot of sources that provide enormous amounts of daily news. Further, the demand for information by users has been growing continuously, so it is important to classify the news in a way that lets users access the information they are interested in quickly and efficiently. Using this model, users would be able to identify news topics that go untracked, and/or make recommendations based on their prior interests. Thus, we aim to build models that take news headlines and short descriptions as inputs and produce news categories as outputs. The problem we will tackle is the classification of BBC News articles and their categories.


Mono vs Multilingual BERT for Hate Speech Detection and Text Classification: A Case Study in Marathi

arXiv.org Artificial Intelligence

Transformers are the most eminent architectures used for a vast range of Natural Language Processing tasks. These models are pre-trained over a large text corpus and are meant to deliver state-of-the-art results on tasks like text classification. In this work, we conduct a comparative study between monolingual and multilingual BERT models. We focus on the Marathi language and evaluate the models on datasets for hate speech detection, sentiment analysis and simple text classification in Marathi. We use standard multilingual models such as mBERT, indicBERT and xlm-RoBERTa and compare them with MahaBERT, MahaALBERT and MahaRoBERTa, the monolingual models for Marathi. We further show that Marathi monolingual models outperform the multilingual BERT variants on five different downstream fine-tuning experiments. We also evaluate sentence embeddings from these models by freezing the BERT encoder layers. We show that monolingual MahaBERT-based models provide rich representations as compared to sentence embeddings from their multilingual counterparts. However, we observe that these embeddings are not generic enough and do not work well on out-of-domain social media datasets. We consider two Marathi hate speech datasets, L3Cube-MahaHate and HASOC-2021; a Marathi sentiment classification dataset, L3Cube-MahaSent; and the Marathi Headline and Articles classification datasets.


Combining NLP and Machine Learning for Document Classification

#artificialintelligence

Text mining is a popular way to explore the text you have in documents. Text mining and NLP can help you discover different patterns in the text, from uncovering words or phrases that are commonly used to identifying patterns and linkages between different texts/documents. Building on this text mining work, you can use word clouds, time-series analysis, etc. to discover other aspects and patterns in the text. Check out my previous blog posts (post 1, post 2) on performing text mining on documents (manifestos from some of the political parties from the last two national government elections in Ireland). These two posts give you a simple indication of what is possible.


NLP with Transformers -- 1 (FINE TUNING BERT FOR TEXT CLASSIFICATION) !!!🚀🚀🚀

#artificialintelligence

BERT (Bidirectional Encoder Representations from Transformers) is a type of transformer introduced by Google that consists of only an encoder and no decoder. Finally, after following a similar approach on the test data, we perform our test evaluation using the Matthews correlation coefficient, which is highly recommended as a metric for classification problems. Voila!!! We have finally fine-tuned our BERT model for our use case. Complete implementation can be found here…..
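For readers unfamiliar with the Matthews correlation coefficient, the binary case is computed from the confusion-matrix counts as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below is a plain-Python illustration of that formula, not the code used in the post. The function name `binary_mcc` is made up for this example, and returning 0.0 when the denominator is zero is a common convention adopted here as an assumption.

```python
import math
from typing import Sequence


def binary_mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient for labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an undefined score is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    print(binary_mcc([1, 0, 1, 0], [1, 0, 1, 0]))
    print(binary_mcc([1, 1, 0, 0], [1, 0, 1, 0]))
```

The score ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right), and unlike plain accuracy it stays informative when the classes are imbalanced.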