AITopics

2207.08408

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.34)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.34)

arXiv.org Artificial Intelligence, Jul-14-2022

Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation

Zhao, Yingxiu, Tian, Zhiliang, Yao, Huaxiu, Zheng, Yinhe, Lee, Dongkyu, Song, Yiping, Sun, Jian, Zhang, Nevin L.

Building models of natural language processing (NLP) is challenging in low-resource scenarios where only limited data are available. Optimization-based meta-learning algorithms achieve promising results in low-resource scenarios by adapting a well-generalized model initialization to handle new tasks. Nonetheless, these approaches suffer from the memorization overfitting issue, where the model tends to memorize the meta-training tasks while ignoring support sets when adapting to new tasks. To address this issue, we propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation. Specifically, we introduce a task-specific memory module to store support set information and construct an imitation module to force query sets to imitate the behaviors of some representative support-set samples stored in the memory. A theoretical analysis is provided to prove the effectiveness of our method, and empirical results also demonstrate that our method outperforms competitive baselines on both text classification and generation tasks.

information, memorization, module, (14 more...)

2203.1167

Country:

North America > United States > South Carolina (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.46)
Education (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Ibraheem, Samee, Zhou, Gaoyue, DeNero, John

Putting the Con in Context: Identifying Deceptive Actors in the Game of Mafia

arXiv.org Artificial Intelligence, Jul-5-2022

While neural networks demonstrate a remarkable ability to model linguistic content, capturing contextual information related to a speaker's conversational role is an open area of research. In this work, we analyze the effect of speaker role on language use through the game of Mafia, in which participants are assigned either an honest or a deceptive role. In addition to building a framework to collect a dataset of Mafia game records, we demonstrate that there are differences in the language produced by players with different roles. We confirm that classification models are able to rank deceptive players as more suspicious than honest ones based only on their use of language. Furthermore, we show that training models on two auxiliary tasks outperforms a standard BERT-based text classification approach. We also present methods for using our trained models to identify features that distinguish between player roles, which could be used to assist players during the Mafia game.

bystander, participant, utterance, (14 more...)

2207.02253

Country:

North America > United States > Hawaii (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.54)

#artificialintelligence, May-31-2022, 01:05:16 GMT

Increasing Accuracy of Sentiment Classification Using Negation Handling

The function for the negation handler is available at my GitHub repo. An example of the function output is shown below. 'Negation' is the main function, called on the tokenized sentence as shown. In the function, whenever a negation word (like 'not', "n't", 'non-', 'un-', etc.) is encountered, a set of cognitive synonyms called synsets is generated for the word next to the negation. These synsets are linked to each other by conceptual-semantic and lexical relations in a lexical database called WordNet.
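The idea of handling a negation word together with the word that follows it can be sketched roughly as below. This is not the author's 'Negation' function: the name `handle_negation`, the `NEGATION_WORDS` set, and the small `TOY_ANTONYMS` table are all assumptions made for illustration, and the table stands in for a real WordNet lookup so that the sketch runs without any downloads. It also assumes the handler swaps the negated word for an antonym, which the excerpt does not state explicitly.

```python
# Hypothetical stand-in for a WordNet antonym lookup; a real handler would
# query WordNet synsets instead of this hand-written table.
TOY_ANTONYMS = {"good": "bad", "happy": "unhappy", "like": "dislike"}

# Assumed list of negation tokens; the original article may use a different set.
NEGATION_WORDS = {"not", "n't", "never", "no"}


def handle_negation(tokens):
    """Replace each (negation word, next word) pair with an antonym when one is known."""
    result = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if (
            token.lower() in NEGATION_WORDS
            and nxt is not None
            and nxt.lower() in TOY_ANTONYMS
        ):
            # Collapse "not happy" into "unhappy".
            result.append(TOY_ANTONYMS[nxt.lower()])
            i += 2  # skip the negation word and the word it negates
        else:
            # No known antonym: keep the token unchanged.
            result.append(token)
            i += 1
    return result


print(handle_negation(["I", "am", "not", "happy"]))
```

Pairs with no entry in the table are left untouched, so an unknown word after a negation keeps its original tokens rather than being guessed at.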

antonym, negation handling, sentiment classification, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.40)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.40)

#artificialintelligence, May-23-2022, 17:06:07 GMT

SBERT vs. Data2vec on Text Classification

I personally believe that all the fancy ML research and advanced AI algorithm work has minimal value, if any, until it can be applied to real-life projects without asking users for an insane amount of resources and excessive domain knowledge. And Hugging Face builds the bridge. Hugging Face is the home of thousands of pre-trained models, which have made great contributions to democratizing artificial intelligence through open source and open science. Today, I want to give you an end-to-end code demo comparing two of the most popular pre-trained models on a multi-label text classification task. The first model is SentenceTransformers (SBERT).

data2vec, pre-trained model, text classification, (11 more...)

Country: Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.76)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.63)

#artificialintelligence, Apr-27-2022, 14:37:20 GMT

Text Classification with Movie Reviews

This notebook classifies movie reviews as positive or negative using the text of the review. This is an example of binary--or two-class--classification, an important and widely applicable kind of machine learning problem. We'll use the IMDB dataset that contains the text of 50,000 movie reviews from the Internet Movie Database. These are split into 25,000 reviews for training and 25,000 reviews for testing. The training and testing sets are balanced, meaning they contain an equal number of positive and negative reviews.

accuracy, loss function, training data, (13 more...)

Industry:

Media > Film (0.94)
Leisure & Entertainment (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.42)

#artificialintelligence, Apr-21-2022, 02:43:23 GMT

Multi Class Text Classification using Python and GridDB

On the Internet, there are a lot of sources that provide enormous amounts of daily news. Further, the demand for information by users has been growing continuously, so it is important to classify the news in a way that lets users access the information they are interested in quickly and efficiently. Using this model, users would be able to identify news topics that go untracked, and/or make recommendations based on their prior interests. Thus, we aim to build models that take news headlines and short descriptions as inputs and produce news categories as outputs. The problem we will tackle is the classification of BBC News articles and their categories.

category, dataset, griddb, (12 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

Velankar, Abhishek, Patil, Hrushikesh, Joshi, Raviraj

Mono vs Multilingual BERT for Hate Speech Detection and Text Classification: A Case Study in Marathi

arXiv.org Artificial Intelligence, Apr-19-2022

Transformers are the most eminent architectures used for a vast range of Natural Language Processing tasks. These models are pre-trained over a large text corpus and are meant to serve state-of-the-art results over tasks like text classification. In this work, we conduct a comparative study between monolingual and multilingual BERT models. We focus on the Marathi language and evaluate the models on the datasets for hate speech detection, sentiment analysis and simple text classification in Marathi. We use standard multilingual models such as mBERT, indicBERT and xlm-RoBERTa and compare with MahaBERT, MahaALBERT and MahaRoBERTa, the monolingual models for Marathi. We further show that Marathi monolingual models outperform the multilingual BERT variants on five different downstream fine-tuning experiments. We also evaluate sentence embeddings from these models by freezing the BERT encoder layers. We show that monolingual MahaBERT based models provide rich representations as compared to sentence embeddings from multi-lingual counterparts. However, we observe that these embeddings are not generic enough and do not work well on out of domain social media datasets. We consider two Marathi hate speech datasets L3Cube-MahaHate, HASOC-2021, a Marathi sentiment classification dataset L3Cube-MahaSent, and Marathi Headline, Articles classification datasets.

artificial intelligence, natural language, text classification, (16 more...)

doi: 10.1007/978-3-031-20650-4_10

2204.08669

Country:

Europe > Italy > Tuscany > Florence (0.05)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Asia > India > Maharashtra (0.04)

Genre: Research Report (0.68)

Industry: Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)

#artificialintelligence, Apr-7-2022, 20:13:38 GMT

Combining NLP and Machine Learning for Document Classification

Text mining is a popular way to explore what text you have in documents, etc. Text mining and NLP can help you discover different patterns in the text, from uncovering certain words or phrases that are commonly used to identifying patterns and linkages between different texts/documents. Alongside this text mining work you can use word clouds, time-series analysis, etc. to discover other aspects and patterns in the text. Check out my previous blog posts (post 1, post 2) on performing text mining on documents (manifestos from some of the political parties from the last two national government elections in Ireland). These two posts give you a simple indication of what is possible.

classification, dataset, frequency, (10 more...)

Industry: Government (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.52)

#artificialintelligence, Apr-5-2022, 05:00:11 GMT

NLP with Transformers -- 1 (FINE TUNING BERT FOR TEXT CLASSIFICATION) !!!🚀🚀🚀

BERT (Bidirectional Encoder Representations from Transformers) is a type of transformer introduced by Google that consists of only an encoder and no decoder. Finally, after following a similar approach on the test data, we perform our test evaluation using the Matthews correlation coefficient, which is highly recommended as a metric for classification problems. Voila!!! We have finally fine-tuned our BERT model for our use case. Complete implementation can be found here…..
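For readers unfamiliar with the metric, here is a minimal sketch of the Matthews correlation coefficient for binary labels, computed from the confusion-matrix counts as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function name `mcc_binary` is an assumption for illustration and is not taken from the linked implementation; returning 0.0 when the denominator is zero is a common convention that is also assumed here.

```python
import math


def mcc_binary(y_true, y_pred):
    """Matthews correlation coefficient for binary labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an undefined score is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical labels, for illustration only.
print(mcc_binary([1, 1, 0, 0], [1, 0, 1, 0]))
```

The score ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right), and unlike plain accuracy it uses all four confusion-matrix cells, which is why it is often suggested for imbalanced classification data.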

classification, fine tuning bert, transformer, (2 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.40)