AITopics

2210.17541

Country:

North America > Dominican Republic (0.04)
North America > United States > Oregon (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.46)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Pan, Shuaiqun, Méndez, Sergio J. Rodríguez, Taylor, Kerry

A Pipeline for Analysing Grant Applications

arXiv.org Artificial IntelligenceOct-30-2022

Data mining techniques can transform massive amounts of unstructured data into quantitative data that quickly reveal insights, trends, and patterns behind the original data. In this paper, a data mining model is applied to analyse the 2019 grant applications submitted to an Australian Government research funding agency to investigate whether grant schemes successfully identifies innovative project proposals, as intended. The grant applications are peer-reviewed research proposals that include specific ``innovation and creativity'' (IC) scores assigned by reviewers. In addition to predicting the IC score for each research proposal, we are particularly interested in understanding the vocabulary of innovative proposals. In order to solve this problem, various data mining models and feature encoding algorithms are studied and explored. As a result, we propose a model with the best performance, a Random Forest (RF) classifier over documents encoded with features denoting the presence or absence of unigrams. In specific, the unigram terms are encoded by a modified Term Frequency - Inverse Document Frequency (TF-IDF) algorithm, which only implements the IDF part of TF-IDF. Besides the proposed model, this paper also presents a rigorous experimental pipeline for analysing grant applications, and the experimental results prove its feasibility.

data mining, machine learning, natural language, (18 more...)

2210.16843

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Europe > Middle East > Malta > Port Region > Southern Harbour District > Valletta (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Government (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

arXiv.org Artificial IntelligenceOct-29-2022

STPrompt: Semantic-guided and Task-driven prompts for Effective Few-shot Classification

Weng, Jinta, Hu, Yue, Qiu, Jing, Huan, Heyan

The effectiveness of prompt learning has been demonstrated in different pre-trained language models. By formulating suitable template and choosing representative label mapping, prompt learning can be used as an efficient knowledge probe. However, finding suitable prompt in existing methods requires multiple experimental attempts or appropriate vector initialization on formulating suitable template and choosing representative label mapping, which it is more common in few-shot learning tasks. Motivating by PLM working process, we try to construct the prompt from task semantic perspective and thus propose the STPrompt -Semantic-guided and Task-driven Prompt model. Specifically, two novel prompts generated from the semantic dependency tree (Dep-prompt) and task-specific metadata description (Meta-prompt), are firstly constructed in a prompt augmented pool, and the proposed model would automatically select a suitable semantic prompt to motivating the prompt learning process. Our results show that the proposed model achieves the state-of-the-art performance in five different datasets of few-shot text classification tasks, which prove that more semantic and significant prompts could assume as a better knowledge proving tool.

machine learning, mapping, natural language, (20 more...)

2210.16489

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.34)

Liu, Ziwen, Grau-Bove, Josep, Orr, Scott Allan

BERT-Flow-VAE: A Weakly-supervised Model for Multi-Label Text Classification

arXiv.org Artificial IntelligenceOct-27-2022

Multi-label Text Classification (MLTC) is the task of categorizing documents into one or more topics. Considering the large volumes of data and varying domains of such tasks, fully supervised learning requires manually fully annotated datasets which is costly and time-consuming. In this paper, we propose BERT-Flow-VAE (BFV), a Weakly-Supervised Multi-Label Text Classification (WSMLTC) model that reduces the need for full supervision. This new model (1) produces BERT sentence embeddings and calibrates them using a flow model, (2) generates an initial topic-document matrix by averaging results of a seeded sparse topic model and a textual entailment model which only require surface name of topics and 4-6 seed words per topic, and (3) adopts a VAE framework to reconstruct the embeddings under the guidance of the topic-document matrix. Finally, (4) it uses the means produced by the encoder model in the VAE architecture as predictions for MLTC. Experimental results on 6 multi-label datasets show that BFV can substantially outperform other baseline WSMLTC models in key metrics and achieve approximately 84% performance of a fully-supervised model.

machine learning, natural language, text classification, (17 more...)

2210.15225

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.92)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceOct-26-2022

A Curriculum Learning Approach for Multi-domain Text Classification Using Keyword weight Ranking

Yuan, Zilin, Li, Yinghui, Li, Yangning, Xie, Rui, Wu, Wei, Zheng, Hai-Tao

Text classification is a very classic NLP task, but it has two prominent shortcomings: On the one hand, text classification is deeply domain-dependent. That is, a classifier trained on the corpus of one domain may not perform so well in another domain. On the other hand, text classification models require a lot of annotated data for training. However, for some domains, there may not exist enough annotated data. Therefore, it is valuable to investigate how to efficiently utilize text data from different domains to improve the performance of models in various domains. Some multi-domain text classification models are trained by adversarial training to extract shared features among all domains and the specific features of each domain. We noted that the distinctness of the domain-specific features is different, so in this paper, we propose to use a curriculum learning strategy based on keyword weight ranking to improve the performance of multi-domain text classification models. The experimental results on the Amazon review and FDU-MTL datasets show that our curriculum learning strategy effectively improves the performance of multi-domain text classification models based on adversarial learning and outperforms state-of-the-art methods.

classification, machine learning, natural language, (16 more...)

2210.15147

Country: Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.84)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Improving Imbalanced Text Classification with Dynamic Curriculum Learning

Zhang, Xulong, Wang, Jianzong, Cheng, Ning, Xiao, Jing

Recent advances in pre-trained language models have improved the performance for text classification tasks. However, little attention is paid to the priority scheduling strategy on the samples during training. Humans acquire knowledge gradually from easy to complex concepts, and the difficulty of the same material can also vary significantly in different learning stages. Inspired by this insights, we proposed a novel self-paced dynamic curriculum learning (SPDCL) method for imbalanced text classification, which evaluates the sample difficulty by both linguistic character and model capacity. Meanwhile, rather than using static curriculum learning as in the existing research, our SPDCL can reorder and resample training data by difficulty criterion with an adaptive from easy to hard pace. The extensive experiments on several classification tasks show the effectiveness of SPDCL strategy, especially for the imbalanced dataset.

difficulty criterion, machine learning, natural language, (18 more...)

2210.14724

Country: Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.92)

Dai, Xiang, Chalkidis, Ilias, Darkner, Sune, Elliott, Desmond

Revisiting Transformer-based Models for Long Document Classification

The recent literature in text classification is biased towards short text sequences (e.g., sentences or paragraphs). In real-world applications, multi-page multi-paragraph documents are common and they cannot be efficiently encoded by vanilla Transformer-based models. We compare different Transformer-based Long Document Classification (TrLDC) approaches that aim to mitigate the computational overhead of vanilla transformers to encode much longer text, namely sparse attention and hierarchical encoding methods. We examine several aspects of sparse attention (e.g., size of local attention window, use of global attention) and hierarchical (e.g., document splitting strategy) transformers on four document classification datasets covering different domains. We observe a clear benefit from being able to process longer text, and, based on our results, we derive practical advice of applying Transformer-based models on long document classification tasks.

large language model, machine learning, natural language, (18 more...)

2204.06683

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > New Finding (0.49)

Industry:

Health & Medicine (1.00)
Law (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mekala, Dheeraj, Dong, Chengyu, Shang, Jingbo

LOPS: Learning Order Inspired Pseudo-Label Selection for Weakly Supervised Text Classification

Weakly supervised text classification methods typically train a deep neural classifier based on pseudo-labels. The quality of pseudo-labels is crucial to final performance but they are inevitably noisy due to their heuristic nature, so selecting the correct ones has a huge potential for performance boost. One straightforward solution is to select samples based on the softmax probability scores in the neural classifier corresponding to their pseudo-labels. However, we show through our experiments that such solutions are ineffective and unstable due to the erroneously high-confidence predictions from poorly calibrated models. Recent studies on the memorization effects of deep neural models suggest that these models first memorize training samples with clean labels and then those with noisy labels. Inspired by this observation, we propose a novel pseudo-label selection method LOPS that takes learning order of samples into consideration. We hypothesize that the learning order reflects the probability of wrong annotation in terms of ranking, and therefore, propose to select the samples that are learnt earlier. LOPS can be viewed as a strong performance-boost plug-in to most of existing weakly-supervised text classification methods, as confirmed in extensive experiments on four real-world datasets.

classifier, machine learning, natural language, (19 more...)

2205.12528

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Florida (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(6 more...)

Genre: Research Report > Experimental Study (0.68)

Industry: Leisure & Entertainment > Sports > Soccer (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Leveraging QA Datasets to Improve Generative Data Augmentation

Mekala, Dheeraj, Vu, Tu, Schick, Timo, Shang, Jingbo

The ability of generative language models (GLMs) to generate text has improved considerably in the last few years, enabling their use for generative data augmentation. In this work, we propose CONDA, an approach to further improve GLMs' ability to generate synthetic data by reformulating data generation as context generation for a given question-answer (QA) pair and leveraging QA datasets for training context generators. Then, we cast downstream tasks into the same question answering format and adapt the fine-tuned context generators to the target task domain. Finally, we use the fine-tuned GLM to generate relevant contexts, which are in turn used as synthetic training data for their corresponding tasks. We perform extensive experiments on multiple classification datasets and demonstrate substantial improvements in performance for both few- and zero-shot settings. Our analysis reveals that QA datasets that require high-level reasoning abilities (e.g., abstractive and common-sense QA datasets) tend to give the best boost in performance in both few-shot and zero-shot settings.

large language model, natural language, question answering, (18 more...)

2205.12604

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(12 more...)

Genre: Research Report (0.64)

Industry:

Media > Film (0.93)
Education (0.88)
Leisure & Entertainment > Sports (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.54)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.47)

arXiv.org Artificial IntelligenceOct-24-2022

Generating Hierarchical Explanations on Text Classification Without Connecting Rules

Ju, Yiming, Zhang, Yuanzhe, Liu, Kang, Zhao, Jun

The opaqueness of deep NLP models has motivated the development of methods for interpreting how deep models predict. Recently, work has introduced hierarchical attribution, which produces a hierarchical clustering of words, along with an attribution score for each cluster. However, existing work on hierarchical attribution all follows the connecting rule, limiting the cluster to a continuous span in the input text. We argue that the connecting rule as an additional prior may undermine the ability to reflect the model decision process faithfully. To this end, we propose to generate hierarchical explanations without the connecting rule and introduce a framework for generating hierarchical clusters. Experimental results and further analysis show the effectiveness of the proposed method in providing high-quality explanations for reflecting model predicting process.

generating hierarchical explanation, natural language, text classification, (2 more...)

2210.1327

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.40)