Discourse & Dialogue
Reviews: Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics
This paper provides an insightful analysis of the decision processes actually implemented by a trained recurrent network for sentiment classification, and uncovers simple line attractor dynamics. All reviewers agree that this is interesting and illuminating, and that this work is a good example of what can be done to open the black box of deep systems.
Review for NeurIPS paper: A Discrete Variational Recurrent Topic Model without the Reparametrization Trick
Summary and Contributions: In this paper, the authors use neural variational inference to construct a neural topic model with discrete random variables, and propose one model, VRTM, which combines RNNs and topic models. The exploration of combining RNNs and topic models is interesting and significant: it can help topic models handle sequential text and capture more textual information than the bag-of-words model that is prevalent in LDA-based topic models. Specifically, for thematic words, VRTM uses both the RNN and topic model predictions to generate the next word; for syntactic words, only the output of the RNN is used to predict the next word. During the generative process, a discrete topic assignment is attached to each thematic word, which benefits interpretability. To be specific, the authors first designed a reasonable generative model that applies different strategies for generating thematic and syntactic words with different inputs, i.e., a mixture of LDA and RNN predictions or just the output of the RNN.
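The thematic/syntactic switch described in this review can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of logits, and the choice to combine the two predictions by adding them before a softmax are all assumptions made for illustration, since the review only says that both predictions are used for thematic words and the RNN alone for syntactic words.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_distribution(rnn_logits, topic_logits, is_thematic):
    """Return a distribution over the vocabulary for the next word.

    Assumption: for a thematic word the RNN and topic-model scores are
    summed in logit space; for a syntactic word the topic scores are ignored.
    """
    if is_thematic:
        combined = [r + t for r, t in zip(rnn_logits, topic_logits)]
    else:
        combined = list(rnn_logits)
    return softmax(combined)


# Toy vocabulary of three words; the assigned topic favours word 2.
rnn = [1.0, 0.0, -1.0]
topic = [0.0, 0.0, 2.0]
syntactic = next_word_distribution(rnn, topic, is_thematic=False)
thematic = next_word_distribution(rnn, topic, is_thematic=True)
```

In this toy setting the topic scores raise the probability of word 2 only when the word is treated as thematic, which is the behaviour the review attributes to the model.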
Review for NeurIPS paper: A Discrete Variational Recurrent Topic Model without the Reparametrization Trick
Reviews are all on the accept side: 1 top 50% of accepted and 3 marginally above threshold. Only R4 (strong accept) intervened in the discussion. As the main reason for calling this paper borderline was limited novelty compared to [7], I had to carry out a detailed comparative rereading of this paper against [7]. In my opinion, this approach is very different from [7]. While the authors presented it as introducing only a small modeling difference from [7], this difference has a large impact on everything, in particular the resulting DNN architecture and the inference process.
STAR: Stepwise Task Augmentation and Relation Learning for Aspect Sentiment Quad Prediction
Lai, Wenna, Xie, Haoran, Xu, Guandong, Li, Qing
Aspect-based sentiment analysis (ABSA) aims to identify four sentiment elements, including aspect term, aspect category, opinion term, and sentiment polarity. These elements construct the complete picture of sentiments. The most challenging task, aspect sentiment quad prediction (ASQP), predicts these elements simultaneously, hindered by difficulties in accurately coupling different sentiment elements. A key challenge is insufficient annotated data that limits the capability of models in semantic understanding and reasoning about quad prediction. To address this, we propose stepwise task augmentation and relation learning (STAR), a strategy inspired by human reasoning. STAR constructs auxiliary data to learn quadruple relationships incrementally by augmenting with pairwise and overall relation tasks derived from training data. By encouraging the model to infer causal relationships among sentiment elements without requiring additional annotations, STAR effectively enhances quad prediction. Extensive experiments demonstrate the proposed STAR exhibits superior performance on four benchmark datasets.
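As a rough illustration of deriving pairwise relation tasks from existing quad annotations, the sketch below enumerates every pair of sentiment elements in one labeled quad and emits an auxiliary training example per pair. The dictionary layout, the task naming, and the example sentence are hypothetical; the abstract does not specify STAR's actual task templates, and the overall relation task is not covered here.

```python
from itertools import combinations

# The four sentiment elements named in the abstract.
ELEMENTS = ("aspect", "category", "opinion", "sentiment")


def pairwise_tasks(sentence, quad):
    """Build one auxiliary example per pair of elements in a quad.

    `quad` is a dict keyed by ELEMENTS. No new annotation is needed:
    every target is read off the existing quad label.
    """
    examples = []
    for a, b in combinations(ELEMENTS, 2):
        examples.append(
            {
                "task": f"{a}-{b}",
                "input": sentence,
                "target": (quad[a], quad[b]),
            }
        )
    return examples


# Hypothetical training instance.
quad = {
    "aspect": "pizza",
    "category": "food quality",
    "opinion": "delicious",
    "sentiment": "positive",
}
tasks = pairwise_tasks("The pizza was delicious.", quad)
```

Four elements yield six pairwise tasks per quad, so the auxiliary data grows without any additional labeling effort.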
Multi-View Attention Syntactic Enhanced Graph Convolutional Network for Aspect-based Sentiment Analysis
Huang, Xiang, Peng, Hao, Sun, Shuo, Hao, Zhifeng, Lin, Hui, Wang, Shuhai
Aspect-based Sentiment Analysis (ABSA) is the task aimed at predicting the sentiment polarity of aspect words within sentences. Recently, incorporating graph neural networks (GNNs) to capture additional syntactic structure information in the dependency tree derived from syntactic dependency parsing has been proven to be an effective paradigm for boosting ABSA. Despite GNNs enhancing model capability by fusing more types of information, most works only utilize a single topology view of the dependency tree or simply conflate different perspectives of information without distinction, which limits the model performance. To address these challenges, in this paper, we propose a new multi-view attention syntactic enhanced graph convolutional network (MASGCN) that weighs different syntactic information of views using attention mechanisms. Specifically, we first construct distance mask matrices from the dependency tree to obtain multiple subgraph views for GNNs. To aggregate features from different views, we propose a multi-view attention mechanism to calculate the attention weights of views. Furthermore, to incorporate more syntactic information, we fuse the dependency type information matrix into the adjacency matrices and present a structural entropy loss to learn the dependency type adjacency matrix. Comprehensive experiments on four benchmark datasets demonstrate that our model outperforms state-of-the-art methods. The codes and datasets are available at https://github.com/SELGroup/MASGCN.
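The step of building distance mask matrices from a dependency tree can be sketched as follows. This is an illustrative reading, not the MASGCN code: it assumes the tree is treated as an undirected graph, that distance means the number of edges on the path between two tokens, and that view k keeps token pairs at distance at most k. The function names and the example parse are hypothetical.

```python
from collections import deque


def tree_distances(num_tokens, edges):
    """All-pairs path lengths over an undirected dependency tree, via BFS."""
    adj = [[] for _ in range(num_tokens)]
    for head, dep in edges:
        adj[head].append(dep)
        adj[dep].append(head)
    dist = [[None] * num_tokens for _ in range(num_tokens)]
    for src in range(num_tokens):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[src][v] is None:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def distance_masks(dist, max_k):
    """One 0/1 mask per view: view k keeps pairs at tree distance <= k."""
    return [
        [[1 if d is not None and d <= k else 0 for d in row] for row in dist]
        for k in range(1, max_k + 1)
    ]


# "The food was great": tokens 0..3, with (head, dependent) edges
# food -> The, great -> food, great -> was (a hypothetical parse).
edges = [(1, 0), (3, 1), (3, 2)]
dist = tree_distances(4, edges)
masks = distance_masks(dist, max_k=3)
```

Each mask can then be applied to an adjacency or attention matrix to give a separate subgraph view, with wider views admitting more distant token pairs.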
Review for NeurIPS paper: Bidirectional Convolutional Poisson Gamma Dynamical Systems
Summary and Contributions: The paper presents a new hierarchical Bayesian model -- convolutional Poisson-Gamma Dynamical Systems (conv-PGDS) -- for generating the observed words in a document corpus. Globally, the model assumes there are K "topic filters", D_1, ... D_K, which are distributions over 3-grams from a finite-size vocabulary (size V). Each "topic" (indexed by k) has an appearance probability weight v_k > 0 for appearing in a document, and we define transition probability vectors \pi_k. Given this global structure, the model generates each document iid. To generate a document j, we use a Gamma dynamical system (with transitions \pi) to obtain a sequence of un-normalized membership "weight embeddings", w_j1 ... w_jT, one for each sentence (indexed by t). Each weight embedding vector w_jt indicates the relative weight of topic k across all words in the sentence t.
Comparative Approaches to Sentiment Analysis Using Datasets in Major European and Arabic Languages
Krasitskii, Mikhail, Kolesnikova, Olga, Hernandez, Liliana Chanona, Sidorov, Grigori, Gelbukh, Alexander
This study explores transformer-based models such as BERT, mBERT, and XLM-R for multilingual sentiment analysis across diverse linguistic structures. Key contributions include the identification of XLM-R's superior adaptability in morphologically complex languages, achieving accuracy levels above 88%. The work highlights fine-tuning strategies and emphasizes their significance for improving sentiment classification in underrepresented languages.
Multi-Modality Collaborative Learning for Sentiment Analysis
Wang, Shanmin, Liu, Chengguang, Liu, Qingshan
Multimodal sentiment analysis (MSA) identifies individuals' sentiment states in videos by integrating visual, audio, and text modalities. Despite progress in existing methods, the inherent modality heterogeneity limits the effective capture of interactive sentiment features across modalities. In this paper, by introducing a Multi-Modality Collaborative Learning (MMCL) framework, we facilitate cross-modal interactions and capture enhanced and complementary features from modality-common and modality-specific representations, respectively. Specifically, we design a parameter-free decoupling module and separate uni-modality into modality-common and modality-specific components through semantics assessment of cross-modal elements. For modality-specific representations, inspired by the act-reward mechanism in reinforcement learning, we design policy models to adaptively mine complementary sentiment features under the guidance of a joint reward. For modality-common representations, intra-modal attention is employed to highlight crucial components, playing enhanced roles among modalities. Experimental results, including superiority evaluations on four databases, effectiveness verification of each module, and assessment of complementary features, demonstrate that MMCL successfully learns collaborative features across modalities and significantly improves performance. The code is available at https://github.com/smwanghhh/MMCL.
Massively Multilingual Corpus of Sentiment Datasets and Multi-faceted Sentiment Classification Benchmark
Despite impressive advancements in multilingual corpora collection and model training, developing large-scale deployments of multilingual models still presents a significant challenge. This is particularly true for language tasks that are culture-dependent. One such example is the area of multilingual sentiment analysis, where affective markers can be subtle and deeply ensconced in culture. This work presents the most extensive open massively multilingual corpus of datasets for training sentiment models. The corpus consists of 79 manually selected datasets from over 350 datasets reported in the scientific literature based on strict quality criteria. The corpus covers 27 languages representing 6 language families.
Investigating the Impact of Language-Adaptive Fine-Tuning on Sentiment Analysis in Hausa Language Using AfriBERTa
Sani, Sani Abdullahi, Muhammad, Shamsuddeen Hassan, Jarvis, Devon
Sentiment analysis (SA) plays a vital role in Natural Language Processing (NLP) by identifying sentiments expressed in text. Although significant advances have been made in SA for widely spoken languages, low-resource languages such as Hausa face unique challenges, primarily due to a lack of digital resources. This study investigates the effectiveness of Language-Adaptive Fine-Tuning (LAFT) to improve SA performance in Hausa. We first curate a diverse, unlabeled corpus to expand the model's linguistic capabilities, followed by applying LAFT to adapt AfriBERTa specifically to the nuances of the Hausa language. The adapted model is then fine-tuned on the labeled NaijaSenti sentiment dataset to evaluate its performance. Our findings demonstrate that LAFT gives modest improvements, which may be attributed to the use of formal Hausa text rather than informal social media data. Nevertheless, the pre-trained AfriBERTa model significantly outperformed models not specifically trained on Hausa, highlighting the importance of using pre-trained models in low-resource contexts. This research emphasizes the necessity for diverse data sources to advance NLP applications for low-resource African languages. We published the code and the dataset to encourage further research and facilitate reproducibility in low-resource NLP here: https://github.com/Sani-Abdullahi-Sani/Natural-Language-Processing/blob/main/Sentiment%20Analysis%20for%20Low%20Resource%20African%20Languages