AITopics

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.65)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.40)

Neural Information Processing Systems | Apr-6-2023, 12:28:34 GMT

Monte Carlo Methods for Maximum Margin Supervised Topic Models

An effective strategy to exploit the supervising side information for discovering predictive topic representations is to impose discriminative constraints induced by such information on the posterior distributions under a topic model. This strategy has been adopted by a number of supervised topic models, such as MedLDA, which employs max-margin posterior constraints. However, unlike likelihood-based supervised topic models, whose posterior inference can be carried out using Bayes' rule, the max-margin posterior constraints have made Monte Carlo methods infeasible or at least not directly applicable, thereby limiting the choice of inference algorithms to those based on variational approximation with strict mean field assumptions. In this paper, we develop two efficient Monte Carlo methods under much weaker assumptions for max-margin supervised topic models, based on an importance sampler and a collapsed Gibbs sampler, respectively, in a convex dual formulation. We report thorough experimental results that compare our approach favorably against existing alternatives in both accuracy and efficiency.

artificial intelligence, maximum margin supervised topic model, natural language

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Neural Information Processing Systems | Apr-6-2023, 12:22:52 GMT

Symmetric Correspondence Topic Models for Multilingual Text Analysis

Topic modeling is a widely used approach to analyzing large text collections. A small number of multilingual topic models have recently been explored to discover latent topics among parallel or comparable documents, such as in Wikipedia. Other topic models that were originally proposed for structured data are also applicable to multilingual documents. Correspondence Latent Dirichlet Allocation (CorrLDA) is one such model; however, it requires a pivot language to be specified in advance. We propose a new topic model, Symmetric Correspondence LDA (SymCorrLDA), that incorporates a hidden variable to control a pivot language, in an extension of CorrLDA.

multilingual text analysis, multilingual topic model, symmetric correspondence topic model

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Neural Information Processing Systems | Apr-6-2023, 12:07:23 GMT

Lexical and Hierarchical Topic Regression

Inspired by a two-level theory that unifies agenda setting and ideological framing, we propose supervised hierarchical latent Dirichlet allocation (SHLDA) which jointly captures documents' multi-level topic structure and their polar response variables. Our model extends the nested Chinese restaurant process to discover a tree-structured topic hierarchy and uses both per-topic hierarchical and per-word lexical regression parameters to model the response variables. Experiments in a political domain and on sentiment analysis tasks show that SHLDA improves predictive accuracy while adding a new dimension of insight into how topics under discussion are framed.

lexical and hierarchical topic regression, response variable

Country: Asia > Middle East > Jordan (0.13)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.71)

#artificialintelligence | Apr-4-2023, 10:07:27 GMT

How to Create a Sentiment Analysis Model From Scratch

Sentiment analysis, also known as opinion mining, is a natural language processing (NLP) technique that identifies the attitude behind a text. Its goal is to determine whether a given text expresses positive, negative, or neutral sentiment. Businesses widely use it to classify the sentiment of customer reviews automatically. Analyzing large volumes of reviews helps them gain valuable insight into customers' preferences.
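The three-way labeling described above can be illustrated with a minimal sketch. This is a hand-rolled lexicon baseline, not the model built in the article: the word lists and the `classify_sentiment` name are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicons, for illustration only (not from the article).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "broken"}


def classify_sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    # Lowercase and split into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Score = number of positive words minus number of negative words.
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in ["I love this, great product",
                   "Terrible and broken",
                   "It arrived on Tuesday"]:
        print(review, "->", classify_sentiment(review))
```

A trained classifier would normally replace the fixed word lists; the sketch only shows the positive/negative/neutral decision itself.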

dataset, sentiment, training and testing

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Mane, Swapnil, Khatavkar, Vaibhav

Polarity based Sarcasm Detection using Semigraph

arXiv.org Artificial Intelligence | Apr-3-2023

Sarcasm is an advanced linguistic expression often found on online platforms. Detecting it is a challenging natural language processing task, and it affects sentiment analysis. This article presents a method based on the semigraph, covering both semigraph construction and the sarcasm detection process. A variation of the semigraph is suggested in terms of the pattern-relatedness of the text document. The proposed method obtains the sarcastic and non-sarcastic polarity scores of a document using a semigraph. The sarcastic polarity score represents the likelihood that a document is sarcastic. Sarcasm is detected based on the polarity scoring model. The proposed model improves on the existing prior-art approach to sarcasm detection. On Amazon product reviews, the model achieved accuracy, recall, and F-measure of 0.87, 0.79, and 0.83, respectively.

artificial intelligence, machine learning, natural language

2304.01424

Country:

North America > United States > Virginia (0.04)
Asia > Singapore (0.04)
Asia > India > Maharashtra > Pune (0.04)

Genre: Research Report (1.00)

Industry: Retail (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.88)

Antypas, Dimosthenis, Preece, Alun, Camacho-Collados, Jose

Negativity Spreads Faster: A Large-Scale Multilingual Twitter Analysis on the Role of Sentiment in Political Communication

arXiv.org Artificial Intelligence | Apr-3-2023

Social media has become extremely influential in policy making in modern societies, especially in the western world, where platforms such as Twitter allow users to follow politicians, thus making citizens more involved in political discussion. In the same vein, politicians use Twitter to express their opinions, debate current topics with others, and promote their political agendas, aiming to influence voter behaviour. In this paper, we analyse tweets of politicians from three European countries and explore the virality of their tweets. Previous studies have shown that tweets conveying negative sentiment are likely to be retweeted more frequently. Utilising state-of-the-art pre-trained language models, we performed sentiment analysis on hundreds of thousands of tweets collected from members of parliament in Greece, Spain and the United Kingdom, including devolved administrations. We achieved this by systematically exploring and analysing the differences between influential and less popular tweets. Our analysis indicates that politicians' negatively charged tweets spread more widely, especially in more recent times, and highlights interesting differences between political parties as well as between politicians and the general population.

machine learning, natural language, tweet

doi: 10.1016/j.osnem.2023.100242

2202.00396

Country:

Europe > Greece (0.27)
Europe > United Kingdom > Northern Ireland (0.14)
North America > Haiti (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Industry:

Information Technology > Services (1.00)
Government > Voting & Elections (1.00)
Government > Regional Government (0.93)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Artificial Intelligence | Apr-2-2023

Words that Wound: The Impact of Biased Language on News Sentiment and Stock Market Index

Kim, Wonseong

This study investigates the impact of biased language, specifically 'Words that Wound,' on sentiment analysis in a dataset of 45,379 South Korean daily economic news articles. Using Word2Vec, cosine similarity, and an expanded lexicon, we analyzed the influence of these words on news titles' sentiment scores. Our findings reveal that incorporating biased language significantly amplifies sentiment scores' intensity, particularly negativity. The research examines the effect of heightened negativity in news titles on the KOSPI200 index using linear regression and sentiment analysis. Results indicate that the augmented sentiment lexicon (Sent1000), which includes the top 1,000 negative words with high cosine similarity to 'Crisis,' more effectively captures the impact of news sentiment on the stock market index than the original KNU sentiment lexicon (Sent0). The ARDL model and Impulse Response Function (IRF) analyses disclose that Sent1000 has a stronger and more persistent impact on KOSPI200 compared to Sent0. These findings emphasize the importance of understanding language's role in shaping market dynamics and investor sentiment, particularly the impact of negatively biased language on stock market indices. The study highlights the need for considering context and linguistic nuances when analyzing news content and its potential effects on public opinion and market dynamics.
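The lexicon-expansion step mentioned above (ranking words by cosine similarity to a seed word such as 'Crisis') can be sketched as follows. This is only an illustration under stated assumptions: the three-dimensional vectors stand in for real Word2Vec embeddings, and the names `cosine_similarity`, `top_k_similar`, and `toy_embeddings` are hypothetical, not taken from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_similar(seed, embeddings, k):
    """Return the k words whose vectors are most similar to the seed word's."""
    seed_vec = embeddings[seed]
    scored = [(word, cosine_similarity(seed_vec, vec))
              for word, vec in embeddings.items() if word != seed]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:k]]


# Made-up toy vectors; a real pipeline would load trained embeddings instead.
toy_embeddings = {
    "crisis": [1.0, 0.0, 0.0],
    "collapse": [0.9, 0.1, 0.0],
    "panic": [0.7, 0.7, 0.0],
    "growth": [-1.0, 0.0, 0.0],
}

if __name__ == "__main__":
    print(top_k_similar("crisis", toy_embeddings, k=2))  # ['collapse', 'panic']
```

In the study, the analogous ranking selects the top 1,000 negative words for Sent1000; the sketch uses k=2 only to keep the toy example readable.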

machine learning, natural language, sentiment

2304.00468

Country: Asia > South Korea (0.35)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Bounaama, Rabia, Abderrahim, Mohammed El Amine

Classifying COVID-19 Related Tweets for Fake News Detection and Sentiment Analysis with BERT-based Models

arXiv.org Artificial Intelligence | Apr-2-2023

This paper describes the participation of our team "techno" in the CERIST'22 shared tasks. We used the available dataset "task1.c", which relates to the COVID-19 pandemic. It comprises 4128 tweets for the sentiment analysis task and 8661 tweets for the fake news detection task. We combined natural language processing tools with one of the most renowned pre-trained language models, BERT (Bidirectional Encoder Representations from Transformers). The results show the efficacy of pre-trained language models: we attained an accuracy of 0.93 for the sentiment analysis task and 0.90 for the fake news detection task.

artificial intelligence, machine learning, natural language

2304.00636

Country:

Asia > Middle East > Saudi Arabia (0.16)
Africa > Middle East > Algeria > Tlemcen Province > Tlemcen (0.05)
North America > United States > California (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

arXiv.org Artificial Intelligence | Apr-1-2023

When Crowd Meets Persona: Creating a Large-Scale Open-Domain Persona Dialogue Corpus

Cho, Won Ik, Lee, Yoon Kyung, Bae, Seoyeon, Kim, Jihwan, Park, Sangah, Kim, Moosung, Hahn, Sowon, Kim, Nam Soo

Building a natural language dataset requires caution since word semantics is vulnerable to subtle text change or the definition of the annotated concept. Such a tendency can be seen in generative tasks like question-answering and dialogue generation and also in tasks that create a categorization-based corpus, like topic classification or sentiment analysis. Open-domain conversations involve two or more crowdworkers freely conversing about any topic, and collecting such data is particularly difficult for two reasons: 1) the dataset should be "crafted" rather than "obtained" due to privacy concerns, and 2) paid creation of such dialogues may differ from how crowdworkers behave in real-world settings. In this study, we tackle these issues when creating a large-scale open-domain persona dialogue corpus, where persona implies that the conversation is performed by several actors with a fixed persona and user-side workers from an unspecified crowd.

artificial intelligence, dialogue, natural language

2304.0035

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Texas (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report > New Finding (1.00)
Personal > Interview (0.93)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.46)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.34)