AITopics

2307.13631

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
(15 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
(9 more...)

arXiv.org Artificial Intelligence Jul-25-2023

Unlocking the Emotional World of Visual Media: An Overview of the Science, Research, and Impact of Understanding Emotion

Wang, James Z., Zhao, Sicheng, Wu, Chenyan, Adams, Reginald B., Newman, Michelle G., Shafir, Tal, Tsachor, Rachelle

The emergence of artificial emotional intelligence technology is revolutionizing the fields of computers and robotics, allowing for a new level of communication and understanding of human behavior that was once thought impossible. While recent advancements in deep learning have transformed the field of computer vision, automated understanding of evoked or expressed emotions in visual media remains in its infancy. This foundering stems from the absence of a universally accepted definition of "emotion", coupled with the inherently subjective nature of emotions and their intricate nuances. In this article, we provide a comprehensive, multidisciplinary overview of the field of emotion analysis in visual media, drawing on insights from psychology, engineering, and the arts. We begin by exploring the psychological foundations of emotion and the computational principles that underpin the understanding of emotions from images and videos. We then review the latest research and systems within the field, accentuating the most promising approaches. We also discuss the current technological challenges and limitations of emotion analysis, underscoring the necessity for continued investigation and innovation. We contend that this represents a "Holy Grail" research problem in computing and delineate pivotal directions for future inquiry. Finally, we examine the ethical ramifications of emotion-understanding technologies and contemplate their potential societal impacts. Overall, this article endeavors to equip readers with a deeper understanding of the domain of emotion analysis in visual media and to inspire further research and development in this captivating and rapidly evolving field.

data mining, machine learning, pattern analysis and machine intelligence, (25 more...)

doi: 10.1109/JPROC.2023.3273517

2307.13463

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.13)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(22 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Promising Solution (0.65)

Industry:

Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
(4 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Data Science > Data Mining (1.00)
(10 more...)

Hagerer, Gerhard Johann, Szabo, David, Koch, Andreas, Dominguez, Maria Luisa Ripoll, Widmer, Christian, Wich, Maximilian, Danner, Hannah, Groh, Georg

End-to-End Annotator Bias Approximation on Crowdsourced Single-Label Sentiment Analysis

arXiv.org Artificial Intelligence Jul-24-2023

Sentiment analysis is often a crowdsourcing task prone to subjective labels given by many annotators. It is not yet fully understood how the annotation bias of each annotator can be modeled correctly with state-of-the-art methods. However, resolving annotator bias precisely and reliably is key to understanding annotators' labeling behavior and to successfully resolving the corresponding individual misconceptions and wrongdoings regarding the annotation task. Our contribution is an explanation and improvement for precise neural end-to-end bias modeling and ground truth estimation, which reduces an undesired mismatch in that regard in the existing state of the art. Classification experiments show that it has the potential to improve accuracy in cases where each sample is annotated by only a single annotator. We provide the whole source code publicly and release our own domain-specific sentiment dataset containing 10,000 sentences discussing organic food products. These are crawled from social media and are singly labeled by 10 non-expert annotators.

annotator, machine learning, natural language, (19 more...)

2111.02326

Country:

North America > United States > California > San Diego County > San Diego (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(7 more...)

Genre: Research Report (0.84)

Industry:

Health & Medicine (0.49)
Food & Agriculture (0.46)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.71)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Zhou, Yongxin, Ringeval, Fabien, Portet, François

Evaluating Emotional Nuances in Dialogue Summarization

Automatic dialogue summarization is a well-established task that aims to identify the most important content from human conversations to create a short textual summary. Despite recent progress in the field, we show that most of the research has focused on summarizing the factual information, leaving aside the affective content, which can nevertheless convey useful information to analyse, monitor, or support human interactions. In this paper, we propose and evaluate a set of measures, PEmo, to quantify how much emotion is preserved in dialogue summaries. Results show that state-of-the-art summarization models do not preserve the emotional content well in the summaries. We also show that by reducing the training set to only emotional dialogues, the emotional content is better preserved in the generated summaries, while conserving the most salient factual information.

computational linguistic, machine learning, natural language, (15 more...)

2307.12371

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Maine > Kennebec County > Waterville (0.05)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
(10 more...)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.47)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.47)

Kheiri, Kiana, Karimi, Hamid

SentimentGPT: Exploiting GPT for Advanced Sentiment Analysis and its Departure from Current Machine Learning

This study presents a thorough examination of various Generative Pretrained Transformer (GPT) methodologies in sentiment analysis, specifically in the context of Task 4 on the SemEval 2017 dataset. Three primary strategies are employed: 1) prompt engineering using the advanced GPT-3.5 Turbo, 2) fine-tuning GPT models, and 3) an inventive approach to embedding classification. The research yields detailed comparative insights among these strategies and individual GPT models, revealing their unique strengths and potential limitations. Additionally, the study compares these GPT-based methodologies with other current, high-performing models previously used with the same dataset. The results illustrate the significant superiority of the GPT approaches in predictive performance, with an improvement of more than 22% in F1-score over the state of the art. Further, the paper sheds light on common challenges in sentiment analysis tasks, such as understanding context and detecting sarcasm. It underscores the enhanced capabilities of the GPT models to effectively handle these complexities. Taken together, these findings highlight the promising potential of GPT models in sentiment analysis, setting the stage for future research in this field. The code can be found at https://github.com/DSAatUSU/SentimentGPT

large language model, machine learning, sentiment analysis, (17 more...)

2307.10234

Country:

North America > United States > Utah > Cache County > Logan (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Russia (0.04)
(3 more...)

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting (0.93)
Health & Medicine > Therapeutic Area (0.68)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)

Tsirmpas, Dimitrios, Gkionis, Ioannis, Mademlis, Ioannis, Papadopoulos, Georgios

Neural Natural Language Processing for Long Texts: A Survey of the State-of-the-Art

The adoption of Deep Neural Networks (DNNs) has greatly benefited Natural Language Processing (NLP) during the past decade. However, the demands of long document analysis are quite different from those of shorter texts, while the ever increasing size of documents uploaded on-line renders automated understanding of lengthy texts a critical issue. Relevant applications include automated Web mining, legal document review, medical records analysis, financial reports analysis, contract management, environmental impact assessment, news aggregation, etc. Despite the relatively recent development of efficient algorithms for analyzing long documents, practical tools in this field are currently flourishing. This article serves as an entry point into this dynamic domain and aims to achieve two objectives. Firstly, it provides an overview of the relevant neural building blocks, serving as a concise tutorial for the field. Secondly, it offers a brief examination of the current state-of-the-art in long document NLP, with a primary focus on two key tasks: document classification and document summarization. Sentiment analysis for long texts is also covered, since it is typically treated as a particular case of document classification. Consequently, this article presents an introductory exploration of document-level analysis, addressing the primary challenges, concerns, and existing solutions. Finally, the article presents publicly available annotated datasets that can facilitate further research in this area.

large language model, machine learning, natural language, (21 more...)

2305.16259

Country:

North America > United States > New Jersey (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Colorado (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.45)

Industry:

Law (1.00)
Media > News (0.65)
Health & Medicine > Health Care Technology > Medical Record (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(3 more...)

Target-oriented Sentiment Classification with Sequential Cross-modal Semantic Graph

Huang, Yufeng, Chen, Zhuo, Chen, Jiaoyan, Pan, Jeff Z., Yao, Zhen, Zhang, Wen

Multi-modal aspect-based sentiment classification (MABSC) is the task of classifying the sentiment of a target entity mentioned in a sentence and an image. However, previous methods failed to account for the fine-grained semantic association between the image and the text, which resulted in limited identification of fine-grained image aspects and opinions. To address these limitations, in this paper we propose a new approach called SeqCSG, which enhances the encoder-decoder sentiment classification framework using sequential cross-modal semantic graphs. SeqCSG utilizes image captions and scene graphs to extract both global and local fine-grained image information and considers them as elements of the cross-modal semantic graph along with tokens from tweets. The sequential cross-modal semantic graph is represented as a sequence with a multi-modal adjacency matrix indicating relationships between elements. Experimental results show that the approach outperforms existing methods and achieves state-of-the-art performance on two standard datasets. Further analysis has demonstrated that the model can implicitly learn the correlation between fine-grained information of the image and the text with the given target. Our code is available at https://github.com/zjukg/SeqCSG.

artificial intelligence, information, natural language, (16 more...)

2208.09417

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)

arXiv.org Artificial Intelligence Jul-21-2023

NusaCrowd: Open Source Initiative for Indonesian NLP Resources

Cahyawijaya, Samuel, Lovenia, Holy, Aji, Alham Fikri, Winata, Genta Indra, Wilie, Bryan, Mahendra, Rahmad, Wibisono, Christian, Romadhony, Ade, Vincentio, Karissa, Koto, Fajri, Santoso, Jennifer, Moeljadi, David, Wirawan, Cahya, Hudi, Frederikus, Parmonangan, Ivan Halim, Alfina, Ika, Wicaksono, Muhammad Satrio, Putra, Ilham Firdausi, Rahmadani, Samsul, Oenang, Yulianti, Septiandri, Ali Akbar, Jaya, James, Dhole, Kaustubh D., Suryani, Arie Ardiyanti, Putri, Rifki Afina, Su, Dan, Stevens, Keith, Nityasya, Made Nindyatama, Adilazuarda, Muhammad Farid, Ignatius, Ryan, Diandaru, Ryandito, Yu, Tiezheng, Ghifari, Vito, Dai, Wenliang, Xu, Yan, Damapuspita, Dyah, Tho, Cuk, Karo, Ichwanul Muslim Karo, Fatyanosa, Tirana Noor, Ji, Ziwei, Fung, Pascale, Neubig, Graham, Baldwin, Timothy, Ruder, Sebastian, Sujaini, Herry, Sakti, Sakriani, Purwarianti, Ayu

We present NusaCrowd, a collaborative initiative to collect and unify existing resources for Indonesian languages, including opening access to previously non-public resources. Through this initiative, we have brought together 137 datasets and 118 standardized data loaders. The quality of the datasets has been assessed manually and automatically, and their value is demonstrated through multiple experiments. NusaCrowd's data collection enables the creation of the first zero-shot benchmarks for natural language understanding and generation in Indonesian and the local languages of Indonesia. Furthermore, NusaCrowd enables the creation of the first multilingual automatic speech recognition benchmark in Indonesian and the local languages of Indonesia. Our work strives to advance natural language processing (NLP) research for languages that are under-represented despite being widely spoken.

large language model, machine learning, natural language, (24 more...)

2212.09648

Country:

North America > United States > Texas > Dallas County > Dallas (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Timor-Leste (0.14)
(64 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Law (0.67)
Government (0.67)
Information Technology > Services (0.67)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(5 more...)

Chavan, Tanmay, Gokhale, Omkar, Kane, Aditya, Patankar, Shantanu, Joshi, Raviraj

My Boli: Code-mixed Marathi-English Corpora, Pretrained Language Models and Evaluation Benchmarks

arXiv.org Artificial Intelligence Jul-20-2023

Research on code-mixed data is limited due to the unavailability of dedicated code-mixed datasets and pre-trained language models. In this work, we focus on the low-resource Indian language Marathi, which lacks any prior work in code-mixing. We present L3Cube-MeCorpus, a large code-mixed Marathi-English (Mr-En) corpus with 10 million social media sentences for pretraining. We also release L3Cube-MeBERT and MeRoBERTa, code-mixed BERT-based transformer models pre-trained on MeCorpus. Furthermore, for benchmarking, we present three supervised datasets, MeHate, MeSent, and MeLID, for downstream tasks: code-mixed Mr-En hate speech detection, sentiment analysis, and language identification, respectively. Each of these evaluation datasets consists of ~12,000 manually annotated Marathi-English code-mixed tweets. Ablations show that the models trained on this novel corpus significantly outperform the existing state-of-the-art BERT models. This is the first work that presents artifacts for code-mixed Marathi research. All datasets and models are publicly released at https://github.com/l3cube-pune/MarathiNLP .

dataset, marathi, tweet, (10 more...)

2306.1403

Country:

Asia > India > Maharashtra (0.05)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.35)

Landowska, Alina, Robak, Marek, Skorski, Maciej

What Twitter Data Tell Us about the Future?

arXiv.org Artificial Intelligence Jul-20-2023

Anticipation is a fundamental human cognitive ability that involves thinking about and living towards the future. While language markers reflect anticipatory thinking, research on anticipation from the perspective of natural language processing is limited. This study aims to investigate the futures projected by futurists on Twitter and explore the impact of language cues on anticipatory thinking among social media users. We address the research questions of what futures Twitter's futurists anticipate and share, and how these anticipated futures can be modeled from social data. To investigate this, we review related works on anticipation, discuss the influence of language markers and prestigious individuals on anticipatory thinking, and present a taxonomy system categorizing futures into "present futures" and "future present". This research presents a compiled dataset of over 1 million publicly shared tweets by future influencers and develops a scalable NLP pipeline using SOTA models. The study identifies 15 topics from the LDA approach and 100 distinct topics from the BERTopic approach within the futurists' tweets. These findings contribute to the research on topic modelling and provide insights into the futures anticipated by Twitter's futurists. The research demonstrates that futurists' language cues signal futures-in-the-making that help social media users anticipate their own scenarios and respond to them in the present. The fully open-sourced dataset, interactive analysis, and reproducible source code are available for further exploration.

artificial intelligence, futures, natural language, (14 more...)

2308.02035

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Portugal > Lisbon > Lisbon (0.06)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study (0.66)

Industry:

Information Technology > Services (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.42)