AITopics | Bucur, Ana-Maria



Datasets for Depression Modeling in Social Media: An Overview

Bucur, Ana-Maria, Moldovan, Andreea-Codrina, Parvatikar, Krutika, Zampieri, Marcos, KhudaBukhsh, Ashiqur R., Dinu, Liviu P.

arXiv.org Artificial Intelligence, Mar-27-2025

Depression is the most common mental health disorder, and its prevalence increased during the COVID-19 pandemic. It is one of the most extensively researched psychological conditions, and recent research has increasingly focused on leveraging social media data to enhance traditional methods of depression screening. This paper addresses the growing interest in interdisciplinary research on depression and aims to support early-career researchers by providing a comprehensive and up-to-date list of datasets for analyzing and predicting depression through social media data. We present an overview of datasets published between 2019 and 2024. We also make the comprehensive list of datasets available online as a continuously updated resource, with the hope that it will facilitate further interdisciplinary research into the linguistic expressions of depression on social media.

artificial intelligence, machine learning, proceedings, (17 more...)

2503.21513

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages

Muhammad, Shamsuddeen Hassan, Ousidhoum, Nedjma, Abdulmumin, Idris, Wahle, Jan Philip, Ruas, Terry, Beloucif, Meriem, de Kock, Christine, Surange, Nirmal, Teodorescu, Daniela, Ahmad, Ibrahim Said, Adelani, David Ifeoluwa, Aji, Alham Fikri, Ali, Felermino D. M. A., Alimova, Ilseyar, Araujo, Vladimir, Babakov, Nikolay, Baes, Naomi, Bucur, Ana-Maria, Bukula, Andiswa, Cao, Guanqun, Cardenas, Rodrigo Tufino, Chevi, Rendi, Chukwuneke, Chiamaka Ijeoma, Ciobotaru, Alexandra, Dementieva, Daryna, Gadanya, Murja Sani, Geislinger, Robert, Gipp, Bela, Hourrane, Oumaima, Ignat, Oana, Lawan, Falalu Ibrahim, Mabuya, Rooweither, Mahendra, Rahmad, Marivate, Vukosi, Piper, Andrew, Panchenko, Alexander, Ferreira, Charles Henrique Porto, Protasov, Vitaly, Rutunda, Samuel, Shrivastava, Manish, Udrea, Aura Cristina, Wanzare, Lilian Diana Awuor, Wu, Sophie, Wunderlich, Florian Valentin, Zhafran, Hanif Muhammad, Zhang, Tianhui, Zhou, Yi, Mohammad, Saif M.

arXiv.org Artificial Intelligence, Feb-17-2025

People worldwide use language in subtle and complex ways to express emotions. While emotion recognition, an umbrella term for several NLP tasks, significantly impacts different applications in NLP and other fields, most work in the area is focused on high-resource languages. This has led to major disparities in research and proposed solutions, especially for low-resource languages that suffer from a lack of high-quality datasets. In this paper, we present BRIGHTER, a collection of multilabeled emotion-annotated datasets in 28 different languages. BRIGHTER covers predominantly low-resource languages from Africa, Asia, Eastern Europe, and Latin America, with instances from various domains annotated by fluent speakers. We describe the data collection and annotation processes and the challenges of building these datasets. Then, we report different experimental results for monolingual and crosslingual multi-label emotion identification, as well as intensity-level emotion recognition. We investigate results with and without using LLMs and analyse the large variability in performance across languages and text domains. We show that BRIGHTER datasets are a step towards bridging the gap in text-based emotion recognition and discuss their impact and utility.

artificial intelligence, large language model, natural language, (18 more...)

2502.11926

Country:

North America > United States (1.00)
Europe (1.00)
Asia (0.88)
(2 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

On the State of NLP Approaches to Modeling Depression in Social Media: A Post-COVID-19 Outlook

Bucur, Ana-Maria, Moldovan, Andreea-Codrina, Parvatikar, Krutika, Zampieri, Marcos, KhudaBukhsh, Ashiqur R., Dinu, Liviu P.

arXiv.org Artificial Intelligence, Oct-11-2024

Computational approaches to predicting mental health conditions in social media have been substantially explored in recent years. Multiple surveys have been published on this topic, providing the community with comprehensive accounts of the research in this area. Among all mental health conditions, depression is the most widely studied due to its worldwide prevalence. The COVID-19 global pandemic, starting in early 2020, has had a great impact on mental health worldwide. Harsh measures employed by governments to slow the spread of the virus (e.g., lockdowns) and the subsequent economic downturn experienced in many countries have significantly impacted people's lives and mental health. Studies have shown an increase of more than 50% in the rate of depression in the population. In this context, we present a survey on natural language processing (NLP) approaches to modeling depression in social media, providing the reader with a post-COVID-19 outlook. This survey contributes to the understanding of the impacts of the pandemic on modeling depression in social media. We outline how state-of-the-art approaches and new datasets have been used in the context of the COVID-19 pandemic. Finally, we discuss ethical issues in collecting and processing mental health data, considering fairness, accountability, and ethics.

large language model, machine learning, natural language, (22 more...)

2410.08793

Country:

Europe (1.00)
North America > United States > New Mexico (0.14)
Asia > Middle East > UAE (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Promising Solution (0.66)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Automatic Extraction of the Romanian Academic Word List: Data and Methods

Bucur, Ana-Maria, Dincă, Andreea, Chitez, Mădălina, Rogobete, Roxana

arXiv.org Artificial Intelligence, Jul-29-2023

This paper presents the methodology and data used for the automatic extraction of the Romanian Academic Word List (Ro-AWL). Academic Word Lists are useful in both L2 and L1 teaching contexts. For the Romanian language, no such resource exists so far. Ro-AWL has been generated by combining methods from corpus and computational linguistics with L2 academic writing approaches. We use two types of data: (a) existing data, such as the Romanian Frequency List based on the ROMBAC corpus, and (b) self-compiled data, such as the expert academic writing corpus EXPRES. For constructing the academic word list, we follow the methodology for building the Academic Vocabulary List for the English language. The distribution of Ro-AWL features (general distribution, POS distribution) across four disciplinary datasets is in line with previous research. Ro-AWL is freely available and can be used for teaching, research, and NLP applications.

artificial intelligence, corpus, natural language, (14 more...)

2307.16045

Country: Europe > Romania (0.29)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.60)

Utilizing ChatGPT Generated Data to Retrieve Depression Symptoms from Social Media

Bucur, Ana-Maria

arXiv.org Artificial Intelligence, Jul-6-2023

In this work, we present the contribution of the BLUE team in the eRisk Lab task on searching for symptoms of depression. The task consists of retrieving and ranking Reddit social media sentences that convey symptoms of depression from the BDI-II questionnaire. Given that synthetic data provided by LLMs have been proven to be a reliable method for augmenting data and fine-tuning downstream models, we chose to generate synthetic data using ChatGPT for each of the symptoms of the BDI-II questionnaire. We designed a prompt such that the generated data contains more richness and semantic diversity than the BDI-II responses for each question and, at the same time, contains emotional and anecdotal experiences that are specific to the more intimate way of sharing experiences on Reddit. We perform semantic search and rank the sentences' relevance to the BDI-II symptoms by cosine similarity. We used two state-of-the-art transformer-based models (MentalRoBERTa and a variant of MPNet) for embedding the social media posts and the original and generated BDI-II responses. Our results show that using sentence embeddings from a model designed for semantic search outperforms the approach using embeddings from a model pre-trained on mental health data. Furthermore, the generated synthetic data proved too specific for this task; the approach that relied only on the BDI-II responses had the best performance.

artificial intelligence, machine learning, natural language, (20 more...)

2307.02313

Country:

Europe > Greece (0.14)
Europe > Spain (0.14)
Europe > Romania (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
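
The ranking step mentioned in the abstract above (semantic search with cosine similarity between symptom responses and candidate sentences) can be illustrated with a minimal sketch. The vectors below are hypothetical toy embeddings, not outputs of MentalRoBERTa or MPNet, and the function names are illustrative. Taking the maximum similarity over all responses for a symptom is an assumption made here; the abstract does not say how scores are aggregated.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def rank_sentences(symptom_vecs, sentence_vecs):
    """Return (sentence_index, score) pairs sorted by descending score.

    Assumption: a sentence's score is its highest cosine similarity
    to any of the response embeddings for the symptom.
    """
    scored = []
    for idx, sentence in enumerate(sentence_vecs):
        score = max(cosine(query, sentence) for query in symptom_vecs)
        scored.append((idx, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Hypothetical 3-dimensional embeddings, for illustration only.
symptom_vecs = [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0]]
sentence_vecs = [[0.0, 0.0, 1.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]

ranking = rank_sentences(symptom_vecs, sentence_vecs)
for idx, score in ranking:
    print(idx, round(score, 3))
```

In a real pipeline the toy vectors would be replaced by sentence embeddings from a pretrained encoder; the ranking logic itself would stay the same.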

It's Just a Matter of Time: Detecting Depression with Time-Enriched Multimodal Transformers

Bucur, Ana-Maria, Cosma, Adrian, Rosso, Paolo, Dinu, Liviu P.

arXiv.org Artificial Intelligence, Feb-6-2023

Depression detection from user-generated content on the internet has been a long-standing topic of interest in the research community, providing valuable screening tools for psychologists. The ubiquitous use of social media platforms offers an ideal avenue for exploring mental health manifestations in posts and interactions with other users. Current methods for depression detection from social media mainly focus on text processing, and only a few also utilize images posted by users. In this work, we propose a flexible time-enriched multimodal transformer architecture for detecting depression from social media posts, using pretrained models for extracting image and text embeddings. Our model operates directly at the user level, and we enrich it with the relative time between posts by using time2vec positional embeddings. Moreover, we propose another model variant, which can operate on randomly sampled and unordered sets of posts to be more robust to dataset noise. We show that our method, using EmoBERTa and CLIP embeddings, surpasses other methods on two multimodal datasets, obtaining state-of-the-art results of 0.931 F1 score on a popular multimodal Twitter dataset, and 0.902 F1 score on the only multimodal Reddit dataset.

artificial intelligence, machine learning, natural language, (17 more...)

2301.05453

Country: Europe > Spain (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Information Technology (0.94)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
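
The abstract above mentions enriching the model with the relative time between posts through time2vec positional embeddings. As a rough illustration, the sketch below follows the general Time2Vec formulation (one linear component plus sine-activated periodic components). The frequencies and phases are fixed, hypothetical values here, whereas in a trained model they would be learned. Measuring time relative to a user's first post is also an assumption; the abstract does not specify how relative time is computed.

```python
import math


def time2vec(tau, omega, phi):
    """Map a scalar time value to a vector of len(omega) features.

    Component 0 is linear in tau; the remaining components are periodic.
    omega (frequencies) and phi (phases) are learned in practice;
    here they are passed in as fixed, hypothetical values.
    """
    linear = omega[0] * tau + phi[0]
    periodic = [math.sin(w * tau + p) for w, p in zip(omega[1:], phi[1:])]
    return [linear] + periodic


def relative_times(timestamps):
    """Assumption: time is measured relative to the first post."""
    first = timestamps[0]
    return [t - first for t in timestamps]


# Hypothetical parameters and post times (in days), for illustration only.
omega = [0.5, 1.0, 2.0]
phi = [0.0, 0.0, 0.0]
post_times_days = [10.0, 10.5, 13.0]

embeddings = [time2vec(t, omega, phi) for t in relative_times(post_times_days)]
for vec in embeddings:
    print([round(x, 3) for x in vec])
```

Each resulting vector would typically be combined with the corresponding post embedding before being fed to the transformer, in place of an ordinary index-based positional encoding.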

Detecting Early Onset of Depression from Social Media Text using Learned Confidence Scores

Bucur, Ana-Maria, Dinu, Liviu P.

arXiv.org Machine Learning, Nov-3-2020

Computational research on mental health disorders from written texts covers an interdisciplinary area between natural language processing and psychology. A crucial aspect of this problem is prevention and early diagnosis, as suicide resulting from depression is the second leading cause of death for young adults. In this work, we focus on methods for detecting the early onset of depression from social media texts, in particular from Reddit. To that end, we explore the eRisk 2018 dataset and achieve good results with regard to the state of the art by leveraging topic analysis and learned confidence scores to guide the decision process.

attention deficit hyperactivity disorder, detection, neural network, (20 more...)

2011.01695

Country: Europe > Romania (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
