vlog
We Care: Multimodal Depression Detection and Knowledge Infused Mental Health Therapeutic Response Generation
Moon, Palash, Bhattacharyya, Pushpak
The detection of depression through non-verbal cues has gained significant attention. Previous research predominantly centred on identifying depression within the confines of controlled laboratory environments, often with the supervision of psychologists or counsellors. Unfortunately, datasets generated in such controlled settings may struggle to account for individual behaviours in real-life situations. In response to this limitation, we present the Extended D-vlog dataset, encompassing a collection of 1,261 YouTube vlogs. Additionally, the emergence of large language models (LLMs) like GPT-3.5 and GPT-4 has sparked interest in their potential to act like mental health professionals. Yet the readiness of these LLMs to be used in real-life settings is still a concern, as they can give wrong responses that can harm users. We introduce a virtual agent serving as an initial contact for mental health patients, offering Cognitive Behavioral Therapy (CBT)-based responses. It comprises two core functions: 1. Identifying depression in individuals, and 2. Delivering CBT-based therapeutic responses. Our Mistral model achieved scores of 70.1% and 30.9% for distortion assessment and classification, respectively, along with a BERTScore of 88.7%. Moreover, utilizing the TVLT model on our Multimodal Extended D-vlog Dataset yielded an F1-score of 67.8%.
LMVD: A Large-Scale Multimodal Vlog Dataset for Depression Detection in the Wild
He, Lang, Chen, Kai, Zhao, Junnan, Wang, Yimeng, Pei, Ercheng, Chen, Haifeng, Jiang, Jiewei, Zhang, Shiqing, Zhang, Jie, Wang, Zhongmin, He, Tao, Tiwari, Prayag
Depression can significantly impact many aspects of an individual's life, including their personal and social functioning, academic and work performance, and overall quality of life. Many researchers within the field of affective computing are adopting deep learning technology to explore potential patterns related to the detection of depression. However, because of concerns about protecting subjects' privacy, data in this area is still scarce, presenting a challenge for the deep discriminative models used in detecting depression. To navigate these obstacles, a large-scale multimodal vlog dataset (LMVD) for depression recognition in the wild is built. LMVD contains 1,823 samples, totalling 214 hours, from 1,475 participants, captured from four multimedia platforms (Sina Weibo, Bilibili, TikTok, and YouTube). A novel architecture termed MDDformer is proposed to learn the non-verbal behaviors of individuals. Extensive validations are performed on the LMVD dataset, demonstrating superior performance for depression detection. We anticipate that LMVD will make a valuable contribution to the depression detection community. The data and code will be released at the link: https://github.com/helang818/LMVD/.
- Asia > China > Shaanxi Province > Xi'an (0.08)
- Europe > Sweden > Halland County > Halmstad (0.05)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Education (1.00)
MOGAM: A Multimodal Object-oriented Graph Attention Model for Depression Detection
Cha, Junyeop, Kim, Seoyun, Kim, Dongjae, Park, Eunil
Early detection plays a crucial role in the treatment of depression. Therefore, numerous studies have focused on social media platforms, where individuals express their emotions, aiming to achieve early detection of depression. However, the majority of existing approaches rely on specific features, leading to limited scalability across different types of social media data, such as text, images, or videos. To overcome this limitation, we introduce a Multimodal Object-Oriented Graph Attention Model (MOGAM), which can be applied to diverse types of data, offering a more scalable and versatile solution. Furthermore, to ensure that our model can capture authentic symptoms of depression, we only include vlogs from users with a clinical diagnosis. To leverage the diverse features of vlogs, we adopt a multimodal approach and collect additional metadata such as the title, description, and duration of the vlogs. To effectively aggregate these multimodal features, we employ a cross-attention mechanism. MOGAM achieved an accuracy of 0.871 and an F1-score of 0.888. Moreover, to validate the scalability of MOGAM, we evaluated its performance on a benchmark dataset and achieved results comparable with prior studies (0.61 F1-score). In conclusion, we believe that the proposed model, MOGAM, is an effective solution for detecting depression in social media, offering potential benefits in the early detection and treatment of this mental health condition.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Italy > Tuscany > Florence (0.04)
- Asia > Bangladesh (0.04)
Leveraging Natural Language Processing For Public Health Screening On YouTube: A COVID-19 Case Study
Aslam, Ahrar Bin, Syed, Zafi Sherhan, Khan, Muhammad Faiz, Baloch, Asghar, Syed, Muhammad Shehram Shah
Background: Social media platforms have become a viable source of medical information, with patients and healthcare professionals using them to share health-related information and track diseases. Similarly, YouTube, the largest video-sharing platform in the world, contains vlogs where individuals talk about their illnesses. The aim of our study was to investigate the use of Natural Language Processing (NLP) to identify the spoken content of YouTube vlogs related to the diagnosis of Coronavirus disease of 2019 (COVID-19) for public health screening. Methods: COVID-19 videos on YouTube were searched using relevant keywords. A total of 1000 English-language videos were downloaded, of which 791 were classified as vlogs, 192 were non-vlogs, and 17 were deleted by the channel. The videos were converted into a textual format using Microsoft Streams. The textual data was preprocessed using basic and advanced preprocessing methods. A lexicon of 200 words related to COVID-19 was created. The data was analyzed using topic modeling, word clouds, and lexicon matching. Results: The word cloud results revealed discussions about COVID-19 symptoms like "fever", along with generic terms such as "mask" and "isolation". Lexical analysis demonstrated that in 96.46% of videos, patients discussed generic terms, and in 95.45% of videos, people talked about COVID-19 symptoms. LDA topic modeling also generated topics that successfully captured key themes and content related to our investigation of COVID-19 diagnoses in YouTube vlogs. Conclusion: By leveraging NLP techniques on YouTube vlogs, public health practitioners can enhance their ability to mitigate the effects of pandemics and effectively respond to public health challenges.
- Asia > Pakistan > Sindh > Hyderabad Division > Jamshoro (0.04)
- North America > United States > North Carolina (0.04)
- North America > Canada (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
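The lexicon-matching step described in the COVID-19 abstract above (the share of videos whose transcript mentions at least one lexicon term) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the tokenisation rule, the sample transcripts, and every lexicon term other than "fever", "mask", and "isolation" are assumptions made for the example.

```python
import re


def tokenize(text):
    """Lowercase a transcript and return its set of word tokens (assumed rule)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def lexicon_coverage(transcripts, lexicon):
    """Percentage of transcripts that contain at least one lexicon term."""
    if not transcripts:
        return 0.0
    terms = {word.lower() for word in lexicon}
    hits = sum(1 for text in transcripts if tokenize(text) & terms)
    return 100.0 * hits / len(transcripts)


# Hypothetical mini-lexicons; the study used a 200-word lexicon that is not
# reproduced in the abstract.
symptom_terms = {"fever", "cough", "fatigue"}
generic_terms = {"mask", "isolation", "quarantine"}

# Hypothetical transcripts standing in for the speech-to-text output.
transcripts = [
    "I had a fever and a cough for three days.",
    "We stayed in isolation and wore a mask.",
    "Today I am just talking about my garden.",
    "Fever started after isolation ended.",
]

symptom_pct = lexicon_coverage(transcripts, symptom_terms)
generic_pct = lexicon_coverage(transcripts, generic_terms)
print(f"symptom terms: {symptom_pct:.2f}% of videos")
print(f"generic terms: {generic_pct:.2f}% of videos")
```

Matching on whole tokens rather than substrings avoids counting "mask" inside "masked" or similar words; whether the study matched stems or exact words is not stated in the abstract.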
Five #ChatGPT prompts to help Primary school teachers – ICTEvangelist
If you haven't come across #ChatGPT yet, then what rock have you been hiding under? Every social media platform, news outlet, magazine, blog, podcast and vlog is talking about it. The opportunities provided by #ChatGPT are huge. I talked about it in this recent post: Will AI Make Us Lazy and Less Creative? I really enjoyed Sara Dietschy's recent vlog on the topic. As Geoff Barton explains in his recent TES article (thanks again, Jill Berry, for spotlighting that!), the tool is far from perfect.
Meet China's First AI-Powered Virtual University Student
Hua Zhibing officially registered and became a student of Beijing's Tsinghua University on Tuesday. Hua Zhibing's appearance, voice and even the music playing in the background of the vlog in which she introduced herself to the world were all created using a record-breaking AI modeling system called Wudao 2.0. It was unveiled at the 2021 Beijing Academy of Artificial Intelligence (BAAI) Conference on June 1, and, according to its developers, it is the first trillion-scale model in China and the largest in the world. Wudao 2.0 is designed to enable machines to think like humans and is reportedly close to passing the Turing test in poetry and couplet creation, text summarization, question answering and painting. Tsinghua University's newest student will study in the Department of Computer Science and Technology and is expected to grow and learn faster than an average actual person.
- Education > Educational Setting > Higher Education (0.52)
- Media (0.39)
Automated Decision Making and the GDPR - Aphaia: Leading experts in ICT regulation and policy
Artificial Intelligence is increasingly becoming ingrained in all facets of our societies and lives. While it certainly heralds an age of cool futuristic technology and applications (facial recognition and self-driving cars, for example), what about when AI is used as an automated decision-making tool? Can this pose an issue for an individual's rights? What are the possible implications? Are there any legal provisions to ensure fairness?
- Law (0.82)
- Information Technology > Security & Privacy (0.62)