
Collaborating Authors

 Xu, Junhao


Exploring the Panorama of Anxiety Levels: A Multi-Scenario Study Based on Human-Centric Anxiety Level Detection and Personalized Guidance

arXiv.org Artificial Intelligence

Faculty of Computer Science and Information Technology, University of Malaya, Malaysia

More and more people are under pressure from work, life, and education. Under these pressures, people may develop an anxious state of mind, or even early suicidal symptoms. With the advancement of artificial intelligence, large language models are currently among the most prominent technologies. They are often used to detect psychological disorders; however, existing studies typically give only a categorization result without an interpretable description of what led to that result. Addressing these gaps, this study adopts a person-centered perspective and focuses on GPT-generated multi-scenario simulated conversations, which were selected as data samples for the study. Various transformer-based encoder models were utilized to build a classification model capable of identifying different anxiety levels. In addition, a knowledge base focused on anxiety was constructed using LangChain and GPT-4. When analyzing the classification results, this knowledge base was able to provide the explanations and reasons most relevant to the interlocutor's anxiety situation. The study shows that the developed model achieves more than 94% accuracy in categorical prediction and that the advice provided is highly personalized. Mental health is defined as a state of well-being on the mental, emotional, and social levels [8, 16, 34]. Abnormal anxiety is a very important factor leading to mental health problems [3, 19, 43].
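
As a rough illustration of the classification step described above, here is a minimal sketch of a transformer-encoder anxiety-level classifier. The checkpoint name (bert-base-uncased), the four-level label scheme, and the example utterance are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch: a transformer encoder fine-tuned to map a conversation turn
# to an anxiety level. Checkpoint, label set, and example are hypothetical.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["minimal", "mild", "moderate", "severe"]  # assumed 4-level scheme

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)  # in practice this would be a checkpoint fine-tuned on the simulated dialogues

def predict_anxiety_level(utterance: str) -> tuple[str, float]:
    """Return the predicted anxiety level and its probability."""
    inputs = tokenizer(utterance, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    idx = int(probs.argmax())
    return LABELS[idx], float(probs[idx])

level, confidence = predict_anxiety_level(
    "I can't sleep before every exam and my chest feels tight."
)
print(level, round(confidence, 3))
```

In the paper's pipeline, the predicted level would then be passed to the LangChain/GPT-4 knowledge base to retrieve an interpretable explanation and personalized guidance.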


MiniMax-01: Scaling Foundation Models with Lightning Attention

arXiv.org Artificial Intelligence

We introduce the MiniMax-01 series, including MiniMax-Text-01 and MiniMax-VL-01, which are comparable to top-tier models while offering superior capabilities in processing longer contexts. The core lies in lightning attention and its efficient scaling. To maximize computational capacity, we integrate it with Mixture of Experts (MoE), creating a model with 32 experts and 456 billion total parameters, of which 45.9 billion are activated for each token. We develop an optimized parallel strategy and highly efficient computation-communication overlap techniques for MoE and lightning attention. This approach enables us to conduct efficient training and inference on models with hundreds of billions of parameters across contexts spanning millions of tokens. The context window of MiniMax-Text-01 can reach up to 1 million tokens during training and extrapolate to 4 million tokens during inference at an affordable cost. Our vision-language model, MiniMax-VL-01, is built through continued training with 512 billion vision-language tokens. Experiments on both standard and in-house benchmarks show that our models match the performance of state-of-the-art models like GPT-4o and Claude-3.5-Sonnet while offering a 20-32 times longer context window. We publicly release MiniMax-01 at https://github.com/MiniMax-AI.
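
To make the sparse-activation idea concrete, here is a minimal toy sketch of a Mixture-of-Experts layer with top-k routing, where each token activates only a small subset of experts. The layer sizes and top_k are toy values, not the published 456B/45.9B MiniMax-01 configuration, and lightning attention itself is not shown.

```python
# Toy MoE layer: each token is routed to its top-k experts, so only a fraction
# of the layer's parameters are active per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, num_experts=32, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )
        self.router = nn.Linear(d_model, num_experts)
        self.top_k = top_k

    def forward(self, x):                               # x: (tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)
        weights, idx = gate.topk(self.top_k, dim=-1)    # top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():                          # run each expert only on its tokens
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

tokens = torch.randn(8, 64)
print(ToyMoE()(tokens).shape)  # torch.Size([8, 64])
```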


StreetviewLLM: Extracting Geographic Information Using a Chain-of-Thought Multimodal Large Language Model

arXiv.org Artificial Intelligence

Traditional machine learning has played a key role in geospatial prediction, but its limitations have become more apparent over time. One significant drawback is that traditional ML models often rely on structured geospatial data, such as raster or vector formats, limiting their ability to handle unstructured or multimodal data (Pierdicca & Paolanti, 2022). Additionally, traditional models may struggle to capture complex spatial patterns and regional variations, leading to challenges with data sparsity and uneven distribution, which can affect the accuracy and generalizability of predictions (Nikparvar & Thill, 2021). In contrast, large language models (LLMs) have shown great promise across various fields by processing vast amounts of data and reasoning across multiple modalities (Chang et al., 2024). By integrating textual, visual, and contextual information, LLMs can introduce novel covariates for geospatial prediction, thus enhancing traditional approaches. However, extracting geospatial knowledge from LLMs poses its own challenges. Although using geographic coordinates (i.e., latitude and longitude) is a straightforward way to retrieve location-specific information, this approach often yields suboptimal results, particularly when dealing with complex spatial relationships and regional characteristics. As a result, traditional models cannot easily harness the full potential of multimodal data, hindering their effectiveness in applications demanding comprehensive, cross-modal insights.
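
As a hedged illustration of the two querying styles contrasted above, the snippet below builds a coordinate-only prompt and a street-view-grounded chain-of-thought prompt. The prompt wording, the example caption, and the prediction target are hypothetical and do not reproduce StreetviewLLM's actual pipeline.

```python
# Contrast a bare coordinate query with a multimodal, chain-of-thought query.
# Both functions and their wording are illustrative assumptions.

def coordinate_only_prompt(lat: float, lon: float, target: str) -> str:
    # Baseline: location-specific retrieval from coordinates alone.
    return f"Estimate the {target} near latitude {lat}, longitude {lon}."

def multimodal_cot_prompt(lat: float, lon: float, target: str, street_view_caption: str) -> str:
    # Grounded variant: add visual context and ask for step-by-step reasoning.
    return (
        f"Location: ({lat}, {lon}).\n"
        f"Street-view description: {street_view_caption}\n"
        f"Reason step by step about land use, density, and infrastructure, "
        f"then estimate the {target}."
    )

caption = "Four-lane road, mid-rise apartment blocks, a metro entrance, dense foot traffic."
print(multimodal_cot_prompt(3.1209, 101.6538, "population density", caption))
```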


The Future of Combating Rumors? Retrieval, Discrimination, and Generation

arXiv.org Artificial Intelligence

The development of Artificial Intelligence Generated Content (AIGC) technology has facilitated the creation of rumors containing misinformation, impacting societal, economic, and political ecosystems and challenging democracy. Current rumor detection efforts fall short by merely labeling potential misinformation (a classification task), which inadequately addresses the issue, and it is unrealistic to expect authoritative institutions to debunk every piece of information on social media. Our proposed comprehensive debunking process not only detects rumors but also provides explanatory generated content to refute the authenticity of the information. The Expert-Citizen Collective Wisdom (ECCW) module we designed ensures high-precision assessment of the credibility of information, and the retrieval module is responsible for retrieving relevant knowledge from a real-time updated debunking database based on information keywords. By using prompt engineering techniques, we feed the results and knowledge into an LLM (Large Language Model), achieving satisfactory discrimination and explanatory effects while eliminating the need for fine-tuning, saving computational costs, and contributing to debunking efforts.
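
As a rough sketch of the debunking flow described above, the snippet below combines a detector credibility score with keyword-retrieved entries from a debunking database into a single LLM prompt. The retrieval scoring, example database, and prompt wording are illustrative assumptions, not the paper's ECCW or retrieval implementation.

```python
# Toy pipeline: keyword retrieval from a debunking database plus prompt assembly
# for an LLM; scoring and wording are illustrative assumptions.

def retrieve_debunking_facts(keywords: set[str], database: list[str], top_n: int = 2) -> list[str]:
    # Stand-in for the real-time debunking database: rank entries by keyword overlap.
    scored = [(len(keywords & set(entry.lower().split())), entry) for entry in database]
    return [entry for score, entry in sorted(scored, reverse=True)[:top_n] if score > 0]

def build_debunking_prompt(claim: str, credibility: float, facts: list[str]) -> str:
    evidence = "\n".join(f"- {f}" for f in facts) or "- (no matching entries)"
    return (
        f"Claim: {claim}\n"
        f"Detector credibility score: {credibility:.2f} (lower means more likely a rumor)\n"
        f"Retrieved debunking knowledge:\n{evidence}\n"
        "Judge whether the claim is a rumor and write a short explanatory rebuttal."
    )

db = [
    "Official flood alerts are issued only by the national weather agency.",
    "The viral bridge-collapse photo is from a 2019 incident abroad.",
]
claim = "The weather agency secretly deleted this week's flood alerts."
facts = retrieve_debunking_facts(set(claim.lower().split()), db)
print(build_debunking_prompt(claim, credibility=0.18, facts=facts))
```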