Portland
MentalChat16K: A Benchmark Dataset for Conversational Mental Health Assistance
Xu, Jia, Wei, Tianyi, Hou, Bojian, Orzechowski, Patryk, Yang, Shu, Jin, Ruochen, Paulbeck, Rachael, Wagenaar, Joost, Demiris, George, Shen, Li
We introduce MentalChat16K, an English benchmark dataset combining a synthetic mental health counseling dataset and a dataset of anonymized transcripts from interventions between Behavioral Health Coaches and Caregivers of patients in palliative or hospice care. Covering a diverse range of conditions like depression, anxiety, and grief, this curated dataset is designed to facilitate the development and evaluation of large language models for conversational mental health assistance. By providing a high-quality resource tailored to this critical domain, MentalChat16K aims to advance research on empathetic, personalized AI solutions to improve access to mental health support services. The dataset prioritizes patient privacy, ethical considerations, and responsible data usage. MentalChat16K presents a valuable opportunity for the research community to innovate AI technologies that can positively impact mental well-being.
Leveraging Convolutional Neural Network-Transformer Synergy for Predictive Modeling in Risk-Based Applications
Wang, Yuhan, Xu, Zhen, Yao, Yue, Liu, Jinsong, Lin, Jiating
With the development of the financial industry, credit default prediction, as an important task in financial risk management, has received increasing attention. Traditional credit default prediction methods mostly rely on machine learning models, such as decision trees and random forests, but these methods have certain limitations in processing complex data and capturing potential risk patterns. To this end, this paper proposes a deep learning model based on the combination of convolutional neural networks (CNN) and Transformer for credit user default prediction. The model combines the advantages of CNN in local feature extraction with the ability of Transformer in global dependency modeling, effectively improving the accuracy and robustness of credit default prediction. Through experiments on public credit default datasets, the results show that the CNN+Transformer model outperforms traditional machine learning models, such as random forests and XGBoost, in multiple evaluation indicators such as accuracy, AUC, and KS value, demonstrating its powerful ability in complex financial data modeling. Further experimental analysis shows that appropriate optimizer selection and learning rate adjustment play a vital role in improving model performance. In addition, the ablation experiment of the model verifies the advantages of the combination of CNN and Transformer and proves the complementarity of the two in credit default prediction. This study provides a new idea for credit default prediction and provides strong support for risk assessment and intelligent decision-making in the financial field. Future research can further improve the prediction effect and generalization ability by introducing more unstructured data and improving the model architecture.
AI-Driven Health Monitoring of Distributed Computing Architecture: Insights from XGBoost and SHAP
Sun, Xiaoxuan, Yao, Yue, Wang, Xiaoye, Li, Pochun, Li, Xuan
With the rapid development of artificial intelligence technology, its application in the optimization of complex computer systems is becoming more and more extensive. Edge computing is an efficient distributed computing architecture, and the health status of its nodes directly affects the performance and reliability of the entire system. In view of the lack of accuracy and interpretability of traditional methods in node health status judgment, this paper proposes a health status judgment method based on XGBoost and combines the SHAP method to analyze the interpretability of the model. Through experiments, it is verified that XGBoost has superior performance in processing complex features and nonlinear data of edge computing nodes, especially in capturing the impact of key features (such as response time and power consumption) on node status. SHAP value analysis further reveals the global and local importance of features, so that the model not only has high precision discrimination ability but also can provide intuitive explanations, providing data support for system optimization. Research shows that the combination of AI technology and computer system optimization can not only realize the intelligent monitoring of the health status of edge computing nodes but also provide a scientific basis for dynamic optimization scheduling, resource management and anomaly detection. In the future, with the in-depth development of AI technology, model dynamics, cross-node collaborative optimization and multimodal data fusion will become the focus of research, providing important support for the intelligent evolution of edge computing systems.
Amazon's Alexa has been spreading FAKE news on everything from MPs' expenses to the origins of the Northern Lights, shocking report reveals
It's supposed to be the reliable smart assistant that'makes your life easier' with instant titbits of information. But a shocking report has revealed that in many cases, Amazon's Alexa doesn't know the difference between right and wrong. An investigation by Full Fact has found that Alexa spouts incorrect information on topics ranging from MPs' expenses to the origins of the Northern Lights. Full Fact, the UK's independent fact checking organisation, called the findings'misleading' and'clearly a big problem'. What's more, staff at the organization have been furious to discover that Alexa was attributing the wrong answers to none other than Full Fact.
Data-efficient and Interpretable Inverse Materials Design using a Disentangled Variational Autoencoder
Zeng, Cheng, Khan, Zulqarnain, Post, Nathan L.
Inverse materials design has proven successful in accelerating novel material discovery. Many inverse materials design methods use unsupervised learning where a latent space is learned to offer a compact description of materials representations. A latent space learned this way is likely to be entangled, in terms of the target property and other properties of the materials. This makes the inverse design process ambiguous. Here, we present a semi-supervised learning approach based on a disentangled variational autoencoder to learn a probabilistic relationship between features, latent variables and target properties. This approach is data efficient because it combines all labelled and unlabelled data in a coherent manner, and it uses expert-informed prior distributions to improve model robustness even with limited labelled data. It is in essence interpretable, as the learnable target property is disentangled out of the other properties of the materials, and an extra layer of interpretability can be provided by a post-hoc analysis of the classification head of the model. We demonstrate this new approach on an experimental high-entropy alloy dataset with chemical compositions as input and single-phase formation as the single target property. While single property is used in this work, the disentangled model can be extended to customize for inverse design of materials with multiple target properties.
Quantification of total uncertainty in the physics-informed reconstruction of CVSim-6 physiology
De Florio, Mario, Zou, Zongren, Schiavazzi, Daniele E., Karniadakis, George Em
When predicting physical phenomena through simulation, quantification of the total uncertainty due to multiple sources is as crucial as making sure the underlying numerical model is accurate. Possible sources include irreducible aleatoric uncertainty due to noise in the data, epistemic uncertainty induced by insufficient data or inadequate parameterization, and model-form uncertainty related to the use of misspecified model equations. Physics-based regularization interacts in nontrivial ways with aleatoric, epistemic and model-form uncertainty and their combination, and a better understanding of this interaction is needed to improve the predictive performance of physics-informed digital twins that operate under real conditions. With a specific focus on biological and physiological models, this study investigates the decomposition of total uncertainty in the estimation of states and parameters of a differential system simulated with MC X-TFC, a new physics-informed approach for uncertainty quantification based on random projections and Monte-Carlo sampling. MC X-TFC is applied to a six-compartment stiff ODE system, the CVSim-6 model, developed in the context of human physiology. The system is analyzed by progressively removing data while estimating an increasing number of parameters and by investigating total uncertainty under model-form misspecification of non-linear resistance in the pulmonary compartment. In particular, we focus on the interaction between the formulation of the discrepancy term and quantification of model-form uncertainty, and show how additional physics can help in the estimation process. The method demonstrates robustness and efficiency in estimating unknown states and parameters, even with limited, sparse, and noisy data. It also offers great flexibility in integrating data with physics for improved estimation, even in cases of model misspecification.
LLMs' Understanding of Natural Language Revealed
Large language models (LLMs) are the result of a massive experiment in bottom-up, data-driven reverse engineering of language at scale. Despite their utility in a number of downstream NLP tasks, ample research has shown that LLMs are incapable of performing reasoning in tasks that require quantification over and the manipulation of symbolic variables (e.g., planning and general problem solving) - see for example [25][26]. In this document, however, we will focus on testing LLMs for their language understanding capabilities, their supposed forte. In this regard we believe that we have not been testing the language understanding capabilities of large language models (LLMs) properly. Prompting LLMs and asking for responses will always look impressive because that's how LLMs were designed, i.e., to generate text. The proper method of testing the understanding capabilities of LLMs, we argue, is to prompt LLMs in reverse: give the LLM a snippet of text and query their understanding of the input text by asking the LLM questions against the input text. As we will show here the language understanding capabilities of LLMs have been widely exaggerated. By testing the understanding capabilities properly - i.e., by giving the LLM snippets of text as input and then querying what the LLM "understood" it will become apparent that LLMs do not truly understand language, beyond very superficial inferences that are essentially the byproduct of the memorization of massive amounts of ingested text.
Multilingual Fine-Grained News Headline Hallucination Detection
Shen, Jiaming, Liu, Tianqi, Liu, Jialu, Qin, Zhen, Pavagadhi, Jay, Baumgartner, Simon, Bendersky, Michael
The popularity of automated news headline generation has surged with advancements in pre-trained language models. However, these models often suffer from the ``hallucination'' problem, where the generated headline is not fully supported by its source article. Efforts to address this issue have predominantly focused on English, using over-simplistic classification schemes that overlook nuanced hallucination types. In this study, we introduce the first multilingual, fine-grained news headline hallucination detection dataset that contains over 11 thousand pairs in 5 languages, each annotated with detailed hallucination types by experts. We conduct extensive experiments on this dataset under two settings. First, we implement several supervised fine-tuning approaches as preparatory solutions and demonstrate this dataset's challenges and utilities. Second, we test various large language models' in-context learning abilities and propose two novel techniques, language-dependent demonstration selection and coarse-to-fine prompting, to boost the few-shot hallucination detection performance in terms of the example-F1 metric. We release this dataset to foster further research in multilingual, fine-grained headline hallucination detection.