AITopics | Xu, Xuhai

Collaborating Authors

Xu, Xuhai

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data

Xu, Xuhai, Yao, Bingsheng, Dong, Yuanzhe, Gabriel, Saadia, Yu, Hong, Hendler, James, Ghassemi, Marzyeh, Dey, Anind K., Wang, Dakuo

arXiv.org Artificial IntelligenceJan-28-2024

Advances in large language models (LLMs) have empowered a variety of applications. However, there is still a significant gap in research when it comes to understanding and enhancing the capabilities of LLMs in the field of mental health. In this work, we present a comprehensive evaluation of multiple LLMs on various mental health prediction tasks via online text data, including Alpaca, Alpaca-LoRA, FLAN-T5, GPT-3.5, and GPT-4. We conduct a broad range of experiments, covering zero-shot prompting, few-shot prompting, and instruction fine-tuning. The results indicate a promising yet limited performance of LLMs with zero-shot and few-shot prompt designs for mental health tasks. More importantly, our experiments show that instruction finetuning can significantly boost the performance of LLMs for all tasks simultaneously. Our best-finetuned models, Mental-Alpaca and Mental-FLAN-T5, outperform the best prompt design of GPT-3.5 (25 and 15 times bigger) by 10.9% on balanced accuracy and the best of GPT-4 (250 and 150 times bigger) by 4.8%. They further perform on par with the state-of-the-art task-specific language model. We also conduct an exploratory case study on LLMs' capability on mental health reasoning tasks, illustrating the promising capability of certain models such as GPT-4. We summarize our findings into a set of action guidelines for potential methods to enhance LLMs' capability for mental health tasks. Meanwhile, we also emphasize the important limitations before achieving deployability in real-world mental health settings, such as known racial and gender bias. We highlight the important ethical risks accompanying this line of research.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3643540

2307.14385

Country:

Europe (1.00)
Asia (0.67)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data

Kim, Yubin, Xu, Xuhai, McDuff, Daniel, Breazeal, Cynthia, Park, Hae Won

arXiv.org Artificial IntelligenceJan-12-2024

Large language models (LLMs) are capable of many natural language tasks, yet they are far from perfect. In health applications, grounding and interpreting domain-specific and non-linguistic data is important. This paper investigates the capacity of LLMs to deliver multi-modal health predictions based on contextual information (e.g. user demographics, health knowledge) and physiological data (e.g. resting heart rate, sleep minutes). We present a comprehensive evaluation of eight state-of-the-art LLMs with diverse prompting and fine-tuning techniques on six public health datasets (PM-Data, LifeSnaps, GLOBEM, AW_FB, MIT-BIH & MIMIC-III). Our experiments cover thirteen consumer health prediction tasks in mental health, activity, metabolic, sleep, and cardiac assessment. Our fine-tuned model, Health-Alpaca exhibits comparable performance to larger models (GPT-3.5 and GPT-4), achieving the best performance in 5 out of 13 tasks. Ablation studies highlight the effectiveness of context enhancement strategies, and generalization capability of the fine-tuned models across training datasets and the size of training samples. Notably, we observe that our context enhancement can yield up to 23.8% improvement in performance. While constructing contextually rich prompts (combining user context, health knowledge and temporal information) exhibits synergistic improvement, the inclusion of health knowledge context in prompts significantly enhances overall performance.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2401.06866

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Consumer Health (1.00)
Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MindShift: Leveraging Large Language Models for Mental-States-Based Problematic Smartphone Use Intervention

Wu, Ruolan, Yu, Chun, Pan, Xiaole, Liu, Yujia, Zhang, Ningning, Fu, Yue, Wang, Yuhan, Zheng, Zhi, Chen, Li, Jiang, Qiaolei, Xu, Xuhai, Shi, Yuanchun

arXiv.org Artificial IntelligenceSep-28-2023

Problematic smartphone use negatively affects physical and mental health. Despite the wide range of prior research, existing persuasive techniques are not flexible enough to provide dynamic persuasion content based on users' physical contexts and mental states. We first conduct a Wizard-of-Oz study (N=12) and an interview study (N=10) to summarize the mental states behind problematic smartphone use: boredom, stress, and inertia. This informs our design of four persuasion strategies: understanding, comforting, evoking, and scaffolding habits. We leverage large language models (LLMs) to enable the automatic and dynamic generation of effective persuasion content. We develop MindShift, a novel LLM-powered problematic smartphone use intervention technique. MindShift takes users' in-the-moment physical contexts, mental states, app usage behaviors, users' goals & habits as input, and generates high-quality and flexible persuasive content with appropriate persuasion strategies. We conduct a 5-week field experiment (N=25) to compare MindShift with baseline techniques. The results show that MindShift significantly improves intervention acceptance rates by 17.8-22.5% and reduces smartphone use frequency by 12.1-14.4%. Moreover, users have a significant drop in smartphone addiction scale scores and a rise in self-efficacy. Our study sheds light on the potential of leveraging LLMs for context-aware persuasion in other behavior change domains.

large language model, mental-state-based problematic smartphone use intervention, natural language, (2 more...)

arXiv.org Artificial Intelligence

2309.16639

Genre: Research Report > New Finding (0.53)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Rethinking Human-AI Collaboration in Complex Medical Decision Making: A Case Study in Sepsis Diagnosis

Zhang, Shao, Yu, Jianing, Xu, Xuhai, Yin, Changchang, Lu, Yuxuan, Yao, Bingsheng, Tory, Melanie, Padilla, Lace M., Caterino, Jeffrey, Zhang, Ping, Wang, Dakuo

arXiv.org Artificial IntelligenceSep-17-2023

Today's AI systems for medical decision support often succeed on benchmark datasets in research papers but fail in real-world deployment. This work focuses on the decision making of sepsis, an acute life-threatening systematic infection that requires an early diagnosis with high uncertainty from the clinician. Our aim is to explore the design requirements for AI systems that can support clinical experts in making better decisions for the early diagnosis of sepsis. The study begins with a formative study investigating why clinical experts abandon an existing AI-powered Sepsis predictive module in their electrical health record (EHR) system. We argue that a human-centered AI system needs to support human experts in the intermediate stages of a medical decision-making process (e.g., generating hypotheses or gathering data), instead of focusing only on the final decision. Therefore, we build SepsisLab based on a state-of-the-art AI algorithm and extend it to predict the future projection of sepsis development, visualize the prediction uncertainty, and propose actionable suggestions (i.e., which additional laboratory tests can be collected) to reduce such uncertainty. Through heuristic evaluation with six clinicians using our prototype system, we demonstrate that SepsisLab enables a promising human-AI collaboration paradigm for the future of AI-assisted sepsis diagnosis and other high-stakes medical decision making.

artificial intelligence, complex medical decision, rethinking human-ai collaboration, (2 more...)

arXiv.org Artificial Intelligence

2309.12368

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Add feedback

XAIR: A Framework of Explainable AI in Augmented Reality

Xu, Xuhai, Yu, Mengjie, Jonker, Tanya R., Todi, Kashyap, Lu, Feiyu, Qian, Xun, Belo, João Marcelo Evangelista, Wang, Tianyi, Li, Michelle, Mun, Aran, Wu, Te-Yen, Shen, Junxiao, Zhang, Ting, Kokhlikyan, Narine, Wang, Fulton, Sorenson, Paul, Kim, Sophie Kahyun, Benko, Hrvoje

arXiv.org Artificial IntelligenceMar-28-2023

Explainable AI (XAI) has established itself as an important component of AI-driven interactive systems. With Augmented Reality (AR) becoming more integrated in daily lives, the role of XAI also becomes essential in AR because end-users will frequently interact with intelligent services. However, it is unclear how to design effective XAI experiences for AR. We propose XAIR, a design framework that addresses "when", "what", and "how" to provide explanations of AI output in AR. The framework was based on a multi-disciplinary literature review of XAI and HCI research, a large-scale survey probing 500+ end-users' preferences for AR-based explanations, and three workshops with 12 experts collecting their insights about XAI design in AR. XAIR's utility and effectiveness was verified via a study with 10 designers and another study with 12 end-users. XAIR can provide guidelines for designers, inspiring them to identify new design opportunities and achieve effective XAI designs in AR.

explanation, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3544548.3581500

2303.16292

Country:

Europe (1.00)
Asia (1.00)
North America > United States > New York (0.28)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Consumer Health (1.00)
Education (1.00)
(3 more...)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(5 more...)

Add feedback

Modeling the Trade-off of Privacy Preservation and Activity Recognition on Low-Resolution Images

Wang, Yuntao, Cheng, Zirui, Yi, Xin, Kong, Yan, Wang, Xueyang, Xu, Xuhai, Yan, Yukang, Yu, Chun, Patel, Shwetak, Shi, Yuanchun

arXiv.org Artificial IntelligenceMar-18-2023

A computer vision system using low-resolution image sensors can provide intelligent services (e.g., activity recognition) but preserve unnecessary visual privacy information from the hardware level. However, preserving visual privacy and enabling accurate machine recognition have adversarial needs on image resolution. Modeling the trade-off of privacy preservation and machine recognition performance can guide future privacy-preserving computer vision systems using low-resolution image sensors. In this paper, using the at-home activity of daily livings (ADLs) as the scenario, we first obtained the most important visual privacy features through a user survey. Then we quantified and analyzed the effects of image resolution on human and machine recognition performance in activity recognition and privacy awareness tasks. We also investigated how modern image super-resolution techniques influence these effects. Based on the results, we proposed a method for modeling the trade-off of privacy preservation and activity recognition on low-resolution images.

artificial intelligence, machine learning, resolution, (14 more...)

arXiv.org Artificial Intelligence

2303.10435

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.93)

Genre: Research Report > New Finding (0.93)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization

Xu, Xuhai, Zhang, Han, Sefidgar, Yasaman, Ren, Yiyi, Liu, Xin, Seo, Woosuk, Brown, Jennifer, Kuehn, Kevin, Merrill, Mike, Nurius, Paula, Patel, Shwetak, Althoff, Tim, Morris, Margaret E., Riskin, Eve, Mankoff, Jennifer, Dey, Anind K.

arXiv.org Artificial IntelligenceMar-4-2023

Recent research has demonstrated the capability of behavior signals captured by smartphones and wearables for longitudinal behavior modeling. However, there is a lack of a comprehensive public dataset that serves as an open testbed for fair comparison among algorithms. Moreover, prior studies mainly evaluate algorithms using data from a single population within a short period, without measuring the cross-dataset generalizability of these algorithms. We present the first multi-year passive sensing datasets, containing over 700 user-years and 497 unique users' data collected from mobile and wearable sensors, together with a wide range of well-being metrics. Our datasets can support multiple cross-dataset evaluations of behavior modeling algorithms' generalizability across different users and years. As a starting point, we provide the benchmark results of 18 algorithms on the task of depression detection. Our results indicate that both prior depression detection algorithms and domain generalization techniques show potential but need further research to achieve adequate cross-dataset generalizability. We envision our multi-year datasets can support the ML community in developing generalizable longitudinal behavior modeling algorithms. The GLOBEM website can be found at the-globem.github.io

data mining, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2211.02733

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Epidemiology (1.00)
Health & Medicine > Consumer Health (1.00)
Information Technology > Security & Privacy (0.87)
(4 more...)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Data Science > Data Quality (1.00)
Information Technology > Data Science > Data Mining (1.00)
(7 more...)

Add feedback