PsychAdapter: Adapting LLM Transformers to Reflect Traits, Personality and Mental Health
Vu, Huy, Nguyen, Huy Anh, Ganesan, Adithya V, Juhng, Swanie, Kjell, Oscar N. E., Sedoc, Joao, Kern, Margaret L., Boyd, Ryan L., Ungar, Lyle, Schwartz, H. Andrew, Eichstaedt, Johannes C.
Artificial intelligence-based language generators are now a part of most people's lives. However, by default, they tend to generate "average" language without reflecting the ways in which people differ. Here, we propose a lightweight modification to the standard language model transformer architecture - "PsychAdapter" - that uses empirically derived trait-language patterns to generate natural language for specified personality, demographic, and mental health characteristics (with or without prompting). We applied PsychAdapters to modify OpenAI's GPT-2, Google's Gemma, and Meta's Llama 3 and found generated text to reflect the desired traits. For example, expert raters evaluated PsychAdapter's generated text output and found it matched intended trait levels with 87.3% average accuracy for Big Five personalities, and 96.7% for depression and life satisfaction. PsychAdapter is a novel method to introduce psychological behavior patterns into language models at the foundation level, independent of prompting, by influencing every transformer layer. This approach can create chatbots with specific personality profiles, clinical training tools that mirror language associated with psychological conditions, and machine translations that match an author's reading or education level without taking up LLM context windows. PsychAdapter also allows for the exploration of psychological constructs through natural language expression, extending the natural language processing toolkit to study human psychology.
AI-Driven Day-to-Day Route Choice
Wang, Leizhen, Duan, Peibo, He, Zhengbing, Lyu, Cheng, Chen, Xin, Zheng, Nan, Yao, Li, Ma, Zhenliang
Understanding individual travel behaviors is critical for developing efficient and sustainable transportation systems. Travel behavioral analysis aims to capture the decision-making process of individual travel execution, including travel route choice, travel mode choice, departure time choice, and trip purpose. Among these choices, modeling route choice not only helps analyze and understand travelers' behaviors, but also constitutes the essential part of traffic assignment methods [1]. Specifically, it enables the evaluation of travelers' perceptions of route characteristics, the forecasting of behavior in hypothetical scenarios, the prediction of future traffic dynamics on transportation networks, and the understanding of travelers' responses to travel information. Real-world route choice is complex because of the inherent difficulties in accurately representing human behavior, travelers' limited knowledge of network composition, uncertainties in perceptions of route characteristics, and the lack of precise information about travelers' preferences [1]. To overcome these limitations, day-to-day (DTD) traffic dynamics have attracted significant attention since they focus on drivers' dynamic shifts in route choices and the evolution of traffic flow over time, rather than merely static equilibrium states. DTD models are flexible enough to incorporate diverse behavioral rules such as forecasting [2, 3], bounded rationality [4, 5], decision-making based on prospects [6, 7], marginal utility effects [8, 9], and social interactions [10]. Despite these advantages, identified in [11] and [12], DTD models still struggle to accurately reflect the observed fluctuations in traffic dynamics, particularly the persistent deviations around User Equilibrium (UE) noted in empirical studies [13, 14, 15]. To better understand traffic dynamics, Agent-Based Modeling (ABM) offers a promising alternative.
Monty Hall and Optimized Conformal Prediction to Improve Decision-Making with LLMs
Vishwakarma, Harit, Mishler, Alan, Cook, Thomas, Dalmasso, Niccolò, Raman, Natraj, Ganesh, Sumitra
Large language models (LLMs) are empowering decision-making in several applications, including tool or API usage and answering multiple-choice questions (MCQs). However, they often make overconfident, incorrect predictions, which can be risky in high-stakes settings like healthcare and finance. To mitigate these risks, recent works have used conformal prediction (CP), a model-agnostic framework for distribution-free uncertainty quantification. CP transforms a score function into prediction sets that contain the true answer with high probability. While CP provides this coverage guarantee for arbitrary scores, the score quality significantly impacts prediction set sizes. Prior works have relied on LLM logits or other heuristic scores, lacking quality guarantees. We address this limitation by introducing CP-OPT, an optimization framework to learn scores that minimize set sizes while maintaining coverage. Furthermore, inspired by the Monty Hall problem, we extend CP's utility beyond uncertainty quantification to improve accuracy. We propose conformal revision of questions (CROQ) to revise the problem by narrowing down the available choices to those in the prediction set. The coverage guarantee of CP ensures that the correct choice is in the revised question prompt with high probability, while the smaller number of choices increases the LLM's chances of answering it correctly. Experiments on MMLU, ToolAlpaca, and TruthfulQA datasets with Gemma-2, Llama-3 and Phi-3 models show that CP-OPT significantly reduces set sizes while maintaining coverage, and CROQ improves accuracy over the standard inference, especially when paired with CP-OPT scores. Together, CP-OPT and CROQ offer a robust framework for improving both the safety and accuracy of LLM-driven decision-making.
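The pipeline described in this abstract can be illustrated with a minimal sketch of split conformal prediction followed by a CROQ-style question revision. It assumes the heuristic score "1 minus the probability the model assigns to an option"; the learned CP-OPT score is not reproduced here. The function names, calibration scores, and example question are hypothetical and chosen only for illustration.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Conformal quantile of calibration nonconformity scores.

    cal_scores holds, for each calibration question, the nonconformity of
    the TRUE answer (assumed here: 1 - probability of the correct option).
    """
    n = len(cal_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points: keep every option
    return sorted(cal_scores)[rank - 1]

def prediction_set(option_probs, threshold):
    """Keep every option whose nonconformity (1 - p) is within the threshold."""
    return [opt for opt, p in option_probs.items() if 1 - p <= threshold]

def revise_question(question, options, kept):
    """CROQ-style revision: re-ask the question with only the kept options."""
    return "\n".join([question] + [f"{k}. {options[k]}" for k in kept])

# Hypothetical calibration scores and model probabilities.
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.75, 0.90]
threshold = conformal_threshold(cal_scores, alpha=0.25)

options = {"A": "Paris", "B": "Lyon", "C": "Nice", "D": "Lille"}
probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
kept = prediction_set(probs, threshold)
print(revise_question("What is the capital of France?", options, kept))
```

The revised prompt would then be sent back to the model. Under the usual exchangeability assumption, the correct option remains among the kept choices with probability at least 1 - alpha.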
Unrolled Creative Adversarial Network For Generating Novel Musical Pieces
Music generation has been established as a prominent topic in artificial intelligence and machine learning over recent years. In most recent works, RNN-based neural network methods have been applied for sequence generation. In contrast, generative adversarial networks (GANs) and their counterparts have been explored by very few researchers for music generation. In this paper, a classical system was employed alongside a new system to generate creative music. Both systems were designed based on adversarial networks to generate music by learning from examples. The classical system was trained to learn a set of music pieces without differentiating between classes, whereas the new system was trained to learn the different composers and their styles to generate a creative music piece by deviating from the learned composers' styles. The base structure utilized was generative adversarial networks (GANs), which are capable of generating novel outputs given a set of inputs to learn from and mimic their distribution. It has been shown in previous work that GANs are limited in their original design with respect to creative outputs. Building on the Creative Adversarial Networks (CAN), this work applied them in the music domain rather than the visual art domain. Additionally, unrolled CAN was introduced to prevent mode collapse. Experiments were conducted on both GAN and CAN for generating music, and their capabilities were measured in terms of deviation from the input set.
Warning as underwater volcano off US West Coast 'is primed to erupt' in 2025
Scientists have warned that an underwater volcano off the coast of the northwestern US is likely to blow sometime in 2025. The volcano, called Axial Seamount, is more than 3,600 feet tall and sits half a mile underwater just 300 miles off the coast of Oregon. Experts made the prediction on December 10 after detecting seafloor swelling around Axial that mimicked a level seen immediately before an eruption in 2015. Seismic activity has also increased, with hundreds of earthquakes generated around the volcano per day and earthquake swarms greater than 500 per day. 'Based on the current trends, and the assumption that Axial will be primed to erupt when it reaches the 2015 inflation threshold, our current eruption forecast window is between now (July 2024) and the end of 2025,' researchers said in the new study.
CES 2025: LG's bendable 5K OLED, Hyundai's holographic windshield display and other tech we're expecting to see in Las Vegas
You might have thrown out the wrapping paper and are readying the drinks for the countdown to the new year, but some of us are already getting ready for CES 2025. Shortly after New Year's Day, many from the Engadget team will be packing our bags to fly to Las Vegas, where we'll be covering tech's biggest annual conference. As usual, our inboxes are already flooded with pitches from companies that are planning to be there, and our calendars are filling up with appointments for briefings and demos. Based on our experience, as well as observation of recent industry trends, it's fairly easy to make educated predictions about what we might see in January. Over the years, the focus of the conference has spanned areas like TVs, cars, smart home products and personal health, with a smattering of laptops and accessories thrown in.
Hgformer: Hyperbolic Graph Transformer for Recommendation
Yang, Xin, Li, Xingrun, Chang, Heng, Yang, Jinze, Yang, Xihong, Tao, Shengyu, Chang, Ningkang, Shigeno, Maiko, Wang, Junfeng, Yin, Dawei, Min, Erxue
The cold start problem is a challenge faced by most modern recommender systems. By leveraging knowledge from other domains, cross-domain recommendation can be an effective method to alleviate the cold start problem. However, the modelling distortion for long-tail data, which is widely present in recommender systems, is often overlooked in cross-domain recommendation. In this research, we propose a hyperbolic-manifold-based cross-domain collaborative filtering model using BiTGCF as the base model. We introduce the hyperbolic manifold and construct new propagation and transfer layers to address these challenges. The significant performance improvements across various datasets compared to the baseline models demonstrate the effectiveness of our proposed model.
Are LLMs Really Not Knowledgable? Mining the Submerged Knowledge in LLMs' Memory
Tao, Xingjian, Wang, Yiwei, Cai, Yujun, Yang, Zhicheng, Tang, Jing
Large language models (LLMs) have shown promise as potential knowledge bases, yet they often struggle with question-answering tasks and are prone to hallucinations. While previous research attributes these issues to knowledge gaps in the model's parameters, our investigation reveals a different phenomenon: LLMs often retain correct knowledge even when generating incorrect answers. Through analysis of the model's internal representations, we find that correct answers frequently appear among high-probability tokens despite not being selected as final outputs. Based on this observation, we introduce Hits@k, a new metric to assess knowledge retention independent of expression accuracy. Our extensive experiments demonstrate that LLMs store significantly more knowledge than their QA performance suggests. Building on these findings, we develop SkipUnsure, a method to improve answer accuracy by leveraging detected but unexpressed knowledge. Experiments on both open-domain and specific-domain datasets show consistent improvements, with accuracy gains of up to 11.8% on DBPedia and 6.3% on IMDB, without requiring model retraining.
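One plausible reading of the Hits@k idea is sketched below: count a question as a hit when the correct answer is among the k most probable candidate tokens, whether or not it was the top choice. The exact definition used by the authors may differ; the function name `hits_at_k` and the toy probabilities are assumptions made for illustration.

```python
def hits_at_k(token_probs, gold, k):
    """Fraction of questions whose gold answer is among the k most probable tokens.

    token_probs: one dict per question mapping candidate token -> probability.
    gold: the correct answer token for each question.
    """
    hits = 0
    for probs, answer in zip(token_probs, gold):
        # Rank candidate tokens by probability, highest first, and keep k.
        top_k = sorted(probs, key=probs.get, reverse=True)[:k]
        hits += answer in top_k
    return hits / len(gold)

# Toy data (hypothetical): the model answers only the first question
# correctly, yet the right token sits at rank 2 or 3 for the other two.
token_probs = [
    {"Paris": 0.6, "Lyon": 0.3, "Nice": 0.1},
    {"Rome": 0.5, "Milan": 0.4, "Turin": 0.1},
    {"Oslo": 0.7, "Bergen": 0.2, "Tromso": 0.1},
]
gold = ["Paris", "Milan", "Tromso"]

for k in (1, 2, 3):
    print(k, hits_at_k(token_probs, gold, k))
```

With k = 1 the score equals ordinary top-1 accuracy, so the gap between Hits@1 and Hits@k for larger k gives a rough measure of knowledge that is present but not expressed.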
A Data-Centric Approach to Detecting and Mitigating Demographic Bias in Pediatric Mental Health Text: A Case Study in Anxiety Detection
Ive, Julia, Bondaronek, Paulina, Yadav, Vishal, Santel, Daniel, Glauser, Tracy, Cheng, Tina, Strawn, Jeffrey R., Agasthya, Greeshma, Tschida, Jordan, Choo, Sanghyun, Chandrashekar, Mayanka, Kapadia, Anuj J., Pestian, John
Introduction: Healthcare AI models often inherit biases from their training data. While efforts have primarily targeted bias in structured data, mental health heavily depends on unstructured data. This study aims to detect and mitigate linguistic differences related to non-biological differences in the training data of AI models designed to assist in pediatric mental health screening. Our objectives are: (1) to assess the presence of bias by evaluating outcome parity across sex subgroups, (2) to identify bias sources through textual distribution analysis, and (3) to develop a de-biasing method for mental health text data. Methods: We examined classification parity across demographic groups and assessed how gendered language influences model predictions. A data-centric de-biasing method was applied, focusing on neutralizing biased terms while retaining salient clinical information. This methodology was tested on a model for automatic anxiety detection in pediatric patients. Results: Our findings revealed a systematic under-diagnosis of female adolescent patients, with a 4% lower accuracy and a 9% higher False Negative Rate (FNR) compared to male patients, likely due to disparities in information density and linguistic differences in patient notes. Notes for male patients were on average 500 words longer, and linguistic similarity metrics indicated distinct word distributions between genders. Implementing our de-biasing approach reduced diagnostic bias by up to 27%, demonstrating its effectiveness in enhancing equity across demographic groups. Discussion: We developed a data-centric de-biasing framework to address gender-based content disparities within clinical text. By neutralizing biased language and enhancing focus on clinically essential information, our approach demonstrates an effective strategy for mitigating bias in AI healthcare models trained on text.
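The outcome-parity check in objective (1) can be sketched as a per-subgroup False Negative Rate comparison. The labels below are toy values, not the study's data, and the helper names are hypothetical.

```python
from collections import defaultdict

def false_negative_rate(y_true, y_pred):
    """FNR = missed positives / all true positives (NaN if there are none)."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not preds_for_positives:
        return float("nan")
    return preds_for_positives.count(0) / len(preds_for_positives)

def fnr_by_group(y_true, y_pred, groups):
    """Compute the FNR separately for each demographic subgroup."""
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)
    return {g: false_negative_rate(ts, ps) for g, (ts, ps) in by_group.items()}

# Toy labels (1 = anxiety present), purely illustrative.
y_true = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
groups = ["F"] * 5 + ["M"] * 5

rates = fnr_by_group(y_true, y_pred, groups)
gap = rates["F"] - rates["M"]
print(rates, gap)
```

A positive gap here means positive cases in group F are missed more often than in group M, which is the kind of disparity the abstract reports as under-diagnosis.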
Plug-and-Play Training Framework for Preference Optimization
Ma, Jingyuan, Li, Rui, Li, Zheng, Sha, Lei, Sui, Zhifang
Recently, preference optimization methods such as DPO have significantly enhanced large language models (LLMs) across a wide range of tasks, including dialogue and question-answering. However, current methods fail to account for the varying difficulty levels of training samples during preference optimization, leading to mediocre performance in tasks with high accuracy requirements, particularly in mathematical reasoning. To address this limitation, we propose a novel training framework, which employs multiple sampling to analyze output distributions, assign different weights to samples, and incorporate these weights into the preference optimization process. This plug-and-play approach enables LLMs to prioritize challenging examples during training, improving learning efficiency. Experimental results demonstrate that our framework integrates seamlessly with various preference optimization methods and achieves consistent improvements in mathematical reasoning tasks.
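The abstract does not state how the weights are computed, so the following is only a sketch under stated assumptions: a hypothetical difficulty weight equal to the fraction of sampled answers that were wrong (with a small floor), applied as a per-example multiplier on the standard DPO loss. The function names, the floor value, and the normalization by total weight are all illustrative choices, not the authors' method.

```python
import math

def difficulty_weight(num_correct, num_samples, floor=0.1):
    """Hypothetical weight: fewer correct samples means a harder example."""
    return max(floor, 1.0 - num_correct / num_samples)

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss, -log(sigmoid(margin)), for one preference pair.

    Arguments are sequence log-probabilities under the policy and the
    frozen reference model.
    """
    margin = beta * ((pol_chosen - ref_chosen) - (pol_rejected - ref_rejected))
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def weighted_dpo_loss(examples, beta=0.1):
    """Weight-normalized average of per-example DPO losses."""
    total_weight = sum(ex["weight"] for ex in examples)
    weighted = sum(
        ex["weight"]
        * dpo_loss(ex["pol_chosen"], ex["pol_rejected"],
                   ex["ref_chosen"], ex["ref_rejected"], beta)
        for ex in examples
    )
    return weighted / total_weight

# Two hypothetical pairs: one easy (7 of 8 samples correct), one hard (2 of 8).
examples = [
    {"weight": difficulty_weight(7, 8), "pol_chosen": -4.0, "pol_rejected": -9.0,
     "ref_chosen": -5.0, "ref_rejected": -8.0},
    {"weight": difficulty_weight(2, 8), "pol_chosen": -6.0, "pol_rejected": -6.5,
     "ref_chosen": -6.0, "ref_rejected": -6.0},
]
print(weighted_dpo_loss(examples))
```

Because the weight only rescales each pair's contribution, the same multiplier could in principle be attached to other pairwise preference losses, which is consistent with the plug-and-play framing in the abstract.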