Media
MTA: A Merge-then-Adapt Framework for Personalized Large Language Model
Li, Xiaopeng, Zheng, Yuanjin, Wang, Wanyu, Zhang, Wenlin, Jia, Pengyue, Wang, Yiqi, Wang, Maolin, Wei, Xuetao, Zhao, Xiangyu
Personalized Large Language Models (PLLMs) aim to align model outputs with individual user preferences, a crucial capability for user-centric applications. However, the prevalent approach of fine-tuning a separate module for each user faces two major limitations: (1) storage costs scale linearly with the number of users, rendering the method unscalable; and (2) fine-tuning a static model from scratch often yields suboptimal performance for users with sparse data. To address these challenges, we propose MTA, a Merge-then-Adapt framework for PLLMs. MTA comprises three key stages. First, we construct a shared Meta-LoRA Bank by selecting anchor users and pre-training meta-personalization traits within meta-LoRA modules. Second, to ensure scalability and enable dynamic personalization combination beyond static models, we introduce an Adaptive LoRA Fusion stage. This stage retrieves and dynamically merges the most relevant anchor meta-LoRAs to synthesize a user-specific one, thereby eliminating the need for user-specific storage and supporting more flexible personalization. Third, we propose a LoRA Stacking for Few-Shot Personalization stage, which applies an additional ultra-low-rank, lightweight LoRA module on top of the merged LoRA. Fine-tuning this module enables effective personalization under few-shot settings. Extensive experiments on the LaMP benchmark demonstrate that our approach outperforms existing SOTA methods across multiple tasks.
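The three stages described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the cosine-similarity retrieval, the softmax weighting, the weighted sum of low-rank updates, and every name below (`merge_anchor_loras`, `stack_few_shot_lora`, the bank layout) are assumptions made only for illustration. The abstract does not specify how anchors are retrieved or merged.

```python
import math

def matmul(a, b):
    # Plain-Python matrix product of two nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    # Element-wise sum of two equally shaped matrices.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    # Multiply every entry of a matrix by the scalar s.
    return [[x * s for x in row] for row in a]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def merge_anchor_loras(user_emb, bank, top_k=2):
    """Retrieve the top_k most similar anchors and merge their LoRA updates.

    bank is a list of (anchor_embedding, B, A) tuples, where B @ A is the
    low-rank weight update of one anchor's meta-LoRA (hypothetical layout).
    """
    ranked = sorted(bank, key=lambda e: cosine(user_emb, e[0]), reverse=True)[:top_k]
    weights = softmax([cosine(user_emb, e[0]) for e in ranked])
    rows, cols = len(ranked[0][1]), len(ranked[0][2][0])
    merged = [[0.0] * cols for _ in range(rows)]
    for w, (_, B, A) in zip(weights, ranked):
        merged = add(merged, scale(matmul(B, A), w))
    return merged, weights

def stack_few_shot_lora(merged_delta, b_small, a_small):
    # Add a very-low-rank update (here rank 1) on top of the merged update.
    # In a few-shot setting only b_small and a_small would be trained.
    return add(merged_delta, matmul(b_small, a_small))

if __name__ == "__main__":
    bank = [
        ([1.0, 0.0], [[1.0], [0.0]], [[1.0, 0.0]]),
        ([0.0, 1.0], [[0.0], [1.0]], [[0.0, 1.0]]),
        ([-1.0, 0.0], [[1.0], [1.0]], [[1.0, 1.0]]),
    ]
    merged, weights = merge_anchor_loras([1.0, 1.0], bank, top_k=2)
    print(weights)
    print(stack_few_shot_lora(merged, [[0.1], [0.0]], [[1.0, 1.0]]))
```

Because the user-specific update is synthesized from the shared bank at request time, only the bank (and, optionally, the small stacked module) needs to be stored, which is the scalability argument the abstract makes.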
Generative Adversarial Post-Training Mitigates Reward Hacking in Live Human-AI Music Interaction
Wu, Yusong, Brade, Stephen, Ma, Teng, Fowler, Tia-Jane, Yang, Enning, Banar, Berker, Courville, Aaron, Jaques, Natasha, Huang, Cheng-Zhi Anna
Most applications of generative AI involve a sequential interaction in which a person inputs a prompt and waits for a response, and where reaction time and adaptivity are not important factors. In contrast, live jamming is a collaborative interaction that requires real-time coordination and adaptation without access to the other player's future moves, while preserving diversity to sustain a creative flow. Reinforcement learning post-training enables effective adaptation through on-policy interaction, yet it often reduces output diversity by exploiting coherence-based rewards. This collapse, known as "reward hacking", affects many RL post-training pipelines, but is especially harmful in live jamming, where musical creativity relies on dynamic variation and mutual responsiveness. In this paper, we propose a novel adversarial training method on policy-generated trajectories to mitigate reward hacking in RL post-training for melody-to-chord accompaniment. A co-evolving discriminator separates policy trajectories from the data distribution, while the policy maximizes the discriminator output in addition to coherence rewards to prevent collapse to trivial outputs. We evaluate accompaniment quality and output diversity in simulation with both fixed test melodies and learned melody agents, and we conduct a user study with the model deployed in a real-time interactive system with expert musicians. Quantitative evaluation and user feedback demonstrate improved output diversity, harmonic coherence, adaptation speed and user agency. Our results demonstrate a simple yet effective method to mitigate reward hacking in RL post-training of generative sequence models. The combination of large-scale transformer-based models and reinforcement learning (RL) post-training has revolutionized AI, with over 1 billion people now using large language models (LLMs) trained with these techniques (OpenAI, 2025; Perez, 2025).
However, most applications of generative AI still involve a slow back-and-forth interaction, where the user inputs a request, and then waits--sometimes several minutes--for a response.
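The training signal described in the abstract, a coherence reward plus the output of a co-evolving discriminator, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's method: the additive combination with a weight `adv_weight`, the binary cross-entropy discriminator objective, and the function names are all hypothetical choices.

```python
import math

def combined_reward(coherence, disc_prob, adv_weight=1.0):
    # Policy reward: coherence term plus a weighted discriminator output.
    # disc_prob is the discriminator's estimate that a trajectory comes
    # from the data distribution rather than from the policy.
    return coherence + adv_weight * disc_prob

def discriminator_loss(data_probs, policy_probs):
    # Binary cross-entropy: push outputs toward 1 on data trajectories
    # and toward 0 on policy-generated trajectories.
    data_term = -sum(math.log(p) for p in data_probs) / len(data_probs)
    policy_term = -sum(math.log(1.0 - p) for p in policy_probs) / len(policy_probs)
    return data_term + policy_term

if __name__ == "__main__":
    # A repetitive but "coherent" trajectory that the discriminator flags
    # versus a slightly less coherent one that resembles the data.
    collapsed = combined_reward(coherence=0.9, disc_prob=0.1)
    varied = combined_reward(coherence=0.8, disc_prob=0.7)
    print(collapsed, varied)
```

With coherence alone the collapsed trajectory would score higher. Adding the discriminator term reverses the ranking, which is the intuition behind using it to counter reward hacking.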
Evaluating the Simulation of Human Personality-Driven Susceptibility to Misinformation with LLMs
Pratelli, Manuel, Petrocchi, Marinella
Large language models (LLMs) make it possible to generate synthetic behavioural data at scale, offering an ethical and low-cost alternative to human experiments. Whether such data can faithfully capture psychological differences driven by personality traits, however, remains an open question. We evaluate the capacity of LLM agents, conditioned on Big-Five profiles, to reproduce personality-based variation in susceptibility to misinformation, focusing on news discernment, the ability to judge true headlines as true and false headlines as false. Leveraging published datasets in which human participants with known personality profiles rated headline accuracy, we create matching LLM agents and compare their responses to the original human patterns. Certain trait-misinformation associations, notably those involving Agreeableness and Conscientiousness, are reliably replicated, whereas others diverge, revealing systematic biases in how LLMs internalize and express personality. The results underscore both the promise and the limits of personality-aligned LLMs for behavioral simulation, and offer new insight into modeling cognitive diversity in artificial agents.
ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision
Lee, Dosung, Oh, Wonjun, Kim, Boyoung, Kim, Minyoung, Park, Joonsuk, Seo, Paul Hongsuck
Multi-hop question answering (MHQA) involves reasoning across multiple documents to answer complex questions. Dense retrievers typically outperform sparse methods like BM25 by leveraging semantic embeddings; however, they require labeled query-document pairs for fine-tuning. This poses a significant challenge in MHQA due to the high variability of queries (reformulated questions) throughout the reasoning steps. To overcome this limitation, we introduce Retriever Supervision with Consistency and Relevance (ReSCORE), a novel method for training dense retrievers for MHQA without labeled documents. ReSCORE leverages large language models to capture each document's relevance to the question and consistency with the correct answer, and uses them to train a retriever within an iterative question-answering framework. Experiments on three MHQA benchmarks demonstrate the effectiveness of ReSCORE, with significant improvements in retrieval and, in turn, state-of-the-art MHQA performance. Our implementation is available at: https://leeds1219.github.io/ReSCORE.
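One way to picture label-free supervision of this kind is sketched below. It assumes, purely for illustration, that an LLM supplies a relevance score and a consistency score per document, that the two are multiplied and normalized into a pseudo-label distribution, and that the retriever is trained with a KL-divergence loss against it. The abstract does not state these details, and all function names are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_labels(relevance, consistency):
    # Combine per-document relevance (to the question) and consistency
    # (with the correct answer) into a normalized target distribution.
    joint = [r * c for r, c in zip(relevance, consistency)]
    total = sum(joint)
    return [j / total for j in joint]

def kl_divergence(target, predicted):
    # KL(target || predicted), skipping zero-probability targets.
    return sum(t * math.log(t / p) for t, p in zip(target, predicted) if t > 0)

def retriever_loss(retriever_scores, relevance, consistency):
    # Loss that pulls the retriever's distribution over candidate documents
    # toward the LLM-derived pseudo-labels.
    target = pseudo_labels(relevance, consistency)
    predicted = softmax(retriever_scores)
    return kl_divergence(target, predicted)

if __name__ == "__main__":
    relevance = [0.5, 0.5]     # both documents look on-topic
    consistency = [0.9, 0.1]   # only the first supports the answer
    print(pseudo_labels(relevance, consistency))
    print(retriever_loss([2.0, 0.0], relevance, consistency))
    print(retriever_loss([0.0, 2.0], relevance, consistency))
```

In this toy case the loss is lower when the retriever ranks the answer-consistent document first, so minimizing it steers the retriever without any human-labeled query-document pairs.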
Campbell's FIRES executive secretly recorded saying its soups are full of 'bioengineered meat' and made for 'poor people'
Campbell's Soup has fired the executive caught in a secret recording insulting customers and claiming the company's products were filled with bioengineered meat. Vice President and Chief Information Security Officer Martin Bally was originally placed on administrative leave after a lawsuit against Campbell's was filed last week and the audio recording was released.
In the audio, a speaker identified as Bally was heard saying: 'We have s**t for f***king poor people. It's not healthy now that I know what the f**'s in it.' The voice, alleged to be Bally, also claimed that the chicken used in the brand's soups 'came from a 3D printer.' Campbell's revealed on Wednesday that its investigation concluded that the voice on the secret recording was Bally and the executive was removed from the company on Tuesday.
Easter Island mystery is SOLVED: Scientists finally pinpoint who built the iconic stone heads 900 years ago
One of the biggest mysteries surrounding Easter Island may finally be solved - as scientists pinpoint who built the iconic stone heads over 900 years ago. In the past, researchers assumed that the 12 to 80-ton statues would have required the combined efforts of hundreds of labourers to build and move. However, new archaeological evidence shows that the statues, known as moai, were not carved by a single powerful chiefdom.
Instead, each moai was carved by a small clan or by an individual family, with as few as four to six people working on a single statue. Using a new 3D model of the island's main moai quarry, archaeologists identified 30 unique 'workshops' where the statues were produced.
L.A. grand jury now probing mystery of dead teen stuffed in trunk of D4vd's Tesla, sources say
D4vd (David Anthony Burke) performs at the Bonnaroo Music and Arts Festival in Manchester, Tennessee, in June 2024. A Los Angeles County grand jury is hearing evidence related to the death of a teenage girl whose body was discovered stuffed inside the trunk of singer D4vd's Tesla earlier this year, two law enforcement sources told The Times.
Galaxy NGC 2775 continues to baffle astronomers
The cosmic oddball that's 67 million light-years away has a puzzling shape. What does it look like in your mind? Chances are, it's a swirling circle of galactic energy. A galaxy is often classified as one of a few broadly defined shapes--elliptical, spiral, or lenticular--as described by the Hubble sequence.
Thieves are stealing keyless cars in minutes. Here's how to protect your vehicle
Cars are parked bumper to bumper in the Florence neighborhood on Nov. 18 in Los Angeles.