Media
Beyond One-Way Influence: Bidirectional Opinion Dynamics in Multi-Turn Human-LLM Interactions
Jiang, Yuyang, Guo, Longjie, Wu, Yuchen, Caliskan, Aylin, Mitra, Tanu, Shen, Hua
Large language model (LLM)-powered chatbots are increasingly used for opinion exploration. Prior research examined how LLMs alter user views, yet little work extended beyond one-way influence to address how user input can affect LLM responses and how such bi-directional influence manifests throughout the multi-turn conversations. This study investigates this dynamic through 50 controversial-topic discussions with participants (N=266) across three conditions: static statements, standard chatbot, and personalized chatbot. Results show that human opinions barely shifted, while LLM outputs changed more substantially, narrowing the gap between human and LLM stance. Personalization amplified these shifts in both directions compared to the standard setting. Analysis of multi-turn conversations further revealed that exchanges involving participants' personal stories were most likely to trigger stance changes for both humans and LLMs. Our work highlights the risk of over-alignment in human-LLM interaction and the need for careful design of personalized chatbots to more thoughtfully and stably align with users.
Improving Transfer Learning for Sequence Labeling Tasks by Adapting Pre-trained Neural Language Models
This doctoral thesis improves the transfer learning for sequence labeling tasks by adapting pre-trained neural language models. The proposed improvements in transfer learning involve introducing a multi-task model that incorporates an additional signal, a method based on architectural modifications in autoregressive large language models, and a sequence labeling framework for autoregressive large language models utilizing supervised in-context fine-tuning combined with response-oriented adaptation strategies. The first improvement is given in the context of domain transfer for the event trigger detection task. The domain transfer of the event trigger detection task can be improved by incorporating an additional signal obtained from a domain-independent text processing system into a multi-task model. The second improvement involves modifying the model's architecture. For that purpose, a method is proposed to enable bidirectional information flow across layers of autoregressive large language models. The third improvement utilizes autoregressive large language models as text generators through a generative supervised in-context fine-tuning framework. The proposed model, method, and framework demonstrate that pre-trained neural language models achieve their best performance on sequence labeling tasks when adapted through targeted transfer learning paradigms.
Quantifying Feature Importance for Online Content Moderation
Tessa, Benedetta, Moreo, Alejandro, Cresci, Stefano, Fagni, Tiziano, Sebastiani, Fabrizio
Accurately estimating how users respond to moderation interventions is paramount for developing effective and user-centred moderation strategies. However, this requires a clear understanding of which user characteristics are associated with different behavioural responses, which is the goal of this work. We investigate the informativeness of 753 socio-behavioural, linguistic, relational, and psychological features, in predicting the behavioural changes of 16.8K users affected by a major moderation intervention on Reddit. To reach this goal, we frame the problem in terms of "quantification", a task well-suited to estimating shifts in aggregate user behaviour. We then apply a greedy feature selection strategy with the double goal of (i) identifying the features that are most predictive of changes in user activity, toxicity, and participation diversity, and (ii) estimating their importance. Our results allow identifying a small set of features that are consistently informative across all tasks, and determining that many others are either task-specific or of limited utility altogether. We also find that predictive performance varies according to the task, with changes in activity and toxicity being easier to estimate than changes in diversity. Overall, our results pave the way for the development of accurate systems that predict user reactions to moderation interventions. Furthermore, our findings highlight the complexity of post-moderation user behaviour, and indicate that effective moderation should be tailored not only to user traits but also to the specific objective of the intervention.
CourtGuard: A Local, Multiagent Prompt Injection Classifier
As large language models (LLMs) become integrated into various sensitive applications, prompt injection, the use of prompting to induce harmful behaviors from LLMs, poses an ever increasing risk. Prompt injection attacks can cause LLMs to leak sensitive data, spread misinformation, and exhibit harmful behaviors. To defend against these attacks, we propose CourtGuard, a locally-runnable, multiagent prompt injection classifier. In it, prompts are evaluated in a court-like multiagent LLM system, where a "defense attorney" model argues the prompt is benign, a "prosecution attorney" model argues the prompt is a prompt injection, and a "judge" model gives the final classification. CourtGuard has a lower false positive rate than the Direct Detector, an LLM as-a-judge. However, CourtGuard is generally a worse prompt injection detector. Nevertheless, this lower false positive rate highlights the importance of considering both adversarial and benign scenarios for the classification of a prompt. Additionally, the relative performance of CourtGuard in comparison to other prompt injection classifiers advances the use of multiagent systems as a defense against prompt injection attacks. The implementations of CourtGuard and the Direct Detector with full prompts for Gemma-3-12b-it, Llama-3.3-8B, and Phi-4-mini-instruct are available at https://github.com/isaacwu2000/CourtGuard.
The Man Who Makes AI Slop by Hand
Chinese creator Tianran Mu went viral for mimicking the eerie, unsettling aesthetic of AI videos, but his work is 100 percent human. Our fellow terminally online readers probably have seen this video, which originated on Chinese social media . In it, two guys who look at first like they are about to get into a fistfight suddenly break out into a romantic, yet slightly robotic tango dance routine. The next second, they pull a wine glass and a bowl of noodles out of nowhere. It looks like it's generated by AI, but it isn't.
Russian drone kills two Ukrainian journalists on Donetsk eastern front line
How much of Europe's oil still comes from Russia? A Russian drone has killed two Ukrainian journalists and wounded another in the eastern Ukrainian city of Kramatorsk, according to their outlet and the regional governor of the Donetsk region. Freedom Media, a state-funded news organisation, said on Thursday that Olena Gramova, 43, and Yevgen Karmazin, 33, had been killed by a Russian Lancet drone while in their car at a petrol station in the industrial city. Another reporter, Alexander Kolychev, was hospitalised after the attack. Freedom Media said that Gramova, a native of Yenakiieve in the Donetsk region, had originally trained as a "finance specialist", but turned to journalism in 2014, the year when Russia annexed Ukraine's Crimean peninsula, and started arming a separatist movement in Donetsk and Luhansk in the Donbas.
Amazon's delivery drivers will be forced to wear AI GLASSES that give them turn-by-turn directions to shave seconds off deliveries
Tearful Kim Kardashian, 45, reveals doctors found brain aneurysm after MRI... as she blames stressful Kanye West divorce As royal insiders dish the dirt, this is what I'm told is the truth about Prince Andrew's daughters This is the exact plan I followed to supercharge my weight loss... and the surprising jab side-effect that cured me of my REAL problem: SUSAN ANDERSON Finance guru storms out of podcast with illegal migrants $420K in debt who insist they'deserve' new car and pool Dakota Johnson reveals her biggest'red flag' in men after Chris Martin split'Gaslighting' and'black out' fights: Kristen Bell and Dax Shepard's'volatile' marriage laid bare by insiders The secret calls and frantic meetings over Congressman's alleged affair with aide who set herself on fire in scandal that could upend Trump's future Pete Hegseth dealt another blow as judge shoots down effort to rebrand Pentagon with'warrior ethos' There's a taboo most men find repulsive... but if they can handle it, says JANA HOCKING, it's the biggest turn on ever The real reason behind Cracker Barrel's disastrous logo change... and it makes complete sense Astonishing new video shows Louvre robbers escaping in a mechanical delivery basket with £76m-worth of jewels - after evading CCTV that was'pointing the wrong way' Elon Musk's ex Grimes baffles fans with bizarre circular face tattoo as they insist inking looks like RINGWORM Putin ally accuses Trump of an'act of war' against Russia after US president imposed new oil sanctions French girl Lola, 12, who was'raped and murdered by Algerian woman' begged'please don't hurt me' before she was brutally killed, court hears Dave Grohl on'thin ice' with wife Jordyn Blum as insiders reveal her strict list of rules to save their marriage... and his plans for daughters to build relationship with his love child Amazon's delivery drivers will be forced to wear AI GLASSES that give them turn-by-turn directions to shave seconds off deliveries READ MORE: Amazon workers claim'kill switch' triggered massive outage In a bid to shave seconds off deliveries, Amazon will soon force its delivery drivers to wear smart glasses. The futuristic glasses use artificial intelligence ( AI) to feed drivers turn-by-turn directions leading up to customers' doorsteps. They're also fitted with cameras so drivers can scan packages and capture proof of delivery. Amazon claims the dystopian device will make deliveries'as safe and seamless as possible'. However, it seems not everyone agrees.