interview
Learning the Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models
For an LLM to correctly respond to an instruction it must understand both the semantics and the domain (i.e., subject area) of a given task-instruction pair. However, syntax can also convey implicit information. Recent work shows that syntactic templates--frequent sequences of Part-of-Speech (PoS) tags--are prevalent in training data and often appear in model outputs. In this work we characterize syntactic templates, domain, and semantics in task-instruction pairs. We identify cases of spurious correlations between syntax and domain, where models learn to associate a domain with syntax during training; this can sometimes override prompt semantics.
Appendix
The DeceptionBench is designed as a research benchmark to systematically study deception behaviors in LLMs, fostering a deeper understanding of their decision-making processes in real-world scenarios. Our primary intent is to provide a standardized, transparent tool for the research community to evaluate and improve LLMs' ethical alignment, not to enable or encourage deceptive practices. To prevent potential misuse by malicious actors, we commit to publicly releasing all evaluation data under an open license. This transparency ensures that DeceptionBench's methodology and outcomes are subject to scrutiny, replication, and improvement by the research community, reducing the risk of hidden exploitation. By prioritizing openness, we aim to advance responsible AI development while safeguarding against misuse in harmful contexts. The field of Large Language Models (LLMs) has undergone remarkable evolution in recent years, reshaping the landscape of natural language processing.
Benchmark
Despite the remarkable advances of Large Language Models (LLMs) across diverse cognitive tasks, the rapid enhancement of these capabilities also introduces emergent deception behaviors that may induce severe risks in high-stakes deployments. More critically, the characterization of deception across realistic scenarios remains underexplored. To bridge this gap, we establish DeceptionBench, the first benchmark that systematically evaluates how deceptive tendencies manifest across different societal domains, what their intrinsic behavioral patterns are, and how extrinsic factors affect them. Specifically, for static coverage, the benchmark encompasses 150 meticulously designed scenarios in five domains, i.e., Economy, Healthcare, Education, Social Interaction, and Entertainment, with over 1,000 samples, providing sufficient empirical foundations for deception analysis. On the intrinsic dimension, we explore whether models exhibit self-interested egoistic tendencies or sycophantic behaviors that prioritize user appeasement. On the extrinsic dimension, we investigate how contextual factors modulate deceptive outputs under neutral conditions, reward-based incentivization, and coercive pressures. Moreover, we incorporate sustained multi-turn interaction loops to construct a more realistic simulation of real-world feedback dynamics. Extensive experiments across LLMs and Large Reasoning Models (LRMs) reveal critical vulnerabilities, particularly amplified deception under reinforcement dynamics, demonstrating that current models lack robust resistance to manipulative contextual cues and underscoring the urgent need for advanced safeguards against various deception behaviors.
AIhub monthly digest: June 2026 – biodiversity, resource allocation, and color metaphors
Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we found out how foundation models are being used for conservation efforts, how AI can help with scarce resource allocation, and how color metaphors and LLMs can teach us about human cognition. We also went to ICRA and captured some footage of cutting-edge robots. In this latest interview in our AAAI Fellow series, we found out about Tanya Berger-Wolf's research developing a foundation model for biology, the insights this model can provide for conservation and protecting ecosystems, interesting collaborations over the years, and what the future has in store. In this interview, we chat to Sanmay Das, who was elected as a Fellow "for development of multiagent interaction mechanisms and learning techniques in the public interest, and for leadership service to the profession".
AAAI presidential panel – AI agents
The Future of AI Research report, published in March 2025, aims to clearly identify the trajectory of AI research in a structured way. The report was led by outgoing AAAI President Francesca Rossi and covers 17 different AI topics. Members of the report team, and other selected AI practitioners, are taking part in a series of video panel discussions covering selected chapters from the report. In the fifth discussion in the collection, the three panellists tackle the topic of AI agents. They discuss how multi-agent systems evolved from rule-based systems to complex cooperative frameworks built on generative AI, and what is really different in the modern notion of an agentic AI system.
Flood of AI 'garbage' is pushing open-source developers to the limit
A viral cartoon about open-source software shows a teetering pile of boxes labelled "all modern digital infrastructure" and one tiny box right at the bottom, propping up the whole lot: "a project some random person in Nebraska has been thanklessly maintaining since 2003". That's the reality of open source: every website, application and operating system relies on it. Modern society couldn't function without it, and yet it's written by volunteers in their spare time. But the growing burden caused by a flood of AI-generated code is causing many to burn out and leave the community altogether, threatening the future of open-source software. AI models are making it easier and easier to generate code to build new features, fix bugs or create entire new projects at the click of a button.
AIhub monthly digest: May 2026 – AI for science, the lottery ticket hypothesis, and world models
Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we learn about AI for science, delve into world models, research transparent and trustworthy AI, and hear about the lottery ticket hypothesis. The latest interview in our series with the AAAI/SIGAI Doctoral Consortium participants featured Ximing Wen who is researching transparent and trustworthy AI systems. We found out more about her work, her experience as a research intern, and what inspired her to study AI. In this wide-ranging conversation, Jonathan Frankle delves into empiricism versus theoretical proofs, how the approach to computer science has changed (even if the fundamental problems haven't), how younger researchers are rapidly adapting to a world that values impact above all else, and what it means to be a researcher.
Burnham accuses Blair of ignoring inequality as he hits back at ex-PM
Andy Burnham has accused former Labour Prime Minister Sir Tony Blair of failing to understand what's going on in people's lives and underestimating the impact of inequality. Sir Tony used a 5,600-word essay to argue the Labour government had no coherent plan for the country and had introduced policies that had held back business. He urged Labour not to move to the left and to embrace the radical centre instead. But Burnham, who is widely expected to challenge Sir Keir Starmer for the Labour leadership if he wins a by-election next month, told the Observer Sir Tony doesn't mention inequality once in his critique of where the Labour government has gone wrong. "If you don't get how that's driving politics now, if you are not rooting your analysis in the fact that people are unable to live and that things that were taken for granted are no longer affordable, then you are not understanding what's going on," said the mayor of Greater Manchester.
US college graduates face harsh job market amid economic uncertainty
Like clockwork each May, soon-to-be college graduates drift into New York City's Washington Square Park in caps and gowns, typically in purple, the school colour of nearby New York University. A sea of mostly 20-somethings gathers for photographs that mark the moment when the predictability of collegiate life comes to a close and new graduates face the uncertainty of what's next. Julie Patel, who just finished a master's degree in public health, was one of those graduates. But a tight job market has dampened the joy of the graduation ceremony. Like millions of her peers around the country, she is headed into a precarious job market amid a surge in economic uncertainty driven by a range of factors, including tariffs, the proliferation of artificial intelligence, global conflicts and, in her case, government funding cuts in her industry, all of which are slowing hiring, especially of new graduates.
Another LIV golfer remains committed to staying put: 'I have full faith in the future of LIV'
Thomas Detry says players 'really love it' and calls on the entire roster to show cohesion and support. Greg Palkot breaks down the announcement that Saudi Arabia's Public Investment Fund will cease funding for the LIV Golf tour, putting its future in jeopardy. LIV Golf now seeks new investors while players attempt to rejoin the PGA Tour. Out of seemingly nowhere, the future of the LIV Golf Tour has been put in serious jeopardy. The breakaway golf tour previously relied on funding from the Saudi Arabia-backed Public Investment Fund to back extremely high purses and bring in top players with massive signing bonuses. But that funding is coming to an end after the 2026 season, throwing all of that progress into jeopardy.