MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations
We introduce MIRAGE, a new benchmark for multimodal expert-level reasoning and decision-making in consultative interaction settings. Designed for the agriculture domain, MIRAGE captures the full complexity of expert consultations by combining natural user queries, expert-authored responses, and image-based context, offering a high-fidelity benchmark for evaluating models on grounded reasoning, clarification strategies, and long-form generation in a real-world, knowledge-intensive domain. Grounded in over 35,000 real user-expert interactions and curated through a carefully designed multi-step pipeline, MIRAGE spans diverse crop health, pest diagnosis, and crop management scenarios. The benchmark includes more than 7,000 unique biological entities, covering plant species, pests, and diseases, making it one of the most taxonomically diverse benchmarks available for vision-language models, grounded in the real world. Unlike existing benchmarks that rely on well-specified user inputs and closed-set taxonomies, MIRAGE features underspecified, context-rich scenarios with open-world settings, requiring models to infer latent knowledge gaps, handle rare entities, and either proactively guide the interaction or respond. We evaluate more than 20 closed and open-source frontier vision-language models (VLMs), using an ensemble of reasoning language models as evaluators, highlighting the significant challenges posed by MIRAGE.
Bridging Expressivity and Scalability with Adaptive Unitary SSMs
Recent work has revealed that state space models (SSMs), while efficient for long-sequence processing, are fundamentally limited in their ability to represent formal languages--particularly due to time-invariant and real-valued recurrence structures. In this work, we draw inspiration from adaptive and structured dynamics observed in biological neural systems and introduce the Adaptive Unitary State Space Model (AUSSM): a novel class of SSMs that leverages skew-symmetric, input-dependent recurrence to achieve unitary evolution and high expressive power. Using algebraic automata theory, we prove that AUSSM can perform modulo counting and simulate solvable group automata at precision logarithmically bounded in the input length, enabling SSMs to model a broad class of regular languages out of reach for other SSM architectures. To overcome the practical inefficiencies of adaptive recurrence, we develop a separable convolution formulation and a CUDA implementation that enables scalable parallel training. Empirically, we show that AUSSM and its hybrid variant--interleaved with Mamba--outperform prior SSMs on formal algorithmic tasks such as parity and modular arithmetic, and achieve competent performance on real-world long time-series classification benchmarks. Our results demonstrate that adaptive unitary recurrence provides a powerful and efficient inductive bias for both symbolic and continuous sequence modeling.
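To make the link between skew-symmetric, input-dependent recurrence and modulo counting concrete, here is a minimal toy sketch. It is not the AUSSM architecture or its CUDA kernel: it only assumes a 2-dimensional real state and a hand-picked generator `[[0, -theta], [theta, 0]]` whose angle depends on the input symbol. The function names (`rotation_from_skew`, `step`, `count_mod`) are hypothetical and chosen for illustration. The exponential of that generator is a rotation, so the state norm is preserved and the accumulated angle encodes the count modulo `modulus`.

```python
import math


def rotation_from_skew(theta):
    """Closed-form exponential of the skew-symmetric matrix [[0, -theta], [theta, 0]].

    The result is a 2x2 rotation, i.e. an orthogonal (real unitary) matrix.
    """
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def step(state, bit, modulus):
    """One recurrence step; the rotation angle depends on the input symbol.

    Assumption for this toy: input 1 rotates by 2*pi/modulus, input 0 by zero.
    """
    theta = 2.0 * math.pi * bit / modulus
    (a, b), (c, d) = rotation_from_skew(theta)
    x, y = state
    return (a * x + b * y, c * x + d * y)


def count_mod(bits, modulus):
    """Count the ones in `bits` modulo `modulus` by reading off the state angle."""
    state = (1.0, 0.0)
    for bit in bits:
        state = step(state, bit, modulus)
    angle = math.atan2(state[1], state[0]) % (2.0 * math.pi)
    return round(angle * modulus / (2.0 * math.pi)) % modulus


def norm(state):
    """Euclidean norm of the state; rotations leave it unchanged."""
    return math.hypot(state[0], state[1])
```

With `modulus=2` this reduces to parity: `count_mod([1, 0, 1, 1], 2)` reads off an odd count. Because every step is a rotation, the state neither decays nor blows up over long inputs, which is the intuition behind using unitary evolution for counting tasks.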
Supercomputer predicts who will win the World Cup - and which footballer will claim the Golden Boot
Smart street sensors could be watching your city next
Many LLMs Are More Utilitarian Than One
Moral judgment is integral to large language models' (LLMs) social reasoning. As multi-agent systems gain prominence, it becomes crucial to understand how LLMs function when collaborating compared to operating as individual agents. In human moral judgment, group deliberation leads to a Utilitarian Boost: a tendency to endorse norm violations that inflict harm but maximize benefits for the greatest number of people. We study whether a similar dynamic emerges in multi-agent LLM systems. We test six models on well-established sets of moral dilemmas across two conditions: (1) Solo, where models reason independently, and (2) Group, where they engage in multi-turn discussions in pairs or triads.
Alchemist: Turning Public Text-to-Image Data into Generative Gold
Pre-training equips text-to-image (T2I) models with broad world knowledge, but this alone is often insufficient to achieve high aesthetic quality and alignment. Consequently, supervised fine-tuning (SFT) is crucial for further refinement. However, its effectiveness highly depends on the quality of the fine-tuning dataset. Existing public SFT datasets frequently target narrow domains (e.g., anime or specific art styles), and the creation of high-quality, general-purpose SFT datasets remains a significant challenge. Current curation methods are often costly and struggle to identify truly impactful samples.
DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning
Sparse-reward reinforcement learning (RL) can model a wide range of highly complex tasks. Solving sparse-reward tasks is RL's core premise--requiring efficient exploration coupled with long-horizon credit assignment--and overcoming these challenges is key for building self-improving agents with superhuman ability. Prior work commonly explores with the objective of solving many sparse-reward tasks, making exploration of individual high-dimensional, long-horizon tasks intractable. We argue that solving such challenging tasks requires solving simpler tasks that are relevant to the target task, i.e., whose achievement will teach the agent skills required for solving the target task. We demonstrate that this sense of direction, necessary for effective exploration, can be extracted from existing RL algorithms, without leveraging any prior information. To this end, we propose a method for directed sparse-reward goal-conditioned very long-horizon RL (DISCOVER), which selects exploratory goals in the direction of the target task. We connect DISCOVER to principled exploration in bandits, formally bounding the time until the target task becomes achievable in terms of the agent's initial distance to the target, but independent of the volume of the space of all tasks. We then perform a thorough evaluation in high-dimensional environments. We find that the directed goal selection of DISCOVER solves exploration problems that are beyond the reach of prior state-of-the-art exploration methods in RL.
'Positive' or 'unnecessary'? - UK teens on social media ban
Schoolchildren in Preston and Manchester had mixed feelings about a proposed social media ban for under-16s following an announcement from Prime Minister Sir Keir Starmer. On Monday, Starmer said under-16s will be banned from social media platforms such as Snapchat, TikTok, YouTube, Instagram, Facebook and X by spring 2027. Speaking to the BBC, some pupils described the ban as unnecessary and called for parents to take more responsibility. One student said she hoped the ban would have a positive impact on young people's lives and their mental health.
OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking
Although significant progress has been made in audio-driven talking head generation, text-driven methods remain underexplored. In this work, we present OmniTalker, a unified framework that jointly generates synchronized talking audio-video content from input text while emulating the target identity's speaking and facial movement styles, including speech characteristics, head motion, and facial dynamics. Our framework adopts a dual-branch diffusion transformer (DiT) architecture, with one branch dedicated to audio generation and the other to video synthesis. At the shallow layers, cross-modal fusion modules are introduced to integrate information between the two modalities. In deeper layers, each modality is processed independently, with the generated audio decoded by a vocoder and the video rendered using a GAN-based high-quality visual renderer. Leveraging DiT's in-context learning capability through a masked-infilling strategy, our model can simultaneously capture both audio and visual styles without requiring explicit style extraction modules. Thanks to the efficiency of the DiT backbone and the optimized visual renderer, OmniTalker achieves real-time inference at 25 FPS. To the best of our knowledge, OmniTalker is the first one-shot framework capable of jointly modeling speech and facial styles in real time. Extensive experiments demonstrate its superiority over existing methods in terms of generation quality, particularly in preserving style consistency and ensuring precise audio-video synchronization, all while maintaining efficient inference.