Law
The land use-climate change-biodiversity nexus in European islands stakeholders
Moustakas, Aristides, Christoforidi, Irene, Zittis, George, Demirel, Nazli, Fois, Mauro, Zotos, Savvas, Gallou, Eirini, Stamatiadou, Valentini, Tzirkalli, Elli, Zoumides, Christos, Košić, Kristina, Christopoulou, Aikaterini, Dragin, Aleksandra, Łowicki, Damian, Gil, Artur, Almeida, Bruna, Chrysos, Panos, Balzan, Mario V., Mansoldo, Mark D. C., Ólafsdóttir, Rannveig, Ayhan, Cigdem Kaptan, Atay, Lutfi, Tase, Mirela, Stojanović, Vladimir, Ladičorbić, Maja Mijatov, Díaz, Juan Pedro, Expósito, Francisco Javier, Quiroga, Sonia, Cano, Miguel Ángel Casquet, Wang, Haoran, Suárez, Cristina, Manolaki, Paraskevi, Vogiatzakis, Ioannis N.
To promote climate adaptation and mitigation, it is crucial to understand stakeholder perspectives and knowledge gaps on land use and climate changes. Stakeholders across 21 European islands were consulted on climate and land use change issues affecting ecosystem services. Climate change perceptions included temperature, precipitation, humidity, extremes, and wind. Land use change perceptions included deforestation, coastal degradation, habitat protection, renewable energy facilities, wetlands, and others. Additional concerns such as invasive species, water or energy scarcity, infrastructure problems, and austerity were also considered. Climate and land use change impact perceptions were analysed with machine learning to quantify their influence. The predominant climatic characteristic is temperature, and the predominant land use characteristic is deforestation. Water-related problems are top priorities for stakeholders. Energy-related problems, including energy deficiency and issues with wind and solar facilities, rank high as combined climate and land use risks. Stakeholders generally perceive climate change impacts on ecosystem services as negative, with natural habitat destruction and biodiversity loss identified as top issues. Land use change impacts are also negative but more complex, with more explanatory variables. Stakeholders share common perceptions on biodiversity impacts despite geographic disparity, but they differentiate between climate and land use impacts. Water, energy, and renewable energy issues pose serious concerns, requiring management measures.
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!
Lei, Jingdi, Gumma, Varun, Bhardwaj, Rishabh, Lim, Seok Min, Li, Chuan, Zadeh, Amir, Poria, Soujanya
Large Language Model (LLM) safety is one of the most pressing challenges for enabling wide-scale deployment. While most studies and global discussions focus on generic harms, such as models assisting users in harming themselves or others, enterprises face a more fundamental concern: whether LLM-based agents are safe for their intended use case. To address this, we introduce operational safety, defined as an LLM's ability to appropriately accept or refuse user queries when tasked with a specific purpose. We further propose OffTopicEval, an evaluation suite and benchmark for measuring operational safety both in general and within specific agentic use cases. Our evaluations on six model families comprising 20 open-weight LLMs reveal that while performance varies across models, all of them remain highly operationally unsafe. Even the strongest models - Qwen-3 (235B) with 77.77% and Mistral (24B) with 79.96% - fall far short of reliable operational safety, while GPT models plateau in the 62-73% range, Phi achieves only mid-level scores (48-70%), and Gemma and Llama-3 collapse to 39.53% and 23.84%, respectively. While operational safety is a core model alignment issue, to suppress these failures, we propose prompt-based steering methods: query grounding (Q-ground) and system-prompt grounding (P-ground), which substantially improve OOD refusal. Q-ground provides consistent gains of up to 23%, while P-ground delivers even larger boosts, raising Llama-3.3 (70B) by 41% and Qwen-3 (30B) by 27%. These results highlight both the urgent need for operational safety interventions and the promise of prompt-based steering as a first step toward more reliable LLM-based agents.
Accuracy Law for the Future of Deep Time Series Forecasting
Wang, Yuxuan, Wu, Haixu, Ma, Yuezhou, Fang, Yuchen, Zhang, Ziyi, Liu, Yong, Wang, Shiyu, Ye, Zhou, Xiang, Yang, Wang, Jianmin, Long, Mingsheng
Deep time series forecasting has emerged as a booming direction in recent years. Despite the exponential growth of community interests, researchers are sometimes confused about the direction of their efforts due to minor improvements on standard benchmarks. In this paper, we notice that, unlike image recognition, whose well-acknowledged and realizable goal is 100% accuracy, time series forecasting inherently faces a non-zero error lower bound due to its partially observable and uncertain nature. To pinpoint the research objective and release researchers from saturated tasks, this paper focuses on a fundamental question: how to estimate the performance upper bound of deep time series forecasting? Going beyond classical series-wise predictability metrics, e.g., ADF test, we realize that the forecasting performance is highly related to window-wise properties because of the sequence-to-sequence forecasting paradigm of deep time series models. Based on rigorous statistical tests of over 2,800 newly trained deep forecasters, we discover a significant exponential relationship between the minimum forecasting error of deep models and the complexity of window-wise series patterns, which is termed the accuracy law. The proposed accuracy law successfully guides us to identify saturated tasks from widely used benchmarks and derives an effective training strategy for large time series models, offering valuable insights for future research. Despite these advancements, we notice that the latest proposed models have shown minor improvements on existing widely used benchmarks. As presented in Figure 1, the improvement in the performance of deep time series models on four standard benchmarks has slowed significantly over the past three years. For instance, on the ETT benchmark (Zhou et al., 2021), the relative forecasting performance improvements exhibited a continuous downward trend from 2022 to 2025, with values of 14.98%, 7.77%, 3.93%, and 3.51% respectively.
ARMs: Adaptive Red-Teaming Agent against Multimodal Models with Plug-and-Play Attacks
Chen, Zhaorun, Liu, Xun, Kang, Mintong, Zhang, Jiawei, Pan, Minzhou, Yang, Shuang, Li, Bo
As vision-language models (VLMs) gain prominence, their multimodal interfaces also introduce new safety vulnerabilities, making the safety evaluation challenging and critical. Existing red-teaming efforts are either restricted to a narrow set of adversarial patterns or depend heavily on manual engineering, lacking scalable exploration of emerging real-world VLM vulnerabilities. To bridge this gap, we propose ARMs, an adaptive red-teaming agent that systematically conducts comprehensive risk assessments for VLMs. Given a target harmful behavior or risk definition, ARMs automatically optimizes diverse red-teaming strategies with reasoning-enhanced multi-step orchestration, to effectively elicit harmful outputs from target VLMs. We propose 11 novel multimodal attack strategies, covering diverse adversarial patterns of VLMs (e.g., reasoning hijacking, contextual cloaking), and integrate 17 red-teaming algorithms into ARMs via model context protocol (MCP). To balance the diversity and effectiveness of the attack, we design a layered memory with an epsilon-greedy attack exploration algorithm. Extensive experiments on instance- and policy-based benchmarks show that ARMs achieves SOTA attack success rates, exceeding baselines by an average of 52.1% and surpassing 90% on Claude-4-Sonnet. We show that the diversity of red-teaming instances generated by ARMs is significantly higher, revealing emerging vulnerabilities in VLMs. Leveraging ARMs, we construct ARMs-Bench, a large-scale multimodal safety dataset comprising over 30K red-teaming instances spanning 51 diverse risk categories, grounded in both real-world multimodal threats and regulatory risks. Safety fine-tuning with ARMs-Bench substantially improves the robustness of VLMs while preserving their general utility, providing actionable guidance to improve multimodal safety alignment against emerging threats.
Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge
Masters, Charlie, Vellanki, Advaith, Shangguan, Jiangbo, Kultys, Bart, Gilmore, Jonathan, Moore, Alastair, Albrecht, Stefano V.
While agentic AI has advanced in automating individual tasks, managing complex multi-agent workflows remains a challenging problem. This paper presents a research vision for autonomous agentic systems that orchestrate collaboration within dynamic human-AI teams. We propose the Autonomous Manager Agent as a core challenge: an agent that decomposes complex goals into task graphs, allocates tasks to human and AI workers, monitors progress, adapts to changing conditions, and maintains transparent stakeholder communication. We formalize workflow management as a Partially Observable Stochastic Game and identify four foundational challenges: (1) compositional reasoning for hierarchical decomposition, (2) multi-objective optimization under shifting preferences, (3) coordination and planning in ad hoc teams, and (4) governance and compliance by design. To advance this agenda, we release MA-Gym, an open-source simulation and evaluation framework for multi-agent workflow orchestration. Evaluating GPT-5-based Manager Agents across 20 workflows, we find they struggle to jointly optimize for goal completion, constraint adherence, and workflow runtime - underscoring workflow management as a difficult open problem. We conclude with organizational and ethical implications of autonomous management systems.
An Senegalese Legal Texts Structuration Using LLM-augmented Knowledge Graph
Kane, Oumar, Allaya, Mouhamad M., Samb, Dame, Bousso, Mamadou
This study examines the application of artificial intelligence (AI) and large language models (LLMs) to improve access to legal texts in Senegal's judicial system. The emphasis is on the difficulties of extracting and organizing legal documents, highlighting the need for better access to judicial information. The research successfully extracted 7,967 articles from various legal documents, particularly focusing on the Land and Public Domain Code. A detailed graph database was developed, which contains 2,872 nodes and 10,774 relationships, aiding in the visualization of interconnections within legal texts. In addition, advanced triple extraction techniques were used for knowledge extraction, demonstrating the effectiveness of models such as GPT-4o, GPT-4, and Mistral-Large in identifying relationships and relevant metadata. Through these technologies, the aim is to create a solid framework that allows Senegalese citizens and legal professionals to more effectively understand their rights and responsibilities. Artificial intelligence (AI) is a transformative technology that raises significant ethical considerations regarding its use. Initiatives like Microsoft's "AI for Humanitarian Action" and Google's "AI for Social Good" focus on enhancing jurisprudence and human rights [1]. Moreover, the Center for Social Good Data Science at the University of Chicago applies AI to improve criminal justice systems.
OpenAI launch of video app Sora plagued by violent and racist images: 'The guardrails are not real'
'In a video documented by 404 Media, SpongeBob was dressed like Adolf Hitler.' OpenAI launched the latest iteration of its artificial intelligence-powered video generator on Tuesday, adding a social feed that allows people to share their realistic videos. OpenAI's own terms of service for Sora as well as ChatGPT's image or text generation prohibit content that "promotes violence" or, more broadly, "causes harm". In prompts and clips reviewed by the Guardian, Sora generated several videos of bomb and mass-shooting scares, with panicked people screaming and running across college campuses and in crowded places like New York's Grand Central Station. Other prompts created scenes from war zones in Gaza and Myanmar, where children fabricated by AI spoke about their homes being burned. One video with the prompt "Ethiopia footage civil war news style" had a reporter in a bulletproof vest speaking into a microphone saying the government and rebel forces were exchanging fire in residential neighborhoods.
Apple and Google Pull ICE-Tracking Apps, Bowing to DOJ Pressure
Plus: China sentences scam bosses to death, Europe is ramping up its plans to build a "drone wall" to protect against Russian airspace violations, and more. If you're traveling to a country and, once you arrive, realize it's in the midst of a Gen Z-fueled revolution, what do you do? If you're Harry Jackson, a travel vlogger, you run straight into the action. This week, WIRED spoke with Jackson, who recounted his time documenting the overthrow of Nepal's government for his social media channels and the millions of people who watched his videos. Tile tracking tags can be a useful way to find your lost keys, wallet, or pets.
Hong Kong to install surveillance cameras with AI facial recognition
Hong Kong - Hong Kong plans to install tens of thousands of surveillance cameras that will make use of AI-powered facial recognition, the city's security chief said on Friday, bringing it closer to China where authorities often monitor public spaces with cutting-edge technology. The Chinese finance hub has already installed almost 4,000 closed-circuit television (CCTV) cameras under a police crime-fighting program. That number will increase to a total of 60,000 by 2028, according to documents submitted to the legislature.
Agents race to Texas crash site as balloon from space is found in crops
Agents rushed to a West Texas farm Thursday morning after a massive balloon from space crash-landed in a crop field.