flaw
Do you need to worry about Mythos, Anthropic's computer-hacking AI?
Do you need to worry about Mythos, Anthropic's computer-hacking AI? A powerful AI kept from public access because of its ability to hack computers with impunity is making headlines around the world. But what is Mythos, does it really represent a risk and might it even be used to improve cybersecurity? Anthropic's Project Glasswing aims to improve online security The past few weeks have brought apparently alarming news of Mythos, an AI that can identify cybersecurity flaws in a matter of moments, leaving operating systems and software vulnerable to hackers. The cybersecurity community is now beginning to get a better sense of how Mythos may change the face of cybersecurity - and not necessarily for the worse.
Vision Mamba Mender
Mamba, a state-space model with selective mechanisms and hardware-aware architecture, has demonstrated outstanding performance in long sequence modeling tasks, particularly garnering widespread exploration and application in the field of computer vision. While existing works have mixed opinions of its application in visual tasks, the exploration of its internal workings and the optimization of its performance remain urgent and worthy research questions given its status as a novel model. Existing optimizations of the Mamba model, especially when applied in the visual domain, have primarily relied on predefined methods such as improving scanning mechanisms or integrating other architectures, often requiring strong priors and extensive trial and error. In contrast to these approaches, this paper proposes the Vision Mamba Mender, a systematic approach for understanding the workings of Mamba, identifying flaws within, and subsequently optimizing model performance. Specifically, we present methods for predictive correlation analysis of Mamba's hidden states from both internal and external perspectives, along with corresponding definitions of correlation scores, aimed at understanding the workings of Mamba in visual recognition tasks and identifying flaws therein. Additionally, tailored repair methods are proposed for identified external and internal state flaws to eliminate them and optimize model performance. Extensive experiments validate the efficacy of the proposed methods on prevalent Mamba architectures, significantly enhancing Mamba's performance.
AI 'vibe-coding' platform's flaws allow BBC reporter to be hacked
AI coding platform's flaws allow BBC reporter to be hacked The BBC has been shown a significant - and unfixed - cyber-security risk in a popular AI coding platform. Orchids is a so-called vibe-coding tool, meaning people without technical skills can use it to build apps and games by typing a text prompt into a chatbot. Such platforms have exploded in popularity in recent months, and are often heralded as an early example of how various professional services could be done quickly and cheaply by AI. But experts say the ease with which Orchids can be hacked demonstrates the risks of allowing AI bots deep access to our computers in exchange for the convenience of allowing them to carry out tasks autonomously. The BBC has repeatedly asked the company for comment but it has not replied.
AI is promising to revolutionise how we diagnose mental illness
As rates of mental health conditions like depression spike, we desperately need new ways of identifying and treating people in distress. The last big breakthrough in treating depression was all the way back in the 1980s. That was when Prozac, the first SSRI antidepressant, was released. It and its subsequent copycats soon swept the globe, and hundreds of millions of people have now taken this kind of medication. But while three-quarters of people say the pills have helped them feel better, they don't work for everyone.
The Missing Invariance Principle found -- the Reciprocal Twin of Invariant Risk Minimization
Machine learning models often generalize poorly to out-of-distribution (OOD) data as a result of relying on features that are spuriously correlated with the label during training. Recently, the technique of Invariant Risk Minimization (IRM) was proposed to learn predictors that only use invariant features by conserving the feature-conditioned label expectation $\mathbb{E}_e[y|f(x)]$ across environments. However, more recent studies have demonstrated that IRM-v1, a practical version of IRM, can fail in various settings. Here, we identify a fundamental flaw of IRM formulation that causes the failure. We then introduce a complementary notion of invariance, MRI, based on conserving the label-conditioned feature expectation $\mathbb{E}_e[f(x)|y]$, which is free of this flaw. Further, we introduce a simplified, practical version of the MRI formulation called MRI-v1. We prove that for general linear problems, MRI-v1 guarantees invariant predictors given sufficient number of environments. We also empirically demonstrate that MRI-v1 strongly out-performs IRM-v1 and consistently achieves near-optimal OOD generalization in image-based nonlinear problems.
VERIRAG: A Post-Retrieval Auditing of Scientific Study Summaries
Mohole, Shubham, Choi, Hongjun, Liu, Shusen, Klymko, Christine, Kushwaha, Shashank, Shi, Derek, Sakla, Wesam, Galhotra, Sainyam, Glatt, Ruben
Can democratized information gatekeepers and community note writers effectively decide what scientific information to amplify? Lacking domain expertise, such gatekeepers rely on automated reasoning agents that use RAG to ground evidence to cited sources. But such standard RAG systems validate summaries via semantic grounding and suffer from "methodological blindness," treating all cited evidence as equally valid regardless of rigor. To address this, we introduce VERIRAG, a post-retrieval auditing framework that shifts the task from classification to methodological vulnerability detection. Using private Small Language Models (SLMs), VERIRAG audits source papers against the Veritable taxonomy of statistical rigor. We contribute: (1) a benchmark of 1,730 summaries with realistic, non-obvious perturbations modeled after retracted papers; (2) the auditable Veritable taxonomy; and (3) an operational system that improves Macro F1 by at least 19 points over baselines using GPT-based SLMs, a result that replicates across MISTRAL and Gemma architectures. Given the complexity of detecting non-obvious flaws, we view VERIRAG as a "vulnerability-detection copilot," providing structured audit trails for human editors. In our experiments, individual human testers found over 80% of the generated audit trails useful for decision-making. We plan to release the dataset and code to support responsible science advocacy.
CALM Before the STORM: Unlocking Native Reasoning for Optimization Modeling
Tang, Zhengyang, Ye, Zihan, Huang, Chenyu, Huang, Xuhan, Li, Chengpeng, Li, Sihang, Chen, Guanhua, Yan, Ming, Wang, Zizhuo, Zha, Hongyuan, Liu, Dayiheng, Wang, Benyou
Large Reasoning Models (LRMs) have demonstrated strong capabilities in complex multi-step reasoning, opening new opportunities for automating optimization modeling. However, existing domain adaptation methods, originally designed for earlier instruction-tuned models, often fail to exploit the advanced reasoning patterns of modern LRMs -- In particular, we show that direct fine-tuning on traditional \textit{non-reflective} datasets leads to limited gains. To fully leverage LRMs' inherent reasoning abilities, we propose \textbf{CALM} (\textit{Corrective Adaptation with Lightweight Modification}), a framework that progressively refines LRMs within their native reasoning modes for optimization modeling tasks. In CALM, an expert intervener identifies reasoning flaws and provides concise corrective hints, which the LRM incorporates to produce improved reasoning trajectories. These interventions modify fewer than 2.6\% of generated tokens, but generate high-quality data for soft adaptation through supervised fine-tuning. The adapted model is then further improved through reinforcement learning. Building on CALM, we develop \textbf{STORM} (\textit{Smart Thinking Optimization Reasoning Model}), a 4B-parameter LRM that achieves a new state-of-the-art average accuracy of 68.9\% across five popular optimization modeling benchmarks, matching the performance of a 671B LRM. These results demonstrate that dynamic, hint-based data synthesis both preserves and amplifies the native reasoning patterns of modern LRMs, offering a more effective and scalable path towards expert-level performance on challenging optimization modeling tasks.
CryptoScope: Utilizing Large Language Models for Automated Cryptographic Logic Vulnerability Detection
Li, Zhihao, Ji, Zimo, Zheng, Tao, Ren, Hao, Lan, Xiao
Cryptographic algorithms are fundamental to modern security, yet their implementations frequently harbor subtle logic flaws that are hard to detect. We introduce CryptoScope, a novel framework for automated cryptographic vulnerability detection powered by Large Language Models (LLMs). CryptoScope combines Chain-of-Thought (CoT) prompting with Retrieval-Augmented Generation (RAG), guided by a curated cryptographic knowledge base containing over 12,000 entries. We evaluate CryptoScope on LLM-CLVA, a benchmark of 92 cases primarily derived from real-world CVE vulnerabilities, complemented by cryptographic challenges from major Capture The Flag (CTF) competitions and synthetic examples across 11 programming languages. CryptoScope consistently improves performance over strong LLM baselines, boosting DeepSeek-V3 by 11.62%, GPT-4o-mini by 20.28%, and GLM-4-Flash by 28.69%. Additionally, it identifies 9 previously undisclosed flaws in widely used open-source cryptographic projects.