Industry
On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection
Large language models (LLMs) raise concerns about content authenticity and integrity because they can generate human-like text at scale. Text watermarks, which embed detectable statistical signals into generated text, offer a provable way to verify content origin. Many detection methods rely on pivotal statistics that are i.i.d.
Quantum speedup of non-linear Monte Carlo problems
The mean of a random variable can be understood as a linear functional on the space of probability distributions. Quantum computing is known to provide a quadratic speedup over classical Monte Carlo methods for mean estimation. In this paper, we investigate whether a similar quadratic speedup is achievable for estimating non-linear functionals of probability distributions. We propose a quantum-insidequantum algorithm that achieves this speedup for the broad class of nonlinear estimation problems known as nested expectations. Our algorithm improves upon the direct application of the quantum-accelerated multilevel Monte Carlo algorithm introduced by An et al. (2021). The existing lower bound indicates that our algorithm is optimal up to polylogarithmic factors. A key innovation of our approach is a new sequence of multilevel Monte Carlo approximations specifically designed for quantum computing, which is central to the algorithm's improved performance.
Unextractable Protocol Models: Collaborative Training and Inference without Weight Materialization
We consider a decentralized setup in which the participants collaboratively train and serve a large neural network, and where each participant only processes a subset of the model. In this setup, we explore the possibility of unmaterializable weights, where a full weight set is never available to any one participant. We introduce Unextractable Protocol Models (UPMs): a training and inference framework that leverages the sharded model setup to ensure model shards (i.e., subsets) held by participants are incompatible at different time steps. UPMs periodically inject timevarying, random, invertible transforms at participant boundaries; preserving the overall network function yet rendering cross-time assemblies incoherent. On Qwen2.5-0.5B and Llama-3.2-1B, 10 000 transforms leave FP32 perplexity unchanged ( PPL< 0.01; Jensen-Shannon drift < 4 10 5), and we show how to control growth for lower precision datatypes. Applying a transform every 30s adds 3% latency, 0.1% bandwidth, and 10% GPU-memory overhead at inference, while training overhead falls to 1.6% time and < 1% memory. We consider several attacks, showing that the requirements of direct attacks are impractical and easy to defend against, and that gradient-based fine-tuning of stitched partitions consumes 60% of the tokens required to train from scratch. By enabling models to be collaboratively trained yet not extracted, UPMs make it practical to embed programmatic incentive mechanisms in community-driven decentralized training.
Andrew Hastie compares AI to cold-war nuclear arms race and warns Australia may fall behind
Andrew Hastie has said the education system should be overhauled so'we can unleash Australian hearts and minds on AI'. Andrew Hastie has said the education system should be overhauled so'we can unleash Australian hearts and minds on AI'. Liberal MP says Australia risks sovereignty and strategic independence being'constrained by the AI superpowers reshaping the global order' Liberal MP Andrew Hastie says Australia should dramatically scale up investment in artificial intelligence to preserve strategic independence and warns the country risks being "a supplicant state" tethered to the US in an era of possible hot conflict with China. In a major address to Liberal members in Sydney on Monday night, the shadow minister for industry and sovereign capability likened the development of AI to the nuclear arms race of the cold-war era and proposed Australia position itself as a technology hub in the southern hemisphere. Delivering the annual Tom Hughes Oration, Hastie called for a new AI ambassador to be appointed and said the education system should be overhauled "so we can unleash Australian hearts and minds on AI". He said prime ministers, including Robert Menzies and John Gorton, had wrestled with the question of Australia pursuing nuclear capability, but ultimately aligned our security settings with Washington.
1af83ab66b4f07a3f55788e67dab5782-Paper-Conference.pdf
Large vision-language models (LVLMs) are increasingly deployed in interactive applications such as virtual and augmented reality, where a first-person (egocentric) view captured by head-mounted cameras serves as key input. While this view offers fine-grained cues about user attention and hand-object interactions, its narrow field of view and lack of global context often lead to failures on spatially or contextually demanding queries. To address this, we introduce a framework that augments egocentric inputs with third-person (exocentric) views, providing complementary information such as global scene layout and object visibility to LVLMs. We present E3VQA, the first benchmark for multi-view question answering with 4K high-quality question-answer pairs grounded in synchronized ego-exo image pairs. Additionally, we propose M3CoT, a training-free prompting technique that constructs a unified scene representation by integrating scene graphs from three complementary perspectives. M3CoT enables LVLMs to reason more effectively across views, yielding consistent performance gains (4.84% for GPT4o and 5.94% for Gemini 2.0 Flash) over a recent CoT baseline. Our extensive evaluation reveals key strengths and limitations of LVLMs in multi-view reasoning and highlights the value of leveraging both egocentric and exocentric inputs. The dataset and source code are available at https://github.com/Leeinsu1/
1ae5c1db7569a6c2f395020765b119a4-Paper-Position_Paper_Track.pdf
Artificial intelligence (AI) now permeates critical infrastructures and decisionmaking systems where failures produce social, economic, and democratic harm. This position paper challenges the entrenched belief that regulation and innovation are opposites. As evidenced by analogies from aviation, pharmaceuticals, and welfare systems and recent cases of synthetic misinformation, bias and unaccountable decision-making, the absence of well-designed regulation has already created immeasurable damage. Regulation, when thoughtful and adaptive, is not a brake on innovation--it is its foundation. The present position paper examines the EU AIAct as a model of risk-based, responsibility-driven regulation that addresses the Collingridge Dilemma: acting early enough to prevent harm, yet flexibly enough to sustain innovation. Its adaptive mechanisms--regulatory sandboxes, small and medium enterprises (SMEs) support, real-world testing, fundamental rights impact assessment (FRIA)--demonstrate how regulation can accelerate responsibly, rather than delay, technological progress. The position paper summarises how governance tools transform perceived burdens into tangible advantages: legal certainty, consumer trust, and ethical competitiveness.
MIP against Agent: Malicious Image Patches Hijacking Multimodal OSAgents
Large language models (LLMs) and vision-language models (VLMs) have demonstrated remarkable capabilities, driving significant advancements across a wide range of applications. These models are typically fine-tuned to align with specific objectives, such as being "helpful and harmless" [39]. However, recent work on adversarial attacks has demonstrated that carefully crafted inputs can bypass these alignment safeguards [65, 10, 4, 26, 52]. While such adversarial attacks can elicit harmful responses, the output is usually constrained to text that is not directly actionable, limiting the scope of possible harm. While malicious text outputs are concerning, it remains unclear whether the associated risks exceed those posed by information already accessible through the internet [18].
UniZyme: AUnified Protein Cleavage Site Predictor Enhanced with Enzyme Active-Site Knowledge
Enzyme-catalyzed protein cleavage is essential for many biological functions. Accurate prediction of cleavage sites can facilitate various applications such as drug development, enzyme design, and a deeper understanding of biological mechanisms. However, most existing models are restricted to an individual enzyme, which neglects shared knowledge of enzymes and fails to generalize to novel enzymes. Thus, we introduce a unified protein cleavage site predictor named UniZyme, which can generalize across diverse enzymes. To enhance the enzyme encoding for the protein cleavage site prediction, UniZyme employs a novel biochemically-informed model architecture along with active-site knowledge of proteolytic enzymes. Extensive experiments demonstrate that UniZyme achieves high accuracy in predicting cleavage sites across a range of proteolytic enzymes, including unseen enzymes. The code is available in https://github.com/Ao-LiChen/UniZyme.
1aa1fde3661b23ba9b043082069fd144-Paper-Conference.pdf
While diffusion models have achieved remarkable success in text-to-image generation, they encounter significant challenges with instruction-driven image editing. Our structurally-inconsistent research highlights a edits key that challenge: involve these substantial models layout particularly changes.