PAIR-CI: Calibrated Conditional Independence Testing for Causal Discovery with Incomplete Data

arXiv.org Machine Learning

The standard constraint-based paradigm for causal discovery with incomplete data -- impute first, test second -- is frequently miscalibrated: any consistent conditional independence (CI) test rejects a true null with probability approaching 1 when imputation error induces spurious conditional dependence. We introduce PAIR-CI, a nonparametric CI test that restores calibration by integrating multiple imputation directly into the inferential procedure via a paired permutation design. PAIR-CI compares cross-validated models that include and exclude the candidate variable while receiving the same imputed conditioning set, forcing imputation error to cancel in their loss difference rather than contaminate the test statistic. A provably consistent variance estimator jointly accounts for uncertainty arising from cross-validation and multiple imputation -- to our knowledge, the first formal unification of these two inferential frameworks. In simulations, existing imputation-based CI tests exhibit false positive rates of 28--45% when data are missing not at random (MNAR), whereas PAIR-CI averages below the nominal 5% level across data-generating processes and missingness mechanisms. These gains are largest in nonlinear settings and grow with causal graph size: when integrated into the PC algorithm, PAIR-CI reduces structural Hamming distance by 8% on 10-variable nonlinear graphs, 15% on 30-variable equivalents, and up to 44% on the 56-variable HAILFINDER network, with stable performance in all settings.
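The paired design described in this abstract can be illustrated with a small sketch. This is not the PAIR-CI procedure itself: it leaves out multiple imputation, cross-validation, and the paper's variance estimator, and swaps in a plain one-sided sign-flip permutation test on per-sample loss differences. The function name `paired_sign_flip_test` and its parameters are hypothetical, chosen only for illustration.

```python
import random


def paired_sign_flip_test(loss_without, loss_with, n_perm=2000, seed=0):
    """One-sided paired sign-flip permutation test (illustrative only).

    loss_without[i] and loss_with[i] are held-out losses on the same sample i
    from models that exclude and include the candidate variable. Under the
    null, the sign of each paired difference is assumed to be exchangeable.
    Returns a p-value for "including the variable reduces the loss".
    """
    if len(loss_without) != len(loss_with):
        raise ValueError("loss vectors must be paired and equal in length")
    diffs = [a - b for a, b in zip(loss_without, loss_with)]
    n = len(diffs)
    observed = sum(diffs) / n

    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_perm):
        # Randomly flip the sign of each paired difference.
        stat = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if stat >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Because both models are scored on the same samples, any error shared by the pair drops out of each difference, which is the intuition behind forcing imputation error to cancel in the loss difference.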


Self-Attention as Transport: Limits of Symmetric Spectral Diagnostics

arXiv.org Machine Learning

Large language models hallucinate in predictable ways: attention routing fails by over-concentrating on a narrow set of positions, or by spreading so diffusely that relevance is diluted, and the shape of the failure carries diagnostic signal. A widely used family of spectral methods analyzes the symmetric component of the degree-normalized attention operator, which governs transport capacity; we prove that every transpose-invariant spectral diagnostic of this operator is structurally orientation-blind (it cannot distinguish an operator from its transpose, and therefore cannot detect information-flow direction), with a quantitative converse establishing the asymmetry coefficient $G$ as the unique control parameter for direction. Pairing this with a closed-form bipartite-Cheeger landscape for canonical causal architectures, we show that uniform causal attention satisfies an $n$-independent floor $\phi \ge 1/5$ with worst cut at $t^\ast/n \approx 0.32$, while window attention pierces the floor as $O(w/n)$; failure modes are shape-different, not just value-different. The resulting two-axis diagnostic ($\phi$ for capacity, $G$ for direction) yields a falsifiable polarity prediction: bottleneck- and diffuse-dominated benchmarks should exhibit opposite polarity. Under length-controlled evaluation, transport features retain interpretable signal (LC-AUROC from 0.62 to 0.84) on tested models up to 8B parameters, with polarity reversing as predicted between HaluEval and MedHallu.
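The orientation-blindness claim rests on a simple algebraic fact that a few lines of code can show: a matrix and its transpose share the same symmetric part, so anything computed from that part alone cannot tell them apart. The sketch below is a simplification under stated assumptions. It uses the plain symmetric part $(A + A^\top)/2$ of a small uniform causal attention matrix, omits the degree normalization the abstract mentions, and does not compute the paper's coefficient $G$, whose definition is not given here. All helper names are hypothetical.

```python
def transpose(m):
    """Return the transpose of a square matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def symmetric_part(m):
    """(A + A^T) / 2, entry by entry."""
    t = transpose(m)
    n = len(m)
    return [[(m[i][j] + t[i][j]) / 2 for j in range(n)] for i in range(n)]


def antisymmetric_part(m):
    """(A - A^T) / 2, entry by entry."""
    t = transpose(m)
    n = len(m)
    return [[(m[i][j] - t[i][j]) / 2 for j in range(n)] for i in range(n)]


def uniform_causal_attention(n):
    """Row i attends uniformly to positions 0..i (lower-triangular rows)."""
    return [[1.0 / (i + 1) if j <= i else 0.0 for j in range(n)]
            for i in range(n)]


A = uniform_causal_attention(4)
At = transpose(A)

# The symmetric parts coincide, so any diagnostic built only on them
# gives identical values for A and for its transpose.
same_symmetric = symmetric_part(A) == symmetric_part(At)

# The antisymmetric part flips sign under transposition, so it does
# carry the direction information that the symmetric part discards.
negated = [[-x for x in row] for row in antisymmetric_part(At)]
flipped_antisymmetric = antisymmetric_part(A) == negated
```

Here both `same_symmetric` and `flipped_antisymmetric` evaluate to `True`, which is why a second, direction-sensitive quantity is needed alongside any symmetric spectral measure.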


When Does Gene Regulatory Network Inference Break? A Controlled Diagnostic Study of Causal and Correlational Methods on Single-Cell Data

arXiv.org Machine Learning

Despite theoretical advantages, causal methods for Gene Regulatory Network (GRN) inference from single-cell RNA-seq data consistently fail to match or outperform correlation-based baselines in many realistic benchmarks, a persistent puzzle which casts doubt on the value of causality for this task. We argue that existing benchmarks are insufficiently controlled to answer this question because they evaluate on real or semi-real data where multiple pathologies co-occur, confounding failure modes and obscuring the specific conditions under which different inference methods excel or fail. To address this gap, we introduce a controlled diagnostic framework that isolates seven biologically motivated pathologies (dropout, latent confounders, cell-type mixing, feedback loops, network density, sample size, and pseudotime drift) and measure how six representative methods spanning three inference paradigms degrade as each pathology intensifies. Across 6,120 controlled experiments, we find that causal methods genuinely dominate in clean and structurally favorable regimes, but specific pathologies (notably dropout and latent confounders) selectively neutralize their advantages. We further introduce an error-type decomposition that reveals methods with similar aggregate accuracy commit qualitatively different errors. To probe whether single-pathology effects persist when multiple stressors co-occur, we perform an interaction sweep over the three most impactful pathologies and find that their joint effects are sub-additive, while also exposing density-conditional cross-overs invisible to single-dial analysis. Our findings offer a nuanced understanding of when and why different methods succeed or fail for GRN inference, providing actionable insights for method development and practical guidance for practitioners.


Adaptivity Under Realizability Constraints: Comparing In-Context and Agentic Learning

arXiv.org Machine Learning

We compare in-context learning with fixed queries and agentic learning with adaptive queries for uniform approximation of task families. We consider two settings: an unrestricted regime, where querying and approximation are arbitrary functions, and a realizable regime, where we require these operations to be implemented by ReLU neural networks. In both settings, adaptivity never hinders approximation performance. However, this advantage can change when one passes from the unrestricted regime to the realizable regime. We identify four distinct approximation scenarios, each witnessed by an explicit task family: (a) no advantage of adaptivity; (b) an advantage in the unrestricted regime that persists under ReLU realizability; (c) an advantage that arises only under realizability; and (d) an advantage that disappears under realizability. This demonstrates that representational constraints interact profoundly with the effect of adaptivity.


'We had people come just to see it': Amazon delivers its first UK parcels by drone

BBC News

Amazon has become the first retailer in the UK to start a drone delivery service with a limited launch in Darlington, County Durham. Packages weighing less than 5lb (2.2kg) and containing everyday items such as beauty products, batteries and cables are now being delivered within a 7.5 mile (12km) radius of Amazon's fulfilment centre. The tech giant is convinced there is demand for ultra-fast deliveries and hopes to slowly expand the service. Rob Shield let Amazon use an Airbnb on his farm for its first test runs. "Initially it was a novelty, so we were ordering everything under the sun," he says.


Russia tells diplomats to leave Kyiv in case Moscow launches mass strikes

Al Jazeera

Russia's Ministry of Foreign Affairs says it has warned diplomatic missions to promptly evacuate their staff from the Ukrainian capital, Kyiv, in case Moscow launches a mass strike on the city in response to potential Ukrainian attempts to disrupt Russia's May 9 Victory Day commemorations. In a video posted on Telegram on Wednesday, Russian Foreign Ministry spokeswoman Maria Zakharova urged diplomats to heed the Defence Ministry's warning of a strike, issued on Monday, in the event of a Ukrainian attack during the commemorations of the Soviet Union's victory against Nazi Germany in World War II and a military parade in Red Square. Zakharova said that Ukrainian President Volodymyr Zelenskyy had made "aggressive and threatening statements" about disrupting the commemorations at a meeting of the European Political Community in Armenia on Monday. "Several EU countries were present," she said. In his remarks in Armenia, Zelenskyy noted a Russian announcement that the commemorations were being scaled down and taking place without military hardware for security reasons.


A Kid With a Fake Mustache Tricked an Online Age-Verification Tool

WIRED

Meta is beefing up its age-verification mechanisms with an AI system that analyzes images and videos on Instagram and Facebook for "visual cues," such as height and bone structure, to identify and delete accounts of users under the age of 13. The company announced the move amid a wave of cases in which hundreds of children have managed to evade social network access restrictions, even through simple tricks such as drawing on a mustache. The new approach is one of a series of measures Meta has adopted under an AI-based security strategy designed to correct the limitations of traditional methods, which rely heavily on self-reported age. With this change, the company seeks to make it harder for minors to access platforms that, in theory, are restricted to them.


Russia cuts mobile internet in Moscow citing drone security concerns

Al Jazeera

Russia has begun rolling mobile internet shutdowns in Moscow and other cities, which authorities say are meant to counter drone threats. As Dmitry Medvedenko reports, the measures come ahead of the May 9 Victory Day parade, which has been scaled down this year due to security concerns.


Using AI for Just 10 Minutes Might Make You Lazy and Dumb, Study Shows

WIRED

New research suggests that reliance on AI assistants can have a negative impact on people's ability to think and problem-solve. Using AI chatbots for just 10 minutes may have a shockingly negative impact on people's ability to think and problem-solve, according to a new study from researchers at Carnegie Mellon, MIT, Oxford, and UCLA. Researchers tasked people with solving various problems, including simple fractions and reading comprehension, through an online platform that paid them for their work. They conducted three experiments, each involving several hundred people. Some participants were given access to an AI assistant capable of solving the problem autonomously.


Hackers Hate AI Slop Even More Than You Do

WIRED

Hackers and other cybercriminals are complaining about "AI shit" flooding platforms where they discuss cyberattacks and other illegal activity. "I'm disappointed that you are working to incorporate AI garbage into the site," one annoyed person, posting anonymously, said in an online message. "No-one is asking for this--we want you to improve the site, stop charging for new features." Only, this is not a regular internet user moaning about AI being forced into their favorite app. Instead, they are complaining about a cybercrime forum's plans to introduce more generative AI.