Perturbation is All You Need for Extrapolating Language Models

arXiv.org Machine Learning

We introduce a simple yet powerful framework for training large language models. In contrast to the standard autoregressive next-token prediction based on an exact prefix, we propose a perturbation-based procedure that first transforms the prefix into a semantic neighbor and then conditions on this perturbed variant for next-token prediction. This yields a hierarchical model with a pre-post-additive noise structure. Within this framework, we develop a rigorous theory of extrapolability, namely, the capacity of a model class to make reliable predictions for token sequences that lie outside the empirical support of the training corpus. We evaluate the finite-sample performance of the proposed procedure using both synthetic and real-world language data. Results show that the proposed method consistently improves out-of-support prediction while maintaining competitive in-support performance, demonstrating that perturbation offers a practical route to language modeling.


Hypergraph Generation via Structured Stochastic Diffusion

arXiv.org Machine Learning

Hypergraphs model higher-order interactions, but realistic hypergraph generation remains difficult because incidence, hyperedge-size heterogeneity, and overlap structure are not faithfully captured by pairwise reductions. We propose HEDGE, a generative model defined directly on relaxed incidence matrices via a structured stochastic diffusion. The forward process combines a hypergraph-specific two-sided heat operator with an Ornstein-Uhlenbeck component, preserving structure-aware noising near the data while yielding an explicit Gaussian terminal law. Conditional on an observed hypergraph, this forward process is linear-Gaussian, so conditional means, covariances, scores, and reverse-drift targets are available in closed form. We therefore learn a permutation-equivariant state-only reverse-drift field in incidence space by regressing onto exact conditional targets, and generate samples by simulating a learned reverse-time SDE from the Gaussian base law. We establish exactness in the ideal state-only setting together with finite-horizon stability guarantees, and empirically show improved hypergraph generation quality relative to strong baselines.


'We had people come just to see it': Amazon delivers its first UK parcels by drone

BBC News

Amazon has become the first retailer in the UK to start a drone delivery service with a limited launch in Darlington, County Durham. Packages weighing less than 5lb (2.2kg) and containing everyday items such as beauty products, batteries and cables are now being delivered within a 7.5 mile (12km) radius of Amazon's fulfilment centre. The tech giant is convinced there is demand for ultra-fast deliveries and hopes to slowly expand the service. Rob Shield let Amazon use an Airbnb on his farm for its first test runs. "Initially it was a novelty, so we were ordering everything under the sun," he says.


Russia tells diplomats to leave Kyiv in case Moscow launches mass strikes

Al Jazeera

Russia's Ministry of Foreign Affairs says it has warned diplomatic missions to promptly evacuate their staff from the Ukrainian capital, Kyiv, in case Moscow launches a mass strike on the city in response to potential Ukrainian attempts to disrupt Russia's May 9 Victory Day commemorations. In a video posted on Telegram on Wednesday, Russian Foreign Ministry spokeswoman Maria Zakharova urged diplomats to heed the Defence Ministry's warning of a strike, issued on Monday, in the event of a Ukrainian attack during the commemorations of the Soviet Union's victory against Nazi Germany in World War II and a military parade in Red Square. Zakharova said that Ukrainian President Volodymyr Zelenskyy had made "aggressive and threatening statements" about disrupting the commemorations at a meeting of the European Political Community in Armenia on Monday. "Several EU countries were present," she said. In his remarks in Armenia, Zelenskyy noted a Russian announcement that the commemorations were being scaled down and taking place without military hardware for security reasons.


SpaceX backs Anthropic with data centre deal amidst Musk's OpenAI lawsuit

Al Jazeera

Anthropic has reached a deal to tap the computing resources of Elon Musk's SpaceX, marking a detente with its one-time critic and a boost for both companies in the high-stakes artificial intelligence race. Under the agreement announced on Wednesday, Anthropic will use the full computing power of SpaceX's Colossus 1 facility in Memphis, Tennessee, which houses more than 220,000 Nvidia processors and will give the Claude chatbot maker 300 megawatts of new capacity within a month. That's enough electricity to power more than 300,000 homes - as the Dario Amodei-led company seeks to boost the capacity of its Claude Pro and Claude Max AI assistants for subscribers. The tool allows AI systems to review work between sessions, spot patterns, and update files that store user preferences and other context. Available as a research preview, "dreaming" comes with software for managing agents, or AI programmes that perform tasks with little human involvement.


Canadian officials claim OpenAI violated federal and provincial privacy laws

Engadget

Philippe Dufresne, the Privacy Commissioner of Canada, has found OpenAI was not compliant with Canadian federal and provincial privacy laws in the training of its AI models. Following an investigation, Dufresne and his counterparts in Alberta, Quebec and British Columbia say OpenAI's approach to things like data collection and consent stepped on multiple laws, including Canada's Personal Information Protection and Electronic Documents Act (PIPEDA), which governs how companies collect and use personal information during the normal course of business. The commissioners participating in the investigation identified multiple privacy issues with OpenAI's approach, including that the company gathered vast amounts of personal information without adequate safeguards to prevent use of that information to train its models, and that it failed to acquire consent to collect and use that personal information in the first place. Warnings in ChatGPT note that interactions with the AI could be used in training, but third-party data OpenAI has purchased or scraped also includes personal details people likely aren't even aware of. The fact that ChatGPT users have no way to access, correct or delete that data was another issue that the commissioners identified, according to a summary of the investigation's findings, along with OpenAI's lackluster attempts to acknowledge the inaccuracy of some of ChatGPT's responses.


Trump's Team Wants Him to Accept an Iran Deal He's Already Rejected

WIRED

As chaotic negotiations over the end of the Iran war continue, US negotiators think they have the framework for a deal in place. Now they just have to sell the president on it. President Donald Trump's negotiators face the arduous task of trying to convince the president that a deal he previously rejected is their best option in Iran. Last month, Trump initially gave his blessing for a so-called "cash for uranium" deal, under which the US would release around $20 billion in frozen funds in exchange for Iran handing over its stockpile of highly enriched uranium, sources familiar with the matter tell WIRED. Trump's negotiators, vice president JD Vance, special envoy Steve Witkoff, and Jared Kushner, Trump's son-in-law, received repeated approvals from the president while they were in Islamabad, giving them confidence a deal was close.



Using AI for Just 10 Minutes Might Make You Lazy and Dumb, Study Shows

WIRED

New research suggests that reliance on AI assistants can have a negative impact on people's ability to think and problem-solve. Using AI chatbots for even just 10 minutes may have a shockingly negative impact on people's ability to think and problem-solve, according to a new study from researchers at Carnegie Mellon, MIT, Oxford, and UCLA. Researchers tasked people with solving various problems, including simple fractions and reading comprehension, through an online platform that paid them for their work. They conducted three experiments, each involving several hundred people. Some participants were given access to an AI assistant capable of solving the problem autonomously.


New AI brain lets robots move like humans

FOX News

Genesis AI unveils GENE-26.5, a robotic brain designed to help general-purpose robots perform complex physical tasks with human-level dexterity and coordination.