 Large Language Model


US-China head-to-head: Explained in 11 maps and charts

Al Jazeera

US President Donald Trump will meet Chinese President Xi Jinping in Beijing on May 14 and 15, following weeks of delays due to the US-Israel war on Iran. The talks are expected to focus on trade relations and mark the first time a US president has visited China in nearly a decade. In recent decades, the US and China have emerged as the world's dominant superpowers, frequently seen as locked in a contest for who sits atop the world order. A quarter of a century ago, by contrast, the US dwarfed China in most major indicators, but today, Beijing is regarded as the factory of the world and is outpacing its Western counterpart in many regards. Who is the world's top trading power?


Family sues OpenAI, alleging ChatGPT advice led to accidental overdose

Engadget

OpenAI is facing another wrongful death lawsuit. Leila Turner-Scott and Angus Scott filed a lawsuit against the company, alleging that it designed and distributed a defective product that led to the death of their son Sam Nelson from an accidental overdose. Specifically, they're alleging that Sam died following the exact medical advice GPT-4o had provided and approved. In the lawsuit, the plaintiffs described how Sam, a 19-year-old junior at the University of California, Merced, started using ChatGPT in 2023 when he was in high school to help with homework and to troubleshoot computer problems. Sam then started asking the chatbot about safe drug use, but ChatGPT initially refused to answer his question, telling him that it couldn't assist him and warning him that taking drugs can have serious consequences for his health and well-being.


Elon Musk Had 'Hair-Raising' Idea of Passing OpenAI Onto His Kids, Sam Altman Says

WIRED

Musk's lawyers questioned Altman over allegations of deception and his network of financial investments, but the OpenAI CEO painted a picture of Musk as obsessed with controlling the company. Sam Altman took to the witness stand to defend his reputation in the trial on Tuesday, as Elon Musk's lawyers peppered the OpenAI CEO with hours of questions regarding his alleged history of deceptive behavior. The cross-examination was a much-needed win for Musk, who has so far struggled to make a convincing case. Tuesday's testimony included several heated exchanges in which the OpenAI CEO had to respond to allegations from former colleagues suggesting he's untrustworthy. Highlighting this evidence is important not only for Musk winning over a jury, but also for beating OpenAI in the court of public opinion.


Uniform Scaling Limits in AdamW-Trained Transformers

arXiv.org Machine Learning

We study the large-depth limit of transformers trained with AdamW, by modelling the hidden-state dynamics as an interacting particle system (IPS) coupled through the attention mechanism. Under appropriate scaling of the attention heads, we prove that the joint dynamics of the hidden states and backpropagated variables converge in $L^2$, uniformly over the initial condition, to the solution of a forward--backward system of ODEs at rate $\mathcal O(L^{-1}+L^{-1/3}H^{-1/2})$. Here, $L$ and $H$ denote the depth and number of heads of the transformer, respectively. The limiting system of ODEs can be identified with a McKean--Vlasov ODE (MVODE) when the attention heads do not incorporate causal masking. By using the flow maps associated with this MVODE and applying concentration of measure techniques, we obtain bounds on the difference between the discrete and continuous models that are uniform over compact sets of initial conditions. As this is achieved without resorting to a covering argument, the constants in our bounds are independent of the number of tokens. Furthermore, under a suitable adaptation to AdamW, the bounds become independent of the token embedding dimension.


FibQuant: Universal Vector Quantization for Random-Access KV-Cache Compression

arXiv.org Machine Learning

Long-context inference is increasingly a memory-traffic problem. The culprit is the key--value (KV) cache: it grows with context length, batch size, layers, and heads, and it is read at every decoding step. Rotation-based scalar codecs meet this systems constraint by storing a norm, applying a shared random rotation, and quantizing one coordinate at a time. They are universal and random-access, but they discard the geometry created by the normalization step. After a Haar rotation, a block of $k$ consecutive coordinates is not a product source; it is a spherical-Beta source on the unit ball. We introduce \textsc{FibQuant}, a universal fixed-rate vector quantizer that keeps the same normalize--rotate--store interface while replacing scalar tables by a shared radial--angular codebook matched to this canonical source. The codebook combines Beta-quantile radii, Fibonacci\,/\,Roberts--Kronecker quasi-uniform directions, and multi-restart Lloyd--Max refinement. We prove that the resulting vector code strictly improves on its scalar product specialization at matched rate, with a high-rate gain that separates into a cell-shaping factor and a density-matching factor. The same construction gives a dense rate axis, including fractional-bit and sub-one-bit operating points, without calibration or variable-length addresses. On GPT-2 small KV caches, \textsc{FibQuant} traces a memory--fidelity frontier from $5\times$ compression at $0.99$ attention cosine similarity to $34\times$ at $0.95$. End-to-end on TinyLlama-1.1B, it is within $0.10$ perplexity of fp16 at $4\times$ compression and has $3.6\times$ lower perplexity than scalar \textsc{TurboQuant} at $b = 2$ ($8\times$ compression), where scalar random-access quantization begins to fail.
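The normalize, rotate, store interface that the abstract attributes to rotation-based scalar codecs can be sketched as follows. This is a minimal illustration of that scalar baseline only, not FibQuant's radial-angular codebook. The function names (`random_rotation`, `encode`, `decode`), the Gram-Schmidt construction of the rotation, and the uniform per-coordinate grid on [-1, 1] are all assumptions made for the example; real codecs would likely use tables matched to the coordinate distribution.

```python
import math
import random


def random_rotation(d, seed=0):
    """Random orthogonal d x d matrix built by Gram-Schmidt on Gaussian rows.

    Hypothetical helper: stands in for the shared random rotation.
    """
    rng = random.Random(seed)
    rows = []
    while len(rows) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for q in rows:  # remove components along the rows already chosen
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:  # skip (extremely unlikely) degenerate draws
            rows.append([a / norm for a in v])
    return rows


def encode(x, rot, bits):
    """Store the norm, rotate the unit vector, quantize one coordinate at a time."""
    levels = 2 ** bits
    norm = math.sqrt(sum(a * a for a in x))
    if norm == 0.0:
        return 0.0, [0] * len(x)
    unit = [a / norm for a in x]
    y = [sum(r * u for r, u in zip(row, unit)) for row in rot]  # rot @ unit
    # Assumed quantizer: uniform grid on [-1, 1] with `levels` cells.
    codes = [max(0, min(levels - 1, int((c + 1.0) / 2.0 * levels))) for c in y]
    return norm, codes


def decode(norm, codes, rot, bits):
    """Map codes to cell midpoints, undo the rotation, and rescale by the norm."""
    levels = 2 ** bits
    d = len(codes)
    y = [(k + 0.5) / levels * 2.0 - 1.0 for k in codes]
    unit = [sum(rot[i][j] * y[i] for i in range(d)) for j in range(d)]  # rot^T @ y
    return [norm * u for u in unit]
```

Each vector decodes independently from its own norm and fixed-width codes, which is the random-access property the abstract refers to. Because every coordinate is quantized on its own grid, the sketch ignores the joint geometry of a block of coordinates, which is the gap the abstract says FibQuant targets.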


LOFT: Low-Rank Orthogonal Fine-Tuning via Task-Aware Support Selection

arXiv.org Machine Learning

Orthogonal parameter-efficient fine-tuning (PEFT) adapts pretrained weights through structure-preserving multiplicative transformations, but existing methods often conflate two distinct design choices: the subspace in which adaptation occurs and the transformation applied within that subspace. This paper introduces LOFT, a low-rank orthogonal fine-tuning framework that explicitly separates these two components. By viewing orthogonal adaptation as a multiplicative subspace rotation, LOFT provides a unified formulation that recovers representative orthogonal PEFT methods, including coordinate-, butterfly-, Householder-, and principal-subspace-based variants. More importantly, this perspective exposes support selection as a central design axis rather than a byproduct of a particular parameterization. We develop a first-order analysis showing that useful adaptation supports should be informed by the downstream training signal, motivating practical task-aware support selection strategies. Across language understanding, visual transfer, mathematical reasoning, and multilingual out-of-distribution adaptation, LOFT recovers principal-subspace orthogonal adaptation while gradient-informed supports improve the efficiency-performance trade-off under matched parameter, memory, and compute budgets. These results suggest that principled support selection is an important direction for improving orthogonal PEFT.


Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation

arXiv.org Machine Learning

We introduce Pion, a spectrum-preserving optimizer for large language model (LLM) training based on orthogonal equivalence transformation. Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular values throughout training. This yields an optimization mechanism that modulates the geometry of weight matrices while keeping their spectral norm fixed. We derive the Pion update rule, systematically examine its design choices, and analyze its convergence behavior along with several key properties. Empirical results show that Pion offers a stable and competitive alternative to standard optimizers for both LLM pretraining and finetuning.
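The spectrum-preserving property can be checked on a toy case. The sketch below applies a left and a right rotation to a 2x2 weight matrix and confirms that the singular values do not change. It illustrates only the invariance under orthogonal equivalence, not the Pion update rule, which the abstract does not spell out. The helper names and the hand-picked rotation angles are hypothetical.

```python
import math


def rot(theta):
    """2x2 rotation matrix (an orthogonal matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def singular_values(w):
    """Singular values of a 2x2 matrix, largest first.

    Uses s1^2 + s2^2 = ||W||_F^2 and s1 * s2 = |det W|.
    """
    f2 = sum(x * x for row in w for x in row)
    det = abs(w[0][0] * w[1][1] - w[0][1] * w[1][0])
    disc = math.sqrt(max(f2 * f2 - 4.0 * det * det, 0.0))
    return (math.sqrt((f2 + disc) / 2.0),
            math.sqrt(max((f2 - disc) / 2.0, 0.0)))


def orthogonal_equivalence_step(w, theta_left, theta_right):
    """Return U W V with U, V rotations; the angles are arbitrary here.

    A real optimizer would choose them from gradient information.
    """
    return matmul(matmul(rot(theta_left), w), rot(theta_right))


w = [[3.0, 1.0], [0.0, 2.0]]
w_new = orthogonal_equivalence_step(w, 0.3, -0.7)
before = singular_values(w)
after = singular_values(w_new)
```

The entries of `w_new` differ from those of `w`, yet `before` and `after` agree up to floating-point error. An additive update of the form W + ΔW offers no such guarantee in general.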


Sam Altman says Elon Musk wanted 90 percent of OpenAI in high-stakes trial

Al Jazeera

In a United States court, OpenAI chief executive Sam Altman has rejected claims from fellow tech mogul Elon Musk that he betrayed the artificial intelligence company's original vision. Tuesday marked the start of Altman's testimony in a contentious trial unfolding in Oakland, California, between some of tech's richest and most powerful titans. Musk has alleged that OpenAI's leader persuaded him to invest $38m, based on a goal of improving humanity, only to see the company pivot to a for-profit venture in 2019. On the witness stand on Tuesday, Altman instead framed Musk as a competitor obsessed with exercising control over OpenAI. "It does not fit with my conception of the words 'stealing a charity' to look at what has actually happened here," Altman told the court.


Sam Altman defends OpenAI in courtroom showdown with Elon Musk

The Guardian

Sam Altman is questioned by OpenAI's attorney, Bill Savitt, before Yvonne Gonzalez Rogers, a US district judge, at a federal courthouse in Oakland, California, on 12 May 2026 in a courtroom sketch. The OpenAI CEO, Sam Altman, took the stand on Tuesday to defend himself and his company against a lawsuit by Elon Musk. Altman is set to be one of the final witnesses in the trial, which has pitted two of the tech industry's most powerful men against each other in a dramatic courtroom showdown. Musk has accused Altman and OpenAI of breaking the AI firm's founding agreement by restructuring it into a for-profit enterprise, alleging that Altman essentially swindled him into co-founding the company and providing tens of millions in financial backing.


Elon Musk said control of OpenAI should go to his children, Sam Altman tells jury

BBC News

Elon Musk tried to take control of OpenAI, even suggesting it could pass to his children when he dies, Sam Altman said on Tuesday. Altman is co-founder and chief executive of the artificial intelligence (AI) company behind ChatGPT. He is being sued by Musk, who accuses him of having looted a charity, given that OpenAI began as a non-profit. Appearing before a federal jury in Oakland, California, Altman said Musk not only backed the idea of OpenAI becoming a for-profit business, he wanted control of it for the long run. "A particularly hair-raising moment was when my cofounders asked, 'If you have control, what happens when you die?'"