Large Language Model
The Download: inside the Musk v. Altman trial, and AI for democracy
Plus: The Pentagon has struck sweeping AI deals for classified work. Week one of the Musk v. Altman trial: what it was like in the room Two of the most powerful figures in AI--Sam Altman and Elon Musk--are in the middle of a landmark legal showdown, with Musk alleging he was misled about OpenAI becoming a for-profit company. Our reporter Michelle Kim, who also happens to be a lawyer, has been in court each day, and has broken down the first week's key moments in her latest report. In a new Q&A, she also reveals what it was like in the room, the new details that have emerged about how Musk and OpenAI operate--and what we can expect from this week's proceedings. Find out what she's discovered so far, and if you want to keep up with MIT Technology Review's ongoing coverage of the Musk v. Altman trial, follow @techreview or @michelletomkim on X. Faster than many realize, AI is becoming the primary interface through which we form beliefs and participate in democratic self-governance. This shift could further strain already fragile institutions, but it could also help address problems like polarization and declining civic engagement.
He Couldn't Land a Job Interview. Was AI to Blame?
Armed with some Python and a white-hot sense of injustice, one medical student spent six months trying to figure out whether an algorithm trashed his job application. It was mid-October, peak leaf-peeping season in Hanover, New Hampshire, and Chad Markey was on a rare break between clinical rotations during his last year of medical school. He should have been inhaling Green Mountain air and gossiping with his Dartmouth classmates about life after graduation. In a few months, they'd all be going their separate ways to start residency training at hospitals around the country. Instead, Markey was alone in his apartment, deep down a rabbit hole, preparing to go to war. He'd wake each morning, eat breakfast, open his laptop at the kitchen table or settle into the tan armchair with the good back support, and start coding. Some days, he wouldn't notice the sun had gone down until one of his roommates came home and asked why the lights weren't on. For days, Markey had been scrolling through a Discord group about medical residency, a font of crowdsourced knowledge where students report back to their peers on every stage of the application and selection process. He'd watched as other students, lots of them, posted about the interview invitations they'd received.
Evaluating LLMs on Large-Scale Graph Property Estimation via Random Walks
With the rapidly improving reasoning abilities of Large Language Models (LLMs), there is also a rising demand to use them in a wide variety of domains. This brings about the need to carefully evaluate the limits of the capabilities of these models with various tests and benchmarks. Graph structures are ubiquitous in real-world data, and are often used to represent and analyze relationship patterns within data. Many benchmarks have already been proposed in the graph literature to test the reasoning ability of LLMs to follow and execute graph algorithms. However, due to the limited context length of LLMs, these benchmarks consist of very small graphs. In real-world data, the size of graphs can be significantly larger, and in many cases, not fully accessible. In this paper, we examine a class of problems that arises with very large graphs having limited accessibility. We propose a large graph benchmark dataset, EstGraph, and introduce four distinct tasks designed to estimate large graph properties. We evaluate the reasoning abilities of LLMs on these tasks using a wide variety of graph datasets. In addition, we provide task-specific prompt constructions based on random walk sampling of large graphs (up to millions of nodes) that effectively convey sufficient information to LLMs within the limits of context length.
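The abstract does not spell out the four tasks or the prompt format, so the following is only a minimal sketch of the general idea: estimating a property of a graph that can be explored solely through local neighbor queries. It uses a standard re-weighted random walk estimate of average degree as a stand-in task. The function names, the toy graphs, and the choice of task are assumptions for illustration, not the paper's method.

```python
import random


def random_walk(neighbors, start, steps, rng):
    """Simple random walk that only ever asks for the neighbors of the current node."""
    node = start
    visited = []
    for _ in range(steps):
        node = rng.choice(neighbors(node))
        visited.append(node)
    return visited


def estimate_average_degree(neighbors, start, steps, seed=0):
    """Estimate average degree from a walk (assumes a connected graph, no isolated nodes).

    A simple random walk visits nodes with probability proportional to degree,
    so the harmonic mean of the visited degrees corrects for that bias.
    """
    rng = random.Random(seed)
    walk = random_walk(neighbors, start, steps, rng)
    inverse_degree_sum = sum(1.0 / len(neighbors(v)) for v in walk)
    return len(walk) / inverse_degree_sum


def cycle_neighbors(n):
    """Hypothetical toy graph: a ring of n nodes, every node has degree 2."""
    return lambda v: [(v - 1) % n, (v + 1) % n]


def star_neighbors(num_leaves):
    """Hypothetical toy graph: hub node 0 connected to leaves 1..num_leaves."""
    def neighbors(v):
        return list(range(1, num_leaves + 1)) if v == 0 else [0]
    return neighbors


if __name__ == "__main__":
    # True average degree of a star with 4 leaves is 2 * 4 / 5 = 1.6.
    print(estimate_average_degree(star_neighbors(4), start=0, steps=1000))
```

A walk of a few thousand steps stays small enough to serialize into a prompt even when the underlying graph has millions of nodes, which is the constraint the abstract describes.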
Greg Brockman Defends $30B OpenAI Stake: 'Blood, Sweat, and Tears'
OpenAI's cofounder and president revealed in federal court on Monday that he's one of the largest individual stakeholders in the AI lab. Two days before the Musk v. Altman trial began, Elon Musk asked OpenAI cofounder and president Greg Brockman about reaching a settlement. When Brockman suggested both sides drop their claims, Musk responded, "By the end of this week, you and Sam [Altman] will be the most hated men in America. If you insist, so be it." The message--which OpenAI's lawyers made public on Sunday, and which Judge Yvonne Gonzalez Rogers subsequently refused to let the jury hear about--underscores what may be Musk's larger goal in this trial.
I love my new Codex AI pet -- and now I want one in every app
PCWorld explores OpenAI's new Codex AI pets, which provide visual status indicators for desktop AI agents through customizable on-screen companions. These pets address a key user experience issue by displaying red clocks when agent approval is needed and green checks upon task completion. The feature enhances multitasking efficiency by keeping users informed of AI agent activity without constant monitoring of the main interface. Whether I'm using Claude's desktop Cowork application or OpenAI's Codex coding app, I prefer that my AI agents check back with me before making high-stakes decisions. But while that makes for a safer setup, it also means my agents are often waiting around, twiddling their thumbs as they wait for me to approve their next steps. Now, if I'm sitting and watching the Cowork or Codex apps in action, I'll see right away when an agent is awaiting my approval. But if I'm working in another window or multitasking, I could easily miss the fact that an idled Cowork or Codex agent is sitting around, staring vacantly into space.
Uniform-Correct Policy Optimization: Breaking RLVR's Indifference to Diversity
Lochab, Anamika, Li, Bolian, Zhang, Ruqi
Reinforcement Learning with Verifiable Rewards (RLVR) has achieved substantial gains in single-attempt accuracy (Pass@1) on reasoning tasks, yet often suffers from reduced multi-sample coverage (Pass@K), indicating diversity collapse. We identify a structural cause for this degradation: common RLVR objectives, such as GRPO, are indifferent to how probability mass is distributed among correct solutions. Combined with stochastic training dynamics, this indifference induces a self-reinforcing collapse, in which probability mass concentrates on a narrow subset of correct outputs while alternative valid solutions are suppressed. We formalize this collapse mechanism and further characterize the optimal policy structure under two complementary criteria: robustness and entropy-regularized optimality, which identify the Uniform-Correct Policy as uniquely optimal. Motivated by this analysis, we propose Uniform-Correct Policy Optimization (UCPO), a modification to GRPO that adds a conditional uniformity penalty on the policy's distribution over correct solutions. The penalty redistributes gradient signal toward underrepresented correct responses, encouraging uniform allocation of probability mass within the correct set. Across three models (1.5B-7B parameters) and five mathematical reasoning benchmarks, UCPO improves Pass@K and diversity while maintaining competitive Pass@1, achieving up to +10% absolute improvement on AIME24 at Pass@64 and up to 45% higher equation-level diversity within the correct set. The code is available at https://github.com/AnamikaLochab/UCPO.
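To make the "indifference" claim concrete, here is a small sketch, not the authors' code. It computes group-normalized advantages in the usual GRPO style (reward minus group mean, divided by group standard deviation) and shows that with binary verifiable rewards every correct response receives the same advantage, however likely the policy already considers it. The population standard deviation, the `eps` constant, and the entropy helper are assumptions for illustration. The entropy function is not the UCPO penalty, whose exact form the abstract does not give.

```python
import math
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-8):
    """Group-normalized advantages: (r - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


def entropy(probs):
    """Shannon entropy (in nats) of a distribution, skipping zero entries."""
    return -sum(p * math.log(p) for p in probs if p > 0)


if __name__ == "__main__":
    # Two correct and two incorrect samples in one group (binary rewards).
    advantages = grpo_advantages([1.0, 1.0, 0.0, 0.0])
    # Both correct samples get an identical advantage, so nothing in this
    # signal favors a rare correct solution over a common one.
    print(advantages[0] == advantages[1])

    # Hypothetical distributions over four correct solutions.
    collapsed = [0.97, 0.01, 0.01, 0.01]
    uniform = [0.25, 0.25, 0.25, 0.25]
    # The uniform distribution has the higher entropy (log 4 is the maximum).
    print(entropy(collapsed) < entropy(uniform))
```

Because the advantage depends only on the reward, any sampling noise that makes one correct solution more frequent is free to compound, which is the self-reinforcing collapse the abstract describes.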
UK 'invention agency' grants £50m of public money to US tech and venture capital firms
OpenAI's Sam Altman, left, is a backer of Rain Neuromorphics, one of the companies that received funds from the UK's Aria, the brainchild of Dominic Cummings, right. Exclusive: Brainchild of Dominic Cummings, Aria is aimed at funding 'crazy' scientific projects to benefit the UK. Britain's "invention agency" has pledged £50m of UK taxpayer money to US tech companies and venture capital projects. Dreamed up by Dominic Cummings to fund "crazy" ideas, the Advanced Research and Invention Agency (Aria) is meant to "restore Britain's place as a scientific superpower". But a joint investigation by the Guardian and Democracy for Sale, an investigative website, has established that more than an eighth of the agency's £400m in research and development funding over the past two years has gone to 14 US tech companies and venture capital groups, in some cases with no clear return for the UK or Aria. One of these companies, Rain Neuromorphics, is also backed by the OpenAI chief executive, Sam Altman, and was reported to be near collapse last year, shortly after winning Aria money. It did not respond to a request for comment; two of its founders appear to have left the company.
OpenAI introduces AI-generated pets for its Codex app
Vibe coding just got a whole lot more adorable. OpenAI introduced AI-generated pets to the Codex app, its agentic tool that helps with coding. These optional animated companions don't do any coding themselves, but serve as a floating overlay that can tell you what Codex is working on and notify you when Codex completes a task or needs your input on something. The new feature lets developers see Codex's active thread without having to switch away from their current open app. Users can type /pet into the Codex app to summon or dismiss the companion.
Musk v. Altman week 1: Elon Musk says he was duped, warns AI could kill us all, and admits that xAI distills OpenAI's models
Musk kept his cool, and OpenAI's lawyer bulldozed him with piercing questions about his motivations for suing the company. In the first week of the landmark trial between Elon Musk and OpenAI, Musk took the stand in a crisp black suit and tie and argued that OpenAI CEO Sam Altman and president Greg Brockman had deceived him into bankrolling the company. Along the way, he warned that AI could destroy us all and sat through revelations that he had poached OpenAI employees for his own companies. He even confessed, to some audible gasps in the courtroom, that his own AI company, xAI, which makes the chatbot Grok, uses OpenAI's models to train its own. The federal courthouse in Oakland, California, was packed with armies of lawyers carrying boxes of exhibits, journalists typing away at their laptops, and a handful of concerned OpenAI employees. Outside, protesters lined the streets, carrying signs urging people to quit ChatGPT, boycott Tesla, or both.
OpenAI Enables Marketing Cookies by Default for Free ChatGPT Users
ChatGPT's new privacy policy states how the company uses cookies for tracking, to turn free users into paying subscribers. OpenAI is ready to target free users of its services with advertisements around the web, based on what it knows about them. On Thursday, OpenAI sent an email to users laying out major changes to the AI company's privacy policy in the US. "We'll now use cookies to promote OpenAI products and services on other websites," reads the email sent on April 30. "This does not impact your conversations in ChatGPT. Your conversations with ChatGPT are private and are not shared with marketing partners."