Large Language Model
Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving
When serving a single base LLM with several different LoRA adapters simultaneously, the adapters cannot simply be merged with the base model's weights, as adapter swapping would create overhead and requests using different adapters could not be batched. Rather, the LoRA computations have to be separated from the base LLM computations, and in a multi-device setup the LoRA adapters can be sharded in a way that is well aligned with the base model's tensor parallel execution, as proposed in S-LoRA. However, the S-LoRA sharding strategy incurs some communication overhead, which may be small in theory but can be large in practice. In this paper, we propose to constrain certain LoRA factors to be block-diagonal, which allows for an alternative way of sharding LoRA adapters that does not require any additional communication for the LoRA computations. We demonstrate in extensive experiments that our block-diagonal LoRA approach is about as parameter efficient as standard LoRA (i.e., for a similar number of parameters it achieves similar downstream performance) and that it leads to significant end-to-end speed-up over S-LoRA. For example, when serving on eight A100 GPUs, we observe up to 1.79x (1.23x) end-to-end speed-up with 0.87x (1.74x) the number of adapter parameters for Llama-3.1-70B, and up to 1.63x (1.3x) end-to-end speed-up with 0.86x (1.73x) the number of adapter parameters for Llama-3.1-8B.
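The linear-algebra property behind the block-diagonal constraint can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the paper's implementation: the helper names (`matmul`, `block_diag`, `sharded_apply`), the toy shapes, and the choice to treat the factor applied to an already-sharded input as the block-diagonal one are all hypothetical. The sketch shows only that multiplying by a block-diagonal matrix gives the same result as letting each device multiply its local input slice by its own block, so no cross-device reduction is needed for that step.

```python
# Illustrative sketch only. Names, shapes, and the choice of which factor
# is block-diagonal are assumptions, not taken from the paper.

def matmul(x, w):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def block_diag(blocks):
    """Assemble a dense block-diagonal matrix from a list of blocks."""
    rows = sum(len(b) for b in blocks)
    cols = sum(len(b[0]) for b in blocks)
    out = [[0.0] * cols for _ in range(rows)]
    r0 = c0 = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, v in enumerate(row):
                out[r0 + i][c0 + j] = v
        r0 += len(b)
        c0 += len(b[0])
    return out

def sharded_apply(x, blocks):
    """Simulate devices that each multiply their local input slice by their own block."""
    outputs = []
    c0 = 0
    for b in blocks:
        width = len(b)  # number of input features owned by this simulated device
        local_x = [row[c0:c0 + width] for row in x]
        outputs.append(matmul(local_x, b))  # purely local work, no communication
        c0 += width
    # Concatenate the per-device results along the feature axis.
    return [sum((o[i] for o in outputs), []) for i in range(len(x))]

# Toy setup: one request, four input features split over two simulated devices,
# each device holding a 2x1 block (two local features mapped to one rank dimension).
x = [[1.0, 2.0, 3.0, 4.0]]
blocks = [
    [[1.0], [2.0]],  # block held by device 0
    [[3.0], [4.0]],  # block held by device 1
]

dense_out = matmul(x, block_diag(blocks))  # single-device reference result
sharded_out = sharded_apply(x, blocks)     # per-device computation
print(dense_out, sharded_out)
```

With an unconstrained (dense) factor, each device would instead produce a partial sum over its input slice, and those partial sums would have to be combined across devices; that combining step is the kind of extra communication the abstract refers to.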
An AI solution to an 80‑year‑old problem has shocked mathematicians
Last week, OpenAI shocked the mathematical community by revealing that one of its internal artificial intelligence (AI) models had found a counterexample to a famous conjecture made by legendary Hungarian mathematician Paul Erdős in 1946. The planar unit distance problem, or Erdős problem 90, has intrigued mathematicians for decades. The new result is no mere curiosity. Canadian mathematician Daniel Litt described it as "the first result produced autonomously by an AI that I find interesting in itself". The breakthrough, produced with a general-purpose AI model rather than one specialised for mathematics, also highlights how AI is changing mathematical research itself.
OpenAI makes move to go public one week after rival Anthropic
OpenAI, founded in San Francisco in 2015 as a nonprofit research lab, burst into the mainstream with the launch of ChatGPT in November 2022. It has since restructured as a for-profit corporation. SAN FRANCISCO, UNITED STATES - ChatGPT-maker OpenAI on Monday took the first step toward going public, one week after archrival Anthropic announced its own filing, as both companies look to raise the massive sums needed to expand. In a social media post, the Sam Altman-led company said it had confidentially submitted an S-1 registration statement to U.S. securities regulators but had "not decided on timing yet" for any potential debut. OpenAI's move follows a confidential filing by Anthropic, the maker of the Claude chatbot, which announced last Monday that it had taken the same step.
OpenAI files SEC paperwork to go public
"We expect it to leak so we're just announcing it." Exactly a week after Anthropic announced its plan to go public, OpenAI has followed suit. The company said on Monday that it confidentially submitted an S-1 form with the Securities and Exchange Commission. OpenAI has not yet set a date or offer price for the initial public offering.
Google cuts the price of its AI Plus plan and doubles the storage
The subscription now starts at $5 per month. Google is lowering the cost of its cheapest AI subscription to make Gemini models even easier to access. The Google AI Plus plan will now cost $5 per month, according to a post from Vikas Kansal, the company's Product Lead focused on Gemini AI subscriptions, down from its original $8 per month price. It now also comes with double the storage, 400GB instead of 200GB. The subscription plan became available in January 2026 as a cheaper way to access Google's Gemini 3 Pro model, Nano Banana Pro and Deep Research.
You don't need to worry about recursive-self-improving AI – yet
One of the world's leading artificial intelligence companies has implored the industry to pause development on AI, because the latest models could be reaching a tipping point where they become capable of redesigning themselves, growing ever more powerful and finally escaping our control. At least, that's what the headlines said. In truth, Anthropic's co-founder Jack Clark and the boss of spin-out think-tank The Anthropic Institute, Marina Favaro, have published a long blog post bigging up the capabilities of their Claude model, shortly before the company floats on the stock exchange in an initial public offering (IPO) for a rumoured $1 trillion. Let's, for a moment, ignore the vast financial elephant in the room and look at the technological claims. An AI that becomes capable of designing a more powerful version of itself, which is in turn able to pull off the same feat, is an obvious gamechanger, but it is also not a new idea.
The Download: how the World Cup ball will fly and OpenAI's "super app"
Plus: OpenAI plans to turn ChatGPT into a "super app" before its IPO. Why this year's World Cup ball may not fly as far. Much is new about this month's FIFA World Cup tournament. It hosts more teams than ever before. It's the first to occur in three different host countries. And, like every World Cup for over half a century, it will employ a football with a brand-new design. Through wind-tunnel experiments, researchers found that long-distance kicks with Adidas's new Trionda ball might not travel as far as they did in the past.
Is Elon Musk's SpaceX Really Worth $1.75 Trillion?
The billionaire spent more than two decades creating a successful space company. Now he's pitching it as an A.I. play. Later this week, Elon Musk's SpaceX is expected to issue stock to investors in what is shaping up to be the biggest initial public offering ever. The company has said it will issue 555,555,555 shares at a price of $135, which would value it at about $1.75 trillion.
Instead of Taking Your Job, A.I. Might Transform It
Proponents and critics of artificial intelligence often compare the technology to industrial automation. Really, it's more like an intern. One summer during high school, I took a temporary job writing computer programs for a consulting firm. Each morning, I drove through rush-hour traffic to an office park near Princeton, New Jersey, on the crowded Route 1 corridor. At a desk in some sort of equipment room, I coded quick-and-dirty database tools for internal use. One of my programs simplified the process of logging hours into timesheets.
Elon Musk Is Dropping a Boulder in a Kiddie Pool
He is about to take SpaceX public, pushing other AI companies to do the same. Elon Musk is about to set in motion a chain of events that will reshape the global financial order. For starters, when SpaceX formally goes public next week, he is all but guaranteed to become the world's first trillionaire. His rocket company is targeting a valuation of $1.77 trillion, which would make it one of the 10 biggest companies in the world, bigger than Meta, Walmart, and, for that matter, Tesla. All of this activity is less about colonizing Mars and more about providing the infrastructure for the AI boom: Musk wants to use his rockets to launch data centers into space, where there is abundant solar power to harvest.