Large Language Model
The Download: cut through AI coding hype, and biotech trends to watch
AI coding is now everywhere. But not everyone is convinced. Depending on who you ask, AI-powered coding is either giving software developers an unprecedented productivity boost or churning out masses of poorly designed code that saps their attention and sets software projects up for serious long-term maintenance problems. The problem is that, right now, it's not easy to know which is true. As tech giants pour billions into large language models (LLMs), coding has been touted as the technology's killer app. Executives enamored with the potential are pushing engineers to lean into an AI-powered future.
Interview with Anindya Das Antar: Evaluating effectiveness of moderation guardrails in aligning LLM outputs
In their paper presented at AIES 2025, "Do Your Guardrails Even Guard?" Method for Evaluating Effectiveness of Moderation Guardrails in Aligning LLM Outputs with Expert User Expectations, Anindya Das Antar, Xun Huan, and Nikola Banovic propose a method to evaluate and select guardrails that best align LLM outputs with domain knowledge from subject-matter experts. Here, Anindya tells us more about their method, some case studies, and plans for future developments. Could you give us some background to your work - why are guardrails such an important area for study? Ensuring that large language models (LLMs) produce desirable outputs without harmful side effects and align with user expectations, organizational goals, and existing domain knowledge is crucial for their adoption in high-stakes decision-making. However, despite training on vast amounts of data, LLMs can still produce incorrect, misleading, or otherwise unexpected and undesirable outputs.
Run one prompt, get multiple AI answers at once--just $80 for lifetime access
ChatPlayground AI is on sale for $79.99 (MSRP $619) and lets you run one prompt across multiple AI models at once. If you use AI regularly, you already know the problem: different models give very different answers, and switching between platforms wastes time. ChatPlayground AI solves that by putting the world's top AI models into a single, side-by-side workspace. With ChatPlayground AI, you enter one prompt and instantly see how multiple models respond--making it easy to compare tone, accuracy, depth, and creativity in one view.
OpenAI and Microsoft lose last chance to avoid trial with Elon Musk
OpenAI and Microsoft failed to escape a trial over Elon Musk's claims that Sam Altman's startup betrayed its founding mission as a public charity when it took billions in funding from the software giant and made plans to operate as a for-profit business. A federal judge in Oakland, California, on Thursday rejected requests by OpenAI and Microsoft to dismiss claims by Musk and ordered the case to proceed to a jury trial set for late April. Musk helped Altman and others launch OpenAI in 2015 and went on to found his own artificial intelligence company in 2023. "Musk's lawsuit continues to be baseless and a part of his ongoing pattern of harassment, and we look forward to demonstrating this at trial," OpenAI said in a statement. "We remain focused on empowering the OpenAI Foundation, which is already one of the best resourced nonprofits ever."
On the origin of neural scaling laws: from random graphs to natural language
Barkeshli, Maissam, Alfarano, Alberto, Gromov, Andrey
Scaling laws have played a major role in the modern AI revolution, providing practitioners predictive power over how the model performance will improve with increasing data, compute, and number of model parameters. This has spurred an intense interest in the origin of neural scaling laws, with a common suggestion being that they arise from power law structure already present in the data. In this paper we study scaling laws for transformers trained to predict random walks (bigrams) on graphs with tunable complexity. We demonstrate that this simplified setting already gives rise to neural scaling laws even in the absence of power law structure in the data correlations. We further consider dialing down the complexity of natural language systematically, by training on sequences sampled from increasingly simplified generative language models, from 4-, 2-, and 1-layer transformer language models down to language bigrams, revealing a monotonic evolution of the scaling exponents. Our results also include scaling laws obtained from training on random walks on random graphs drawn from Erdős-Rényi and scale-free Barabási-Albert ensembles. Finally, we revisit conventional scaling laws for language modeling, demonstrating that several essential results can be reproduced using 2-layer transformers with context length of 50, provide a critical analysis of various fits used in prior literature, demonstrate an alternative method for obtaining compute optimal curves as compared with current practice in published literature, and provide preliminary evidence that maximal update parameterization may be more parameter efficient than standard parameterization.
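To make the "random walks (bigrams) on graphs" setting concrete, here is a minimal sketch of how such training sequences could be generated. It is not the authors' code: the function names, the graph size `n=50`, the edge probability `p=0.1`, the walk length, and the choice of a uniform step over neighbors are all assumptions made for illustration. The one point it is meant to show is that each next node depends only on the current node, which is what makes the sequence a bigram process.

```python
import random


def erdos_renyi_graph(n, p, rng):
    """Undirected G(n, p) graph: each possible edge is kept independently with probability p."""
    adj = {v: [] for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def random_walk(adj, length, rng):
    """Uniform random walk of `length` nodes (assumed step rule, not taken from the paper).

    The next node is drawn only from the neighbors of the current node,
    so the resulting token sequence is a first-order (bigram) process.
    """
    # Start from a node that has at least one neighbor; in an undirected
    # graph every node reached afterwards also has one (the previous node).
    starts = [v for v, nbrs in adj.items() if nbrs]
    if not starts:
        raise ValueError("graph has no edges")
    node = rng.choice(starts)
    walk = [node]
    for _ in range(length - 1):
        node = rng.choice(adj[node])
        walk.append(node)
    return walk


# Hypothetical sizes chosen only for illustration.
rng = random.Random(0)
graph = erdos_renyi_graph(n=50, p=0.1, rng=rng)
walk = random_walk(graph, length=20, rng=rng)
```

Each walk can be treated as one training sequence whose tokens are node indices. Swapping `erdos_renyi_graph` for another graph generator would change the structure of the data without changing the sampling loop.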
Step-by-Step Causality: Transparent Causal Discovery with Multi-Agent Tree-Query and Adversarial Confidence Estimation
Ding, Ziyi, Ye-Hao, Chenfei, Wang, Zheyuan, Zhang, Xiao-Ping
Causal discovery aims to recover "what causes what", but classical constraint-based methods (e.g., PC, FCI) suffer from error propagation, and recent LLM-based causal oracles often behave as opaque, confidence-free black boxes. This paper introduces Tree-Query, a tree-structured, multi-expert LLM framework that reduces pairwise causal discovery to a short sequence of queries about backdoor paths, (in)dependence, latent confounding, and causal direction, yielding interpretable judgments with robustness-aware confidence scores. Theoretical guarantees are provided for asymptotic identifiability of four pairwise relations. On data-free benchmarks derived from Mooij et al. and UCI causal graphs, Tree-Query improves structural metrics over direct LLM baselines, and a diet-weight case study illustrates confounder screening and stable, high-confidence causal conclusions. Tree-Query thus offers a principled way to obtain data-free causal priors from LLMs that can complement downstream data-driven causal discovery. Code is available at https://anonymous.4open.science/r/Repo-9B3E-4F96.
Meta's Layoffs Leave Supernatural Fitness Users in Mourning
Users of the VR fitness service are distraught that Supernatural has had its staff cut and won't receive any more content updates. I hear a stranger's heavy breathing through the rollicking dude-bro anthem blasting my eardrums, courtesy of the pop-rock band Imagine Dragons. Me and two people I just met are punching digital blocks that fly at our heads in the VR workout platform Supernatural. My new friends have nameplates floating above their heads that say Chip and Alisa. That's all I know about them.
Take an Extra $50 Off My Favorite Mattress With This Code
Life's Too Short to Put Up With a Bad Mattress--My Favorite Is on Sale. Nolah's Evolution is the mattress I personally sleep on as a certified sleep coach, and it's on sale pre-Presidents' Day. I'm a picky person when it comes to mattresses. Of the hundreds I've tested over my career as a certified sleep coach and mattress tester, I have a short list of ones that have exceeded my expectations. At the top of my list is the Nolah Evolution, with the distinct honor of being the mattress I personally own. Given my job, that's got to be like the equivalent of a Grammy or Emmy in the mattress world.
Inside OpenAI's Raid on Thinking Machines Lab
OpenAI is planning to bring over more researchers from Thinking Machines Lab after nabbing two cofounders, a source familiar with the situation says. If someone ever makes an HBO Max series about the AI industry, the events of this week will make quite the episode. On Wednesday, OpenAI's CEO of applications, Fidji Simo, announced the company had rehired Barret Zoph and Luke Metz, cofounders of Mira Murati's AI startup, Thinking Machines Lab. We reported last night on two narratives forming around what led to the departures, and have since learned new information. A source with direct knowledge says that Thinking Machines leadership believed Zoph engaged in an incident of serious misconduct while at the company last year.