 Large Language Model


The moment that kicked off the AI revolution

New Scientist

Has the technology lived up to its potential? The first time that AlphaGo revealed its full power, it prompted a visceral reaction. Lee Sedol, the world's greatest player of the ancient Chinese board game Go, had grown visibly agitated at the artificial intelligence's prowess. The hushed crowd in downtown Seoul, South Korea, could barely contain its gasps. It was quickly dawning on Lee, and the tens of millions watching at home, that this AI was different to those that had come before. It wasn't just beating Lee, but it was doing so with an almost human-like aptitude.


OpenAI Is Opening the Door to Government Spying

The Atlantic - Technology

Outside OpenAI's headquarters, a handful of people gathered on Monday holding pieces of colorful chalk. They got down on their knees and started writing messages on the sidewalk. Please no legal mass surveillance. At issue was a business deal that the company recently signed with the Department of Defense, following the Pentagon's sudden turn against Anthropic. OpenAI will now supply its technology to the military for use in classified settings, the sorts that may involve wartime decisions and intelligence-gathering--an agreement, many legal experts told me, that could give the government wide-ranging powers.


JUCAL: Jointly Calibrating Aleatoric and Epistemic Uncertainty in Classification Tasks

arXiv.org Machine Learning

We study post-calibration uncertainty for trained ensembles of classifiers. Specifically, we consider both aleatoric (label noise) and epistemic (model) uncertainty. Among the most popular and widely used calibration methods in classification are temperature scaling (i.e., pool-then-calibrate) and conformal methods. However, the main shortcoming of these calibration methods is that they do not balance the proportion of aleatoric and epistemic uncertainty. Not balancing these uncertainties can severely misrepresent predictive uncertainty, leading to overconfident predictions in some input regions while being underconfident in others. To address this shortcoming, we present a simple but powerful calibration algorithm Joint Uncertainty Calibration (JUCAL) that jointly calibrates aleatoric and epistemic uncertainty. JUCAL jointly calibrates two constants to weight and scale epistemic and aleatoric uncertainties by optimizing the negative log-likelihood (NLL) on the validation/calibration dataset. JUCAL can be applied to any trained ensemble of classifiers (e.g., transformers, CNNs, or tree-based methods), with minimal computational overhead, without requiring access to the models' internal parameters. We experimentally evaluate JUCAL on various text classification tasks, for ensembles of varying sizes and with different ensembling strategies. Our experiments show that JUCAL significantly outperforms SOTA calibration methods across all considered classification tasks, reducing NLL and predictive set size by up to 15% and 20%, respectively. Interestingly, even applying JUCAL to an ensemble of size 5 can outperform temperature-scaled ensembles of size up to 50 in terms of NLL and predictive set size, resulting in up to 10 times smaller inference costs. Thus, we propose JUCAL as a new go-to method for calibrating ensembles in classification.
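As a point of reference for the abstract above, the following is a minimal sketch of the pool-then-calibrate temperature-scaling baseline it mentions, not of JUCAL itself, whose exact formulation the abstract does not give. It averages the ensemble members' class probabilities, rescales the pooled log-probabilities by a single temperature, and picks that temperature by minimizing NLL on a calibration set. The function names, the grid search, and the grid range are illustrative assumptions.

```python
import math

def pool(member_probs):
    """Average class probabilities over ensemble members for one input."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(m[c] for m in member_probs) / n_members for c in range(n_classes)]

def temper(probs, temperature):
    """Divide log-probabilities by the temperature and renormalize."""
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

def nll(dataset, temperature):
    """Mean negative log-likelihood over (member_probs, label) pairs."""
    losses = [-math.log(temper(pool(mp), temperature)[y]) for mp, y in dataset]
    return sum(losses) / len(losses)

def fit_temperature(dataset, grid=None):
    """Return the grid temperature with the lowest calibration-set NLL."""
    if grid is None:
        grid = [k / 20 for k in range(5, 81)]  # assumed range: 0.25 to 4.0
    return min(grid, key=lambda t: nll(dataset, t))

# Hypothetical calibration set: two ensemble members, two classes.
calibration_set = [
    ([[0.9, 0.1], [0.8, 0.2]], 0),
    ([[0.9, 0.1], [0.7, 0.3]], 1),  # confidently wrong example
    ([[0.2, 0.8], [0.1, 0.9]], 1),
    ([[0.6, 0.4], [0.9, 0.1]], 0),
]
best_temperature = fit_temperature(calibration_set)
```

A single temperature rescales the pooled distribution as a whole, which is the limitation the abstract points to: it cannot separately adjust how much weight disagreement between members receives.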


OpenAI will reportedly release an AI-powered smart speaker in 2027

Engadget

The company is also said to be working on smart glasses and a smart lamp. OpenAI is reportedly hard at work developing a series of AI-powered devices, including smart glasses, a smart speaker and a smart lamp. The AI company reportedly has a team of over 200 employees dedicated to the project. The first product scheduled to be released is reported to be a smart speaker that would include a camera, allowing it to better absorb information about its users and surroundings. According to a person familiar with the project, this would extend to identifying objects on a nearby table, as well as conversations being held in the vicinity of the speaker.


AI hit: India hungry to harness US tech giants' technology at Delhi summit

The Guardian

From left: India's prime minister, Narendra Modi, with the chief executives of OpenAI, Sam Altman, and Anthropic, Dario Amodei, at the AI Impact summit in Delhi. Narendra Modi's thirst to supercharge economic growth is matched by US desire to inject AI into world's biggest democracy. India celebrates 80 years of independence from the UK in August 2027. At about that same moment, "early versions of true super intelligence" could emerge, Sam Altman, the co-founder of OpenAI, said this week. It's a looming coincidence that raised a charged question at the AI Impact summit in Delhi, hosted by India's prime minister, Narendra Modi: can India avoid returning to the status of a vassal state when it imports AI to raise the prospects of its 1.4 billion people? Modi's hunger to harness AI's capability is great.


India's AI Summit Brings Big Names, Little Impact

TIME - Tech

India's Prime Minister Narendra Modi takes a group photo with AI company leaders at the AI Impact Summit in New Delhi on Feb. 19, 2026. The world's largest-ever AI summit took place in India this week, with hundreds of thousands of people, including world leaders and CEOs of AI companies, descending upon New Delhi for five days. It was the fourth in a series of summits that were initially designed as a place for governments to coordinate global action in the face of threats from advanced AI. But the India summit, like one in Paris before it, functioned more as a trade fair and an advertisement for the host nation's AI prowess than a venue for meaningful international diplomacy.


India chases 'DeepSeek moment' with homegrown AI models

The Japan Times

Indian Prime Minister Narendra Modi takes a group photo with leaders of artificial intelligence companies at the AI Impact Summit in New Delhi on Thursday. But analysts said the country was unlikely to have a "DeepSeek moment" -- the sort of boom China had last year with a high-performance, low-cost chatbot -- any time soon. Still, building custom AI tools could bring benefits to the world's most populous nation.


Towards Anytime-Valid Statistical Watermarking

arXiv.org Machine Learning

The proliferation of Large Language Models (LLMs) necessitates efficient mechanisms to distinguish machine-generated content from human text. While statistical watermarking has emerged as a promising solution, existing methods suffer from two critical limitations: the lack of a principled approach for selecting sampling distributions and the reliance on fixed-horizon hypothesis testing, which precludes valid early stopping. In this paper, we bridge this gap by developing the first e-value-based watermarking framework, Anchored E-Watermarking, that unifies optimal sampling with anytime-valid inference. Unlike traditional approaches where optional stopping invalidates Type-I error guarantees, our framework enables anytime-valid inference by constructing a test supermartingale for the detection process. By leveraging an anchor distribution to approximate the target model, we characterize the optimal e-value with respect to the worst-case log-growth rate and derive the optimal expected stopping time. Our theoretical claims are substantiated by simulations and evaluations on established benchmarks, showing that our framework can significantly enhance sample efficiency, reducing the average token budget required for detection by 13-15% relative to state-of-the-art baselines.
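The anytime-valid stopping idea in the abstract above can be illustrated with a generic sequential test built from e-values. This is a sketch of the general technique only, not of Anchored E-Watermarking: how the per-token e-values are built from an anchor distribution is not described here, so the inputs below are placeholders. If each e-value has conditional expectation at most 1 under the null (human-written text), the running product is a nonnegative supermartingale, and by Ville's inequality the chance that it ever reaches 1/alpha is at most alpha, so stopping at the first crossing keeps the Type-I error controlled.

```python
import math

def sequential_e_test(e_values, alpha=0.05):
    """Accumulate per-token e-values and stop at the first threshold crossing.

    Returns (rejected, tokens_used). Rejecting means the text is flagged
    as watermarked at level alpha.
    """
    threshold = math.log(1.0 / alpha)
    log_wealth = 0.0
    tokens_used = 0
    for e in e_values:
        tokens_used += 1
        if e <= 0.0:
            # A zero e-value drives the product to zero for good.
            return False, tokens_used
        log_wealth += math.log(e)  # sum of logs avoids overflow in the product
        if log_wealth >= threshold:
            return True, tokens_used
    return False, tokens_used

# Hypothetical e-values: each token doubles the evidence against the null.
rejected, tokens_used = sequential_e_test([2.0] * 10, alpha=0.05)
```

A fixed-horizon test would have to read all ten tokens before deciding; here the decision can be made as soon as the accumulated evidence is sufficient.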


When to Trust the Cheap Check: Weak and Strong Verification for Reasoning

arXiv.org Machine Learning

Reasoning with LLMs increasingly unfolds inside a broader verification loop. Internally, systems use cheap checks, such as self-consistency or proxy rewards, which we call weak verification. Externally, users inspect outputs and steer the model through feedback until results are trustworthy, which we call strong verification. These signals differ sharply in cost and reliability: strong verification can establish trust but is resource-intensive, while weak verification is fast and scalable but noisy and imperfect. We formalize this tension through weak-strong verification policies, which decide when to accept or reject based on weak verification and when to defer to strong verification. We introduce metrics capturing incorrect acceptance, incorrect rejection, and strong-verification frequency. At the population level, we show that optimal policies admit a two-threshold structure and that calibration and sharpness govern the value of weak verifiers. Building on this, we develop an online algorithm that provably controls acceptance and rejection errors without assumptions on the query stream, the language model, or the weak verifier.
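The two-threshold structure described in the abstract above can be sketched as follows: accept when the weak verifier's score is high, reject when it is low, and defer to strong verification in between. The threshold values, the function names, and the way the three metrics are normalized (as fractions of all queries) are assumptions made for illustration; the abstract does not give the paper's exact definitions, and this fixed-threshold version is not the paper's online algorithm.

```python
def decide(weak_score, lower, upper):
    """Map a weak-verifier score to accept, reject, or defer."""
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if weak_score >= upper:
        return "accept"
    if weak_score <= lower:
        return "reject"
    return "defer"  # hand the query to strong verification

def evaluate(scores, is_correct, lower, upper):
    """Estimate the three error and cost rates on labeled examples."""
    decisions = [decide(s, lower, upper) for s in scores]
    n = len(scores)
    wrong_accepts = sum(d == "accept" and not ok for d, ok in zip(decisions, is_correct))
    wrong_rejects = sum(d == "reject" and ok for d, ok in zip(decisions, is_correct))
    deferrals = sum(d == "defer" for d in decisions)
    return {
        "incorrect_acceptance": wrong_accepts / n,
        "incorrect_rejection": wrong_rejects / n,
        "strong_verification_frequency": deferrals / n,
    }

# Hypothetical weak-verifier scores and ground-truth correctness labels.
rates = evaluate(
    scores=[0.95, 0.9, 0.5, 0.1, 0.05],
    is_correct=[True, False, True, False, True],
    lower=0.2,
    upper=0.8,
)
```

Widening the gap between the two thresholds lowers both error rates at the price of more frequent strong verification, which is the trade-off the policy is meant to manage.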


The Chinese AI app sending Hollywood into a panic

BBC News

A new artificial intelligence (AI) model developed by the Chinese company behind TikTok rocked Hollywood this week - not just because of what it can do, but what it could mean for creative industries. Created by tech giant ByteDance, Seedance 2.0 can generate cinema-quality video, complete with sound effects and dialogue, from just a few written prompts. Many of the clips said to have been made using Seedance, and featuring popular characters like Spider-Man and Deadpool, went viral. What is Seedance - and why the stir? Seedance was launched to little fanfare in June 2025 but it is the second version that came eight months later that has caused a major stir.