AITopics

Al Jazeera May-29-2026, 03:16:25 GMT

Anthropic soars to $965bn valuation, leapfrogging OpenAI

Anthropic has usurped OpenAI as the world's most valuable artificial intelligence startup, soaring to a $965bn valuation ahead of expected public listings by the rival firms. Anthropic, the maker of the Claude family of chatbots, said on Thursday that it had raised $65bn from private investors after a fundraising round led by Altimeter Capital, Greenoaks, Dragoneer and Sequoia Capital. "This funding will help us serve the historic demand we are experiencing, stay at the research frontier, and bring Claude to more of the places where work happens," Anthropic's Chief Financial Officer Krishna Rao said in a statement. Altimeter Capital CEO Brad Gerstner hailed the adoption of Claude among the "world's most demanding organisations" as evidence of Anthropic's command in the field. "This momentum positions Anthropic to lead the next phase of AI innovation and capture the enormous opportunity ahead," Gerstner said.

large language model, machine learning, natural language, (12 more...)

Al Jazeera

Country: North America > United States > California (0.16)

Genre: Financial News (0.36)

Industry:

Information Technology (0.73)
Consumer Products & Services > Restaurants (0.32)
Government > Regional Government > North America Government > United States Government (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.66)

The Japan Times May-29-2026, 01:49:00 GMT

Taiyo Yuden sees 'scary' levels of AI parts demand risking supply chain

Taiyo Yuden is fielding "scary" levels of demand for its high-end artificial intelligence server components, stretching capacity and increasing the risk of supply chain hiccups. The Tokyo-based company, which makes multilayer ceramic capacitors (MLCCs), will likely need to accelerate spending to expand output capacity, Chief Executive Officer Katsuya Sase said in an interview. MLCCs, which are tiny components that regulate and stabilize power flow in electronic devices, are becoming a growing bottleneck in the construction of artificial intelligence data centers. Taiyo Yuden and Murata Manufacturing account for the bulk of the world's supplies of high-end MLCCs. "The volumes we are seeing today -- it's scary," Sase said.

artificial intelligence, press release, social media, (11 more...)

The Japan Times

Country:

Asia > Middle East > Iran (0.41)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.29)

Genre: Press Release (0.71)

Industry:

Energy (0.92)
Law (0.80)
Information Technology > Services (0.60)
Semiconductors & Electronics (0.57)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.79)

The Japan Times May-29-2026, 01:37:00 GMT

BYD debuts China's most advanced EV chip in smart-driving push

BYD, the world's largest electric vehicle maker, on Thursday unveiled a series of technology advances, including what it calls China's first automotive-grade 4-nanometer chip for self-driving cars. The semiconductor breakthrough approaches the lead of Chinese tech giant Huawei Technologies, which currently makes chips with a geometry of 7 nm but has pledged to debut 1.4 nm chips by 2031. It's designed to allow BYD's computer-assisted driving to stand out in a crowded Chinese EV market that includes rivals such as Xpeng and Xiaomi. Facing eight months in a row of falling sales and intense competition for more advanced charging and intelligent driving technologies, BYD is looking to spark more demand for its vehicles.

artificial intelligence, iran war endgame philippines-japan summit, social media, (10 more...)

The Japan Times

Country:

Asia > Japan (0.73)
Asia > China (0.68)
Asia > Middle East > Iran (0.41)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
Automobiles & Trucks > Manufacturer (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.99)
Information Technology > Communications > Social Media (0.79)

The Japan Times May-29-2026, 00:43:00 GMT

Anthropic reaches near-trillion dollar valuation, topping OpenAI

Artificial intelligence company Anthropic said Thursday it had raised $65 billion in a new funding round that values the Claude maker at $965 billion, more than its archrival OpenAI, the maker of ChatGPT. The latest fundraising round confirms Anthropic's place as one of the most significant players in AI, with the startup led by Dario Amodei having drawn fans for its coding powers and state-of-the-art models. Anthropic's rise came by doubling down on delivering generative AI to enterprise clients rather than general users, the path initially chosen by OpenAI.

artificial intelligence, machine learning, natural language, (13 more...)

The Japan Times

Country:

Asia > Japan (0.88)
Asia > Middle East > Iran (0.43)

Industry:

Law (0.81)
Media > News (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Conf-Gen: Conformal Uncertainty Quantification for Generative Models

Loaiza-Ganem, Gabriel, Zhang, Kevin, Cui, Wei, Law, Marc T., Leung, Kin Kwan

Conformal prediction (CP) and its extension, conformal risk control (CRC), are established frameworks for quantifying uncertainty in supervised machine learning through formal guarantees. However, recent breakthroughs in artificial intelligence (AI) have been driven by unsupervised generative models, such as large language models (LLMs) and image generators, which are not directly compatible with CP or CRC. In this work we introduce conformal generation (Conf-Gen), a general framework adapting CRC to generative tasks while relaxing its theoretical assumptions. Conf-Gen unifies and generalizes previous attempts to apply CP to LLMs, and extends conformal methodology to entirely new domains. We demonstrate the flexibility of Conf-Gen through some novel applications, including obtaining conformal guarantees on: image generators producing non-memorized images, conversational AI systems having asked enough clarifying questions, and the output of AI agents being correct.
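As background for the conformal prediction (CP) framework the abstract builds on, here is a minimal sketch of standard split conformal prediction for regression. It is not the Conf-Gen method. The function names, the use of absolute residuals as nonconformity scores, and the synthetic calibration data are all assumptions made for illustration.

```python
import math
import numpy as np

def split_conformal_threshold(cal_scores, alpha):
    """Return the finite-sample conformal quantile of calibration scores.

    cal_scores: nonconformity scores from a held-out calibration set
    alpha: target miscoverage level (e.g. 0.1 for 90% coverage)
    """
    n = len(cal_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points: only the trivial set is valid.
        return float("inf")
    return float(np.sort(np.asarray(cal_scores, dtype=float))[k - 1])

def prediction_interval(point_prediction, threshold):
    """Symmetric interval around a point prediction (absolute-residual score)."""
    return point_prediction - threshold, point_prediction + threshold

# Hypothetical calibration residuals |y - f(x)|; synthetic, for illustration only.
rng = np.random.default_rng(0)
cal_scores = np.abs(rng.normal(size=199))

q = split_conformal_threshold(cal_scores, alpha=0.1)
lo, hi = prediction_interval(2.0, q)
print(f"threshold={q:.3f}, interval=({lo:.3f}, {hi:.3f})")
```

Under exchangeability of calibration and test points, an interval built this way covers the true label with probability at least 1 - alpha. That is the kind of formal guarantee the abstract refers to in the supervised setting.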

large language model, machine learning, natural language, (20 more...)

2605.2892

Country:

North America > United States (0.28)
North America > Canada (0.28)

Genre: Research Report (0.81)

Industry:

Energy > Renewable (0.93)
Government (0.93)
Health & Medicine > Therapeutic Area (0.92)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Dubey, Prasanjit, Huo, Xiaoming

Anytime-Valid Federated Conformal RAG for LLM Swarms

Federated Conformal RAG (FC-RAG) provides distribution-free coverage for a bandwidth-limited swarm of weak language models, but only at a fixed horizon. We extend it to anytime-valid sequential coverage: validity at every stopping time, preserved under predictable adaptive control (recalibration, per-node bandwidth escalation, distilled-student refresh), at no extra cost in assumptions over fixed-horizon FC-RAG. Naive composition fails because FC-RAG's marginal coverage bound makes the betting e-process a non-supermartingale on adverse calibration draws, and Ville's inequality cannot be invoked. We give Anytime-FC-RAG, a sequential extension built on a summable per-step calibration-deviation budget that converts the marginal bound into a strict conditional bound on a calibration-good event, paired with a truncated betting e-process that is a nonnegative supermartingale on the entire probability space. From these two ingredients, we obtain four guarantees: time-uniform alarm validity $\mathbb{P}(\sup_t E_t \ge 1/\delta_e) \le \delta_e + \delta_{\mathrm{cal}}$, a Hoeffding-stitched cumulative-miscoverage envelope at the same total budget, safety under any predictable controller (recalibration, bandwidth escalation, student refresh), and training-side error propagation across an unbounded sequence of Federated Probe-Logit Distillation (FPLD) refreshes via a summable training budget. As a practical consequence, an adaptive controller that escalates retrieval bandwidth only when the e-process crosses a warning threshold matches the alarm rate of a fixed-high-bandwidth schedule at substantially lower communication cost. Experiments on a GPT-2-small + MiniLM swarm across MMLU, DBpedia, and AG News verify the predicted alarm rate, detection delay, envelope coverage, and $14$--$57\%$ bandwidth savings; the alarm fires when and only when coverage genuinely breaks.
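For context, the Ville's inequality mentioned in the abstract is the standard time-uniform bound for nonnegative supermartingales. The first display below is the textbook statement. The second is only one plausible reading of how a two-term alarm bound of this shape can arise, via a union bound over a hypothetical calibration-good event $G$ (an assumption introduced here, with $\mathbb{P}(G^c) \le \delta_{\mathrm{cal}}$). It is not taken from the paper's proof.

```latex
% Ville's inequality: (E_t) is a nonnegative supermartingale with E[E_0] <= 1.
\[
  \mathbb{P}\Bigl(\sup_{t \ge 0} E_t \ge \tfrac{1}{\delta}\Bigr) \le \delta
  \qquad \text{for every } \delta \in (0,1).
\]

% Assumed for illustration: an event G with P(G^c) <= delta_cal such that
% the crossing probability restricted to G is at most delta_e.
\[
  \mathbb{P}\Bigl(\sup_t E_t \ge \tfrac{1}{\delta_e}\Bigr)
  \le \mathbb{P}\Bigl(\bigl\{\sup_t E_t \ge \tfrac{1}{\delta_e}\bigr\} \cap G\Bigr)
      + \mathbb{P}(G^c)
  \le \delta_e + \delta_{\mathrm{cal}}.
\]
```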

artificial intelligence, large language model, natural language, (18 more...)

2605.29139

Country:

North America > United States (0.28)
North America > Mexico (0.28)

Genre: Research Report (0.40)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)

Cesari, Tommaso, Colomboni, Roberto

Optimal Gap-Dependent Regret for Private Stochastic Decision-Theoretic Online Learning

We study stochastic decision-theoretic online learning with full information and event-level pure differential privacy. A COLT open problem of Hu and Mehta asks to determine the optimal gap-dependent regret rate for stochastic decision-theoretic online learning under pure event-level differential privacy. For $K$ actions, losses in $[0,1]$, and a unique best action separated from the second-best action by gap $\Delta_{\min}$, the known lower bound is of order $\frac{\log K}{\min\{\Delta_{\min},\varepsilon\}}$, or equivalently, up to universal constants, of order \[ \frac{\log K}{\Delta_{\min}}+\frac{\log K}{\varepsilon}. \] We give a horizon-free pure-DP algorithm and prove the explicit regret bound \[ \operatorname{Reg}_T \le 1000 \cdot \left(\frac{\log K}{\Delta_{\min}}+\frac{\log K}{\varepsilon}\right) \] for every horizon $T$. The numerical constant is not optimized. The algorithm partitions time into blocks of exponentially increasing size, plays a single action throughout each block, and chooses the next action by an exponential mechanism applied to a data-independent random prefix of the previous block. The random prefix converts block regret into a sum, over all prefix lengths, of softmax selection errors. A single entropy-potential argument controls all privacy-dominated large-gap actions at cost $\log K/\varepsilon$.
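The "equivalently, up to universal constants" step in the abstract follows from an elementary inequality. The short derivation below is a worked illustration rather than a quotation from the paper; the shorthand $a$ and $b$ is introduced here for convenience.

```latex
% Shorthand (introduced here): a = 1/\Delta_{\min} > 0 and b = 1/\varepsilon > 0.
\[
  \frac{1}{\min\{\Delta_{\min},\varepsilon\}}
  = \max\Bigl\{\frac{1}{\Delta_{\min}},\frac{1}{\varepsilon}\Bigr\}
  = \max\{a,b\},
  \qquad
  \max\{a,b\} \le a + b \le 2\max\{a,b\}.
\]
% Multiplying through by \log K gives a two-sided bound with constants 1 and 2.
\[
  \frac{\log K}{\min\{\Delta_{\min},\varepsilon\}}
  \le \frac{\log K}{\Delta_{\min}} + \frac{\log K}{\varepsilon}
  \le \frac{2\log K}{\min\{\Delta_{\min},\varepsilon\}}.
\]
```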

artificial intelligence, machine learning, privacy, (15 more...)

2605.29148

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > Online (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.92)

Das, Mohua, Beneventano, Pierfrancesco, Dey, Shibshankar, McKinley, Gareth H., Poggio, Tomaso

Do Deep Networks Forget Initialization? A Forgetting-Time View of Practical Inductive Bias

Randomly initialized neural networks induce a prior over functions, but the predictor used in practice is produced only after training. We ask how much of this initial bias survives the training pipeline. To make the question measurable, we introduce initialization memory: the dependence of the validation-selected predictor on the scale of the random initialization. We perform controlled CIFAR-10 experiments on ResNets where initialization memory already sharply separates training regimes. Low-learning-rate SGD can interpolate while still remembering its initialization: on ResNet-9 with batch size $b=128$, test accuracy varies by $26.5$ percentage points across initialization scales despite $\ge99.5\%$ training accuracy. This is not undertraining: extending the same low-learning-rate regime to $5{,}000$ epochs leaves the spread essentially unchanged. In contrast, Adam-family methods largely erase the dependence. SGD can also be made to forget when larger learning rates are paired with explicit $L_2$ norm control. We interpret these findings in terms of the time scale of forgetting: gradient-flow-like dynamics can preserve initialization memory, whereas stochastic finite-step effects, explicit norm decay, and adaptive preconditioning erase it on scales governed by the size of explicit or implicit regularization. The practical inductive bias of a trained network is therefore not the architectural prior alone, but the architectural prior after being filtered by the forgetting dynamics of the training pipeline; and the same regularizers that improve generalization are precisely those that erase memory of initialization.

artificial intelligence, international conference, machine learning, (15 more...)

2605.29152

Country: North America > United States (0.45)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

The Good, the Bad, and the Ugly of Markov Boundary for Tabular Prediction

Wan, Shu, Gorantla, Abhinav, Liu, Huan, Candan, K. Selçuk

Under standard graphical assumptions, the Markov boundary of a target variable is the smallest set of features that renders every other feature redundant. Once the boundary is observed, the target is conditionally independent of the rest of the table. This is a tempting object for tabular prediction, since it names exactly the columns a model should need. Yet modern regressors are still trained on the full feature set. We ask whether the Markov boundary is genuinely useful for prediction on SCM3K, a 3,450-task synthetic SCM benchmark with feature counts from 40 to 1000 and six SCM families, evaluated with six regressors. The answer is more nuanced than the theory suggests. Restricting a regressor to the oracle boundary often improves prediction substantially, and the improvement grows as the feature space becomes larger and sparser. But the natural pipeline of recovering the boundary with causal discovery and training on the recovered mask does not deliver. Existing estimators exhaust the compute budget before reaching the regime where the boundary helps most, and even where they run they rarely beat the full feature set. We trace this to three causes. Discovery optimizes structural recovery rather than prediction. False negatives and false positives carry sharply asymmetric predictive cost. The exact boundary is only one of many feature sets that beat all features. We then develop what these facts imply for prediction-aligned feature selection and for tabular models that learn to use causal structure.
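To make the object concrete: when the data-generating DAG is known and the standard graphical assumptions hold, the Markov boundary of a node consists of its parents, its children, and the other parents of its children. The sketch below computes that set for a small hypothetical graph. The function name, the graph encoding, and the example DAG are assumptions for illustration and are not drawn from the SCM3K benchmark or the paper's pipeline.

```python
def markov_boundary(parents, target):
    """Return the Markov boundary of `target` in a known DAG.

    parents: dict mapping every node to the set of its parent nodes.
    The boundary is parents + children + other parents of the children.
    """
    # Children are the nodes that list the target among their parents.
    children = {node for node, ps in parents.items() if target in ps}
    # Spouses are the co-parents of those children.
    spouses = set()
    for child in children:
        spouses |= set(parents[child])
    boundary = set(parents.get(target, ())) | children | spouses
    boundary.discard(target)  # the target is a parent of its own children
    return boundary

# Hypothetical DAG: A -> Y -> C <- B, C -> D, and an isolated node E.
dag = {
    "A": set(),
    "B": set(),
    "Y": {"A"},
    "C": {"Y", "B"},
    "D": {"C"},
    "E": set(),
}

print(sorted(markov_boundary(dag, "Y")))
```

In this toy graph the boundary of `Y` excludes `D` and `E`: once `A`, `B`, and `C` are observed, those columns carry no further information about `Y`. This is the oracle setting; the abstract's point is that recovering such a set from data is the hard part.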

artificial intelligence, boundary, machine learning, (15 more...)

2605.29411

Country: North America > United States > Arizona (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)