AITopics | budget


budget


State lawmakers cry foul over new cap placed on film tax credits

Los Angeles Times, Jul-13-2026, 01:10:56 GMT

More than three dozen California legislators are calling for Gov. Gavin Newsom to exempt the state's film and TV production incentive program from a recently approved cap on corporate tax credits, warning that without action it will be "significantly kneecapped."

artificial intelligence, hollywood inc, social media, (8 more...)

Los Angeles Times

Country: North America > United States > California (0.95)

Industry:

Government > Tax (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Law > Taxation Law (0.97)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)


The Papers: Burnham's 'bumper Budget' and Widdecombe murder 'not political'

BBC News, Jul-13-2026, 01:00:39 GMT

Andy Burnham is exploring holding an expanded Budget this autumn to set out strategic priorities, reports the Financial Times. New strikes on Iran by the US pose biggest test for interim deal, it headlines. As part of Burnham's Budget, the Telegraph reports he has a plan for a £38bn tax raid. The paper leads with the latest in the murder of Ann Widdecombe, saying that the suspect drove 300 miles to her house. The Metro also leads with the Widdecombe murder, focusing on police comments that the killing was not political.

artificial intelligence, image caption, widdecombe, (11 more...)

BBC News

Country:

Europe > United Kingdom (0.97)
North America (0.72)
Asia > Middle East > Iran (0.25)

Industry:

Leisure & Entertainment (0.99)
Law (0.92)
Government > Regional Government > Europe Government > United Kingdom Government (0.48)
Government > Regional Government > North America Government > United States Government (0.30)

Technology: Information Technology > Artificial Intelligence (0.32)


GRPO, Dr. GRPO, and DAPO Are Three Operations on One Number: The Group-Standard-Deviation Identity

Bay, Yong Yi, Yearick, Kathleen A.

arXiv.org Machine Learning, Jul-2-2026

Three of the most popular methods for training language models to reason look like three different tricks. They are not. All three adjust a single number: standard deviation, reflecting how much a prompt's sampled answers disagree. When such a model is trained, it answers each problem many times, and an automatic checker marks every answer right or wrong. The standard deviation of those marks measures the disagreement: largest when the answers split evenly between right and wrong, and zero when they all agree. Group Relative Policy Optimization (GRPO) divides by this number, GRPO Done Right (Dr. GRPO) drops the division, and Decoupled Clip and Dynamic Sampling Policy Optimization (DAPO) discards the groups where it is zero. Each is presented as its own fix, yet this paper proves they are three settings of one dial. That dial is not cosmetic: for right-or-wrong rewards, the disagreement is exactly the size of the training update, the group-standard-deviation identity. A split group teaches the most, while a unanimous group teaches nothing and falls silent. The same result says which problems deserve the most weight and how many tries each one needs. This paper confirms the intuition on a large real difficulty dataset (Big-Math) and in a controlled training run. What looks like a harmless normalization step is the dial that decides where learning happens and how strongly.
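The three operations described above can be illustrated with a minimal sketch for one prompt's group of right-or-wrong rewards. The function name `group_advantages`, the `mode` strings, and the `eps` guard are hypothetical, and the sketch assumes that groups surviving the DAPO filter keep the GRPO division; the abstract does not spell out these details.

```python
from statistics import mean, pstdev

def group_advantages(rewards, mode="grpo", eps=1e-8):
    """Advantages for one prompt's group of 0/1 rewards (illustrative only).

    mode="grpo":    centre on the group mean, then divide by the group std.
    mode="dr_grpo": centre on the group mean only (the division is dropped).
    mode="dapo":    return None for a zero-std group (discarded); otherwise
                    behave like "grpo" (an assumption of this sketch).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; sqrt(p * (1 - p)) for 0/1 rewards
    centred = [r - mu for r in rewards]
    if mode == "dr_grpo":
        return centred
    if mode == "dapo":
        if sigma == 0:
            return None  # unanimous group: nothing to learn from, so drop it
        return [c / sigma for c in centred]
    if mode == "grpo":
        return [c / (sigma + eps) for c in centred]
    raise ValueError(f"unknown mode: {mode}")

split = [1, 1, 0, 0]      # answers disagree evenly: largest std
unanimous = [1, 1, 1, 1]  # answers all agree: std is zero
for mode in ("grpo", "dr_grpo", "dapo"):
    print(mode, group_advantages(split, mode), group_advantages(unanimous, mode))
```

In this sketch the unanimous group yields all-zero advantages under the first two modes and is discarded outright under the third, which matches the abstract's remark that a unanimous group teaches nothing.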

grpo, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2607.00152

Country: North America > United States > Illinois (0.40)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)


Training for the Model You Return: Improving Optimization for Iterate-Averaged Language Models

Au, Kwok Chun, Block, Adam

arXiv.org Machine Learning, Jul-1-2026

Many modern Language Model (LM) pipelines return an averaged model, such as an exponential moving average of the training iterates, rather than the final iterate itself. This raises a fundamental question: given that we will return an iterate average, how should we change training to improve the performance of this average? We study this question by formulating optimizer design for the iterate-average estimator as an optimal-control problem. In a continuous-time stochastic quadratic model, we solve for the control strategy that minimizes the error of the returned average subject to a penalty on the size of the intervention. A practical approximation to this controller yields PACE, a lightweight wrapper around AdamW that pulls the live weights toward their exponential moving average with a clipped, per-coordinate control strength. We prove that a stylized version of PACE converges at the standard stochastic convex optimization rate, up to a factor depending on the averaging rule, while in the quadratic setting it can strictly improve the limiting squared error of the iterate-average estimator and can do so by an arbitrarily large factor on some instances. Empirically, our results suggest that PACE improves over AdamW and EMA-evaluated AdamW in supervised fine-tuning of 1-2B parameter LMs and in GPT-2 pretraining on FineWeb for a wide range of learning rates, decay schedules, and other hyperparameters.
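The idea of pulling live weights toward their exponential moving average can be sketched roughly as follows. This is not the paper's PACE algorithm: the function `ema_pull_step`, the order of operations, the clipping bound `max_strength`, and the way `strength` is supplied are all assumptions made for illustration, and the base optimizer's update is passed in as a plain list rather than computed by AdamW.

```python
def ema_pull_step(weights, ema, base_update, strength,
                  max_strength=0.1, ema_decay=0.99):
    """One illustrative step: base update, clipped pull toward the EMA, EMA refresh.

    All arguments except the two scalars are per-coordinate lists of floats.
    """
    new_weights, new_ema = [], []
    for w, m, u, k in zip(weights, ema, base_update, strength):
        k = min(max(k, 0.0), max_strength)  # clip the per-coordinate control strength
        w = w + u                           # step proposed by the base optimizer
        w = w - k * (w - m)                 # pull the live weight toward its EMA
        m = ema_decay * m + (1.0 - ema_decay) * w  # refresh the EMA with the new weight
        new_weights.append(w)
        new_ema.append(m)
    return new_weights, new_ema

# Hypothetical usage: one coordinate, no base update, requested strength above the clip.
w, m = ema_pull_step([1.0], [0.0], [0.0], [0.5])
print(w, m)
```

The clip keeps the intervention small, loosely mirroring the abstract's penalty on the size of the control; how the paper actually derives the control strength is not stated in the abstract.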

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2606.25086

Genre: Research Report > New Finding (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)


When More Sampling Hurts: The Modal Ceiling and Correlation Ceiling of Test-Time Scaling

Bay, Yong Yi, Yearick, Kathleen A.

arXiv.org Machine Learning, Jun-30-2026

People overthink; language models over-sample, and the extra effort can talk both into a worse answer. Reasoning systems answer a hard question by sampling it many times (test-time scaling), and the more they draw, the more often a correct answer turns up somewhere, so coverage, the fraction of problems with at least one correct try, climbs and appears to be progress. But a deployed system must return one answer, and choosing it, not knowing which try is right, is selection; selection is capped, and past a point extra samples only make the model surer of a confident mistake, even as every draw adds cost. The gap between climbing coverage and stalled selection, the identifiability gap, is the answer a model can produce but not pick. So the real question is not whether to sample but how far, and the answer is: not far. For picking an answer, the vote has already settled within a few dozen draws, the modal ceiling; for scoring a benchmark, sooner still, the correlation ceiling. Beyond that, extra draws cost compute and add nothing, and can even make the answer worse. This paper turns the cutoff into a single number, the effective number of samples, that any sampling run already reveals. The bottleneck is recognizing a right answer, not generating one.
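The gap between coverage and selection can be illustrated with a small simulation. Everything here is a toy assumption rather than the paper's setup: the answer distribution `probs` is hypothetical, samples are drawn independently, and selection is plain majority voting with random tie-breaking.

```python
import random
from collections import Counter

def simulate(k, answer_probs, correct="A", trials=2000, seed=0):
    """Estimate coverage and majority-vote accuracy when drawing k samples.

    Coverage: at least one of the k samples is the correct answer.
    Selection: the most frequent answer (ties broken at random) is correct.
    """
    rng = random.Random(seed)
    answers = list(answer_probs)
    weights = [answer_probs[a] for a in answers]
    covered = selected = 0
    for _ in range(trials):
        counts = Counter(rng.choices(answers, weights=weights, k=k))
        covered += correct in counts
        top = max(counts.values())
        winners = [a for a, c in counts.items() if c == top]
        selected += rng.choice(winners) == correct
    return covered / trials, selected / trials

# Hypothetical problem: the correct answer "A" appears in 30% of samples,
# while a confident mistake "B" is the most likely (modal) answer.
probs = {"A": 0.3, "B": 0.5, "C": 0.2}
for k in (1, 4, 16, 64):
    coverage, accuracy = simulate(k, probs)
    print(f"k={k:3d}  coverage={coverage:.2f}  majority-vote accuracy={accuracy:.2f}")
```

When the modal answer is wrong, as in this toy case, coverage rises with `k` while majority-vote accuracy falls, which is the pattern the abstract describes as extra samples making the model surer of a confident mistake.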

correlation, large language model, machine learning, (21 more...)

arXiv.org Machine Learning

2606.28661

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


ITSPACE: Monotone Gaussian Optimal Transport Updates

Na, Woojoo, Dy, Jennifer

arXiv.org Machine Learning, Jun-30-2026

Covariance matrices serve as compact descriptors of feature distributions in many machine-learning pipelines, including domain adaptation and Gaussian embeddings. Under a centered Gaussian approximation, the unregularized Wasserstein-2 optimal-transport (OT) discrepancy admits a closed form on covariances given by the Bures-Wasserstein (BW) objective on the symmetric positive definite (SPD) cone. We propose ITSPACE (Iterative Transport for Stable Proximal Alignment of Covariance Embeddings), a proximal majorization-minimization method that directly optimizes this exact BW objective through closed-form updates in a square-root factorization. In exact arithmetic, each iteration satisfies a sufficient-decrease inequality for the BW objective; under inexact polar computations, we provide an explicit certificate-gap bound controlling deviations from exact descent. The resulting iterations preserve PSD structure by construction and naturally support rank-restricted factors, making ITSPACE well-suited as a lightweight inner-loop primitive in settings where adaptation must be performed from unlabeled target batches under strict step and compute budgets. Across real-world covariance-alignment benchmarks, ITSPACE reaches low-BW-gap solutions substantially faster than BW-gradient descent, methods based on other covariance geometries, and entropically regularized sample-OT baselines.
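For reference, the closed form mentioned above is the standard squared Wasserstein-2 distance between two centered Gaussians. The symbols $A$ and $B$ for the two SPD covariance matrices are notation chosen here, not taken from the abstract.

```latex
\[
  W_2^2\bigl(\mathcal{N}(0, A),\, \mathcal{N}(0, B)\bigr)
  = \operatorname{tr}(A) + \operatorname{tr}(B)
  - 2\,\operatorname{tr}\!\Bigl(\bigl(A^{1/2} B A^{1/2}\bigr)^{1/2}\Bigr)
\]
```

When $A$ and $B$ commute this reduces to $\lVert A^{1/2} - B^{1/2} \rVert_F^2$, and it vanishes exactly when $A = B$.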

artificial intelligence, machine learning, objective, (16 more...)

arXiv.org Machine Learning

2606.30523

Country: Asia (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)


Japan to stop focusing on public works cost effectiveness

The Japan Times, Jun-29-2026, 07:05:00 GMT

Draft guidelines list expressways and some shinkansen lines as key areas of focus for infrastructure development. The government plans to stop putting too much focus on cost effectiveness when assessing the feasibility of public works projects, sources said Monday. The government will shift its focus to an overall assessment that places an emphasis on basic functions to protect lives and livelihoods, under annual guidelines for economic and fiscal reform policy, due out next month. Draft guidelines list expressways, some shinkansen lines and the planned Chuo Shinkansen high-speed magnetic levitation train line as key areas of focus for infrastructure development. The government will accelerate its push to build what are known as autoflow roads, or dedicated road lanes used by automated vehicles to transport goods. It will also promote the use of robots that can operate autonomously at construction sites to address labor shortages.

artificial intelligence, social media, world cup china falling yen, (10 more...)

The Japan Times

Country: Asia > Japan > Honshū > Kantō (0.16)

Industry:

Government (1.00)
Transportation > Passenger (0.80)

Technology:

Information Technology > Artificial Intelligence > Robots (0.91)
Information Technology > Communications > Social Media (0.78)


Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Neural Information Processing Systems, Jun-23-2026, 12:32:28 GMT

Chain-of-thought (CoT) reasoning in large language models (LLMs) can be formalized as a latent variable problem, where the model needs to generate intermediate reasoning steps. While prior approaches such as iterative reward-ranked fine-tuning (RAFT) have relied on such formulations, they typically apply uniform inference budgets across prompts, which fails to account for variability in difficulty and convergence behavior. This work identifies the main bottleneck in CoT training as inefficient stochastic gradient estimation due to static sampling strategies. We propose GVM-RAFT, a prompt-specific Dynamic Sample Allocation Strategy designed to minimize stochastic gradient variance under a computational budget constraint. The method dynamically allocates computational resources by monitoring prompt acceptance rates and stochastic gradient norms, ensuring that the resulting gradient variance is minimized. Our theoretical analysis shows that the proposed dynamic sampling strategy leads to accelerated convergence guarantees under suitable conditions. Experiments on mathematical reasoning show that GVM-RAFT achieves a 2-4x speedup and considerable accuracy improvements over vanilla RAFT. The proposed dynamic sampling strategy is general and can be incorporated into other reinforcement learning algorithms, such as GRPO, leading to similar improvements in convergence and test accuracy.
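Budget-constrained sample allocation can be sketched with a textbook stand-in. The code below is not GVM-RAFT's allocation rule, which the abstract does not give; it uses classical Neyman allocation, where minimizing the sum of `std_i**2 / n_i` under a fixed total leads to sample counts proportional to `std_i`. The function `allocate_samples`, the per-prompt floor, and the example noise values are hypothetical.

```python
def allocate_samples(stds, budget, min_per_prompt=1):
    """Split `budget` samples across prompts in proportion to their noise level.

    Each prompt first receives `min_per_prompt` samples; the remaining budget
    is shared in proportion to std_i, and largest-remainder rounding makes the
    integer counts sum exactly to `budget`.
    """
    n_prompts = len(stds)
    spare = budget - min_per_prompt * n_prompts
    if spare < 0:
        raise ValueError("budget too small for the per-prompt minimum")
    total = sum(stds)
    if total == 0:
        shares = [spare / n_prompts] * n_prompts  # no noise signal: split evenly
    else:
        shares = [spare * s / total for s in stds]
    counts = [int(x) for x in shares]             # round every share down first
    leftover = spare - sum(counts)
    by_remainder = sorted(range(n_prompts),
                          key=lambda i: shares[i] - counts[i], reverse=True)
    for i in by_remainder[:leftover]:             # hand out what rounding left over
        counts[i] += 1
    return [min_per_prompt + c for c in counts]

# Hypothetical per-prompt noise estimates: a hard-to-call prompt, an easier one,
# and one whose samples always agree.
print(allocate_samples([0.5, 0.3, 0.0], budget=35))
```

A uniform budget would give each of the three prompts the same share; here the prompt with no disagreement receives only the floor, and the rest of the budget goes where the estimated noise is.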

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.88)


Combining Cost-Constrained Runtime Monitors for AI Safety

Neural Information Processing Systems, Jun-23-2026, 01:48:19 GMT

Monitoring AIs at runtime can help us detect and stop harmful actions. In this paper, we study how to efficiently combine multiple runtime monitors into a single monitoring protocol. The protocol's objective is to maximize the probability of applying a safety intervention on misaligned outputs (i.e., maximize recall). Since running monitors and applying safety interventions are costly, the protocol also needs to adhere to an average-case budget constraint. Taking the monitors' performance and cost as given, we develop an algorithm to find the best protocol. The algorithm exhaustively searches over when and which monitors to call, and allocates safety interventions based on the Neyman-Pearson lemma. By focusing on likelihood ratios and strategically trading off spending on monitors against spending on interventions, we more than double our recall rate compared to a naive baseline in a code review setting. We also show that combining two monitors can Pareto dominate using either monitor alone. Our framework provides a principled methodology for combining existing monitors to detect undesirable behavior in cost-sensitive settings.
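The Neyman-Pearson idea behind allocating interventions can be sketched for a single monitor with discrete scores. This is a simplification, not the paper's algorithm: it ignores monitor costs and the search over which monitors to call, and treats the budget as the fraction of aligned outputs that may receive an intervention. The function `neyman_pearson_policy` and the score distributions are hypothetical.

```python
def neyman_pearson_policy(p_misaligned, p_aligned, fp_budget):
    """Intervention probability per monitor score under a false-positive budget.

    p_misaligned[s] and p_aligned[s] are the probabilities that the monitor
    emits score s on misaligned and aligned outputs. Scores are flagged in
    decreasing order of likelihood ratio; the boundary score is flagged with a
    fractional probability so that the budget is spent exactly.
    """
    def ratio(s):
        return p_misaligned[s] / p_aligned[s] if p_aligned[s] > 0 else float("inf")

    policy = {s: 0.0 for s in p_aligned}
    remaining = fp_budget
    for s in sorted(p_aligned, key=ratio, reverse=True):
        if p_aligned[s] <= remaining:
            policy[s] = 1.0                      # always intervene on this score
            remaining -= p_aligned[s]
        else:
            policy[s] = remaining / p_aligned[s]  # randomize at the boundary
            break
    recall = sum(policy[s] * p_misaligned[s] for s in policy)
    return policy, recall

# Hypothetical score distributions for a three-level monitor.
p_mis = {"low": 0.1, "mid": 0.3, "high": 0.6}
p_ok = {"low": 0.7, "mid": 0.25, "high": 0.05}
policy, recall = neyman_pearson_policy(p_mis, p_ok, fp_budget=0.1)
print(policy, recall)
```

Ranking by likelihood ratio rather than by raw score is what makes the rule optimal for a fixed budget in this single-monitor setting; the paper's protocol additionally trades this spending off against the cost of running the monitors.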

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
(2 more...)


Compute-Optimal Scaling for Value-Based Deep RL

Neural Information Processing Systems, Jun-23-2026, 01:22:05 GMT

As models grow larger and training them becomes expensive, it becomes increasingly important to scale training recipes not just to larger models and more data, but to do so in a compute-optimal manner that extracts maximal performance per unit of compute. While such scaling has been well studied for language modeling, reinforcement learning (RL) has received less attention in this regard. In this paper, we investigate compute scaling for online, value-based deep RL. These methods present two primary axes for compute allocation: model capacity and the update-to-data (UTD) ratio. Given a fixed compute budget, we ask: how should resources be partitioned across these axes to maximize data efficiency? Our analysis reveals a nuanced interplay between model size, batch size, and UTD. In particular, we identify a phenomenon we call TD-overfitting: increasing the batch size quickly harms Q-function accuracy for small models, but this effect is absent in large models, enabling effective use of large batch sizes at scale. We provide a mental model for understanding this phenomenon and build guidelines for choosing batch size and UTD to optimize compute usage. Our findings provide a grounded starting point for compute-optimal scaling in deep RL, mirroring studies in supervised learning but adapted to TD learning.

machine learning, natural language, reinforcement learning, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
