slowdown
The US economy is growing - so where are all the jobs?

BBC News

When 42-year-old Jacob Trigg lost his job as a project manager in the tech industry, he didn't think it would take too long to find a new one - he always had before. But more than 2,000 job applications later he is still hunting, trying to make ends meet with jobs in package delivery and landscaping. "It's a huge surprise because I've always been able to get a job very easily," said Trigg, who lives in Texas. "It wasn't even on my radar to be prepared for more than six months of unemployment."


Batch Normalization Orthogonalizes Representations in Deep Random Networks

Neural Information Processing Systems

More precisely, under a mild assumption, we prove that the deviation of the representations from orthogonality rapidly decays with depth, up to a term inversely proportional to the network width. This result has two main implications: 1) Theoretically, as the depth grows, the distribution of the representation (after the linear layers) contracts to a Wasserstein-2 ball around an isotropic Gaussian distribution.
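The orthogonalization effect is easy to observe numerically. The sketch below is a toy check, not the paper's setup: the widths, initialization, starting inputs, and gap metric are all illustrative assumptions. It stacks random linear layers with batch normalization and tracks how far the batch's Gram matrix is from (scaled) identity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, depth = 16, 256, 50          # batch size, width, depth (assumed values)

def batch_norm(h):
    # zero mean / unit variance per feature, computed across the batch
    h = h - h.mean(axis=0, keepdims=True)
    return h / (h.std(axis=0, keepdims=True) + 1e-8)

def orthogonality_gap(h):
    # Frobenius distance between the normalized Gram matrix of the batch
    # and a scaled identity: zero iff the rows of h are orthogonal
    g = h @ h.T
    return np.linalg.norm(g / np.linalg.norm(g) - np.eye(n) / np.sqrt(n))

# start from nearly parallel inputs, i.e., far from orthogonal
h = rng.standard_normal(d) + 0.1 * rng.standard_normal((n, d))
gaps = [orthogonality_gap(h)]
for _ in range(depth):
    w = rng.standard_normal((d, d)) / np.sqrt(d)   # random linear layer
    h = batch_norm(h @ w)
    gaps.append(orthogonality_gap(h))
# the gap shrinks rapidly with depth, down to a width-dependent floor
```

With these toy settings the gap falls from near its maximum to a small residual, consistent with the abstract's "decays with depth up to a term inversely proportional to the network width."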



Tesla sales jump as buyers scramble before EV tax credit expires

Al Jazeera

Tesla sales have surged in the third quarter as buyers in the United States rushed to take advantage of electric vehicle (EV) tax credits that were eliminated under President Donald Trump's sweeping tax bill passed this year. On Thursday, the automaker reported a 7.4 percent increase in sales compared with the same period last year as demand was driven by customers looking to buy before the credits officially expired at the end of September. Tesla also delivered 481,166 units of its Model 3 compact sedan and Model Y crossover in the quarter, well above Wall Street expectations. The Elon Musk-led carmaker frequently talked up the expiry of the tax credits, using it alongside discounts and financing deals to spur sales and leases of its EVs. Investors are worried because sales are now expected to slump as the $7,500 federal tax credit disappears.


How to measure the returns on R&D spending

MIT Technology Review

Forget the glorious successes of past breakthroughs--the real justification for research investment is what we get for our money. Given the draconian cuts to US federal funding for science, including the administration's proposal to reduce the 2026 budgets of the National Institutes of Health by 40% and the National Science Foundation by 57%, it's worth asking some hard-nosed money questions: How much should we be spending on R&D? How much value do we get out of such investments, anyway? To answer that, it's important to look at both successful returns and investments that went nowhere.

How Trump's policies are affecting early-career scientists--in their own words: Every year, we recognize extraordinary young researchers on our Innovators Under 35 list. Recent honorees told us how they're faring under the new administration.


NEXICA: Discovering Road Traffic Causality (Extended arXiv Version)

Srikanth, Siddharth, Krumm, John, Qin, Jonathan

arXiv.org Artificial Intelligence

Road traffic congestion is a persistent problem. Focusing resources on the causes of congestion is a potentially efficient strategy for reducing slowdowns. We present NEXICA, an algorithm to discover which parts of the highway system tend to cause slowdowns on other parts of the highway. We use time series of road speeds as inputs to our causal discovery algorithm. Finding other algorithms inadequate, we develop a new approach that is novel in three ways. First, it concentrates on just the presence or absence of events in the time series, where an event indicates the temporal beginning of a traffic slowdown. Second, we develop a probabilistic model using maximum likelihood estimation to compute the probabilities of spontaneous and caused slowdowns between two locations on the highway. Third, we train a binary classifier to identify cause/effect location pairs, using training pairs of road locations whose causal connections, both positive and negative, we are reasonably certain of a priori. We test our approach on six months of road speed data from 195 highway speed sensors in the Los Angeles area, showing that our approach is superior to state-of-the-art baselines in both accuracy and computation speed.
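The second ingredient can be sketched on synthetic data. The generative model, lag window, and grid search below are illustrative assumptions, not NEXICA's actual formulation: an event at downstream site B fires either spontaneously (probability ps) or because an upstream event at A occurred within a short lag window (probability pc), and we recover both probabilities by maximum likelihood.

```python
import numpy as np

rng = np.random.default_rng(1)
T, lag = 5000, 3                       # time steps; causal lag window (assumed)
p_spont, p_cause = 0.02, 0.6           # ground truth to recover

a = rng.random(T) < 0.05               # slowdown onsets at upstream site A
# did A have an event within the last `lag` steps?
trigger = np.array([a[max(0, t - lag):t].any() for t in range(T)])
b = (trigger & (rng.random(T) < p_cause)) | (rng.random(T) < p_spont)

def neg_log_lik(ps, pc):
    # event at B if spontaneous OR caused by a recent event at A
    p = 1 - (1 - ps) * np.where(trigger, 1 - pc, 1.0)
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -(np.log(p)[b].sum() + np.log(1 - p)[~b].sum())

grid = np.linspace(0.005, 0.95, 64)    # coarse grid-search MLE
ps_hat, pc_hat = min(((ps, pc) for ps in grid for pc in grid),
                     key=lambda x: neg_log_lik(*x))
```

On this synthetic trace the estimates land close to the true (0.02, 0.6); in the paper's setting such per-pair probabilities would then feed the classifier of cause/effect location pairs.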


Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

Becker, Joel, Rush, Nate, Barnes, Elizabeth, Rein, David

arXiv.org Artificial Intelligence

Despite widespread adoption, the impact of AI tools on software development in the wild remains understudied. We conduct a randomized controlled trial (RCT) to understand how AI tools at the February-June 2025 frontier affect the productivity of experienced open-source developers. 16 developers with moderate AI experience complete 246 tasks in mature projects on which they have an average of 5 years of prior experience. Each task is randomly assigned to allow or disallow usage of early 2025 AI tools. When AI tools are allowed, developers primarily use Cursor Pro, a popular code editor, and Claude 3.5/3.7 Sonnet. Before starting tasks, developers forecast that allowing AI will reduce completion time by 24%. After completing the study, developers estimate that allowing AI reduced completion time by 20%. Surprisingly, we find that allowing AI actually increases completion time by 19%--AI tooling slowed developers down. This slowdown also contradicts predictions from experts in economics (39% shorter) and ML (38% shorter). To understand this result, we collect and evaluate evidence for 20 properties of our setting that a priori could contribute to the observed slowdown effect--for example, the size and quality standards of projects, or prior developer experience with AI tooling. Although the influence of experimental artifacts cannot be entirely ruled out, the robustness of the slowdown effect across our analyses suggests it is unlikely to primarily be a function of our experimental design.


Utility-Driven Speculative Decoding for Mixture-of-Experts

Saxena, Anish, Tsai, Po-An, Taneja, Hritvik, Jaleel, Aamer, Qureshi, Moinuddin

arXiv.org Artificial Intelligence

GPU memory bandwidth is the main bottleneck for low-latency Large Language Model (LLM) inference. Speculative decoding leverages idle GPU compute by using a lightweight drafter to propose K tokens, which the LLM verifies in parallel, boosting token throughput. In conventional dense LLMs, all model weights are fetched each iteration, so speculation adds no latency overhead. Emerging Mixture of Experts (MoE) models activate only a subset of weights per token, greatly reducing data movement. However, we show that speculation is ineffective for MoEs: draft tokens collectively activate more weights, increasing data movement and verification time by 2-3x. When token throughput gains fail to offset this overhead, speculation causes slowdowns of up to 1.5x, making it infeasible. Even when useful, the optimal K varies by task, model, and even between requests and iterations. Thus, despite widespread use in dense LLMs, speculation remains impractical in leading MoEs. We present Cascade, a utility-driven framework that selectively enables speculation to avoid slowdowns and dynamically tunes K to accelerate MoE serving. Cascade uses a lightweight metric, speculation utility (the ratio of token gains to verification cost), which shows iteration-level locality, enabling periodic decisions via short "test" and longer "set" phases. For each request, Cascade disables speculation if utility drops below one during testing; when utility exceeds one, it tests multiple K values to choose the utility-maximizing K for the set phase. We implement Cascade in vLLM and evaluate it on five popular MoEs with workloads spanning code, math, extraction, and mixed tasks. Cascade limits slowdown to 5% (vs. 1.5x) and improves throughput by 7-14% over static K, making speculative decoding practical for MoEs.
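The test-phase decision rule can be sketched in a few lines. This is an illustrative controller in the spirit of Cascade, not the paper's implementation: the candidate K values, probe length, and the `measure` interface are all assumptions.

```python
CANDIDATE_KS = (1, 2, 4, 8)    # draft lengths to probe (assumed)
TEST_ITERS = 4                 # iterations spent probing each K (assumed)

def choose_k(measure):
    """Pick the draft length K for the next 'set' phase.

    measure(k) returns the observed speculation utility for draft length k:
    (token gains per iteration) / (verification cost). Utility <= 1 means
    speculation is not paying for its verification overhead.
    """
    best_k, best_u = None, 1.0          # None disables speculation
    for k in CANDIDATE_KS:
        u = sum(measure(k) for _ in range(TEST_ITERS)) / TEST_ITERS
        if u > best_u:
            best_k, best_u = k, u
    return best_k
```

For a workload whose measured utility peaks above one at K=4, `choose_k` selects 4 for the set phase; if no candidate exceeds utility one, it returns None and the request decodes without speculation, avoiding the slowdown case described above.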


Understanding Stragglers in Large Model Training Using What-if Analysis

Lin, Jinkun, Jiang, Ziheng, Song, Zuquan, Zhao, Sida, Yu, Menghan, Wang, Zhanghan, Wang, Chenyuan, Shi, Zuocheng, Shi, Xiang, Jia, Wei, Liu, Zherui, Wang, Shuguang, Lin, Haibin, Liu, Xin, Panda, Aurojit, Li, Jinyang

arXiv.org Artificial Intelligence

Large language model (LLM) training is one of the most demanding distributed computations today, often requiring thousands of GPUs with frequent synchronization across machines. Such a workload pattern makes it susceptible to stragglers, where training can be stalled by a few slow workers. At ByteDance we find that stragglers are not always caused simply by hardware failures, but can arise from multiple complex factors. This work presents a comprehensive study of straggler issues in LLM training, using a five-month trace collected from our ByteDance LLM training cluster. The core methodology is what-if analysis, which simulates the scenario without any stragglers and contrasts it with the actual case. We use this method to study the following questions: (1) how often do stragglers affect training jobs, and what effect do they have on job performance; (2) do stragglers exhibit temporal or spatial patterns; and (3) what are the potential root causes of stragglers?
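The what-if idea reduces to a simple counterfactual on a step-time trace. The toy numbers and the median substitution below are illustrative assumptions, not ByteDance's tooling: with per-step synchronization every step costs the slowest worker's time, and the what-if run replaces each step's worker times with that step's median, approximating the same job without stragglers.

```python
import numpy as np

step_times = np.array([
    [1.0, 1.0, 1.1, 1.0],   # step durations (s) across 4 workers
    [1.0, 3.0, 1.0, 1.1],   # worker 1 straggles on this step
    [1.0, 1.0, 1.0, 2.5],   # worker 3 straggles on this step
])
actual = step_times.max(axis=1).sum()          # synchronized: max per step
what_if = np.median(step_times, axis=1).sum()  # hypothetical straggler-free run
slowdown = actual / what_if                    # straggler-attributed slowdown
```

Here two single-worker stalls more than double the job time relative to the what-if run, which is exactly the kind of gap the study quantifies at cluster scale.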