ParaScopes: What do Language Models Activations Encode About Future Text?
Pochinkov, Nicky, Volkova, Yulia, Vasileva, Anna, Chereddy, Sai V R
Interpretability studies of language models often investigate forward-looking representations in activations. However, as language models become capable of ever longer time-horizon tasks, methods for understanding activations often remain limited to testing specific concepts or tokens. We develop a framework of Residual Stream Decoders as a method of probing model activations for paragraph-scale and document-scale plans. We test several methods and find that information equivalent to 5+ tokens of future context can be decoded from small models. These results lay the groundwork for better monitoring of language models and a better understanding of how they might encode longer-term planning information.
- North America > United States > Alaska (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (6 more...)
- Health & Medicine > Therapeutic Area (0.68)
- Banking & Finance > Economy (0.46)
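A minimal, illustrative sketch of the kind of residual-stream probe the ParaScopes abstract above describes: a linear decoder is trained to map a model's residual-stream activation at a paragraph boundary onto an embedding of the text that follows. The probe architecture, dimensions, loss, and training loop here are assumptions for illustration, not the paper's actual setup.

```python
# Sketch of a "residual stream decoder": a linear probe from residual-stream
# activations to embeddings of the upcoming paragraph (illustrative only).
import torch
import torch.nn as nn

d_model, d_embed, n_examples = 1024, 768, 4096

# Stand-ins for real data: activations taken from the residual stream at a
# paragraph break, and embeddings of the paragraph that actually followed.
acts = torch.randn(n_examples, d_model)
future_embeds = torch.randn(n_examples, d_embed)

probe = nn.Linear(d_model, d_embed)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)

for step in range(200):
    opt.zero_grad()
    pred = probe(acts)
    # Cosine loss: we only care whether the decoded direction matches the
    # embedding of the future paragraph, not its norm.
    loss = 1 - nn.functional.cosine_similarity(pred, future_embeds).mean()
    loss.backward()
    opt.step()

# At evaluation time, probe(activation) can be compared against candidate
# continuations to test how much forward-looking information was present.
```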
The Feasibility of Training Sovereign Language Models in the Global South: A Study of Brazil and Mexico
Malagon, Sandra, Ruiz, Monica A. Ulloa, Plaza, Tatiana Elizabeth Sandoval, Bolívar, Gabriel Rafael Rosario, Mesa, Valentina García, Morales, Ivanna Alvarado
The rapid escalation of computational requirements for training large-scale language models has reinforced structural asymmetries between high-capacity jurisdictions and countries in the Global South. This paper examines the technical and fiscal feasibility of sovereign-scale language model training in Brazil and Mexico under conditions of constrained hardware access, energy availability, and fiscal ceilings. Using a dual-axis design that varies accelerator generation (NVIDIA H100 vs. A100) and training duration (90 vs. 150 days), we estimate compute demand, energy consumption, capital expenditures, and regulatory compatibility for the training of a 10-trillion-token model. Our findings show that while all configurations remain below export-control and electrical infrastructure thresholds, fiscal viability is determined by hardware efficiency. H100-based scenarios achieve training feasibility at a total cost of 8-14 million USD, while A100 deployments require 19-32 million USD due to higher energy and hardware demand. We argue that extending training timelines should be treated as a policy lever to mitigate hardware constraints, enabling the production of usable, auditable, and locally aligned models without competing at the global frontier. This study contributes to the discourse on AI compute governance and technological sovereignty by highlighting context-sensitive strategies that allow middle-income countries to establish sustainable and strategically sufficient AI capabilities.
- North America > United States (0.14)
- South America > Brazil > São Paulo (0.05)
- North America > Mexico > Querétaro (0.05)
- (4 more...)
- Law (1.00)
- Energy > Power Industry (1.00)
- Government > Commerce (0.90)
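A back-of-envelope version of the dual-axis estimate the abstract above describes. The 10-trillion-token corpus and the 90/150-day horizons come from the abstract; the assumed model size, per-GPU throughput, utilization, and power draw are placeholders, so the outputs illustrate the method rather than reproduce the paper's 8-14 and 19-32 million USD figures.

```python
# Rough compute/energy sizing for training a 10T-token model under two
# accelerator generations and two training durations (all inputs assumed).
TOKENS = 10e12
PARAMS = 70e9                       # assumed dense model size
FLOPS_NEEDED = 6 * PARAMS * TOKENS  # standard 6*N*D training-FLOPs rule

gpus = {
    # (assumed peak FLOP/s, assumed utilization, assumed power per GPU in kW)
    "H100": (989e12, 0.40, 1.0),
    "A100": (312e12, 0.35, 0.5),
}

for name, (peak, mfu, kw) in gpus.items():
    for days in (90, 150):
        seconds = days * 86400
        n_gpus = FLOPS_NEEDED / (peak * mfu * seconds)
        energy_mwh = n_gpus * kw * (seconds / 3600) / 1000
        print(f"{name}, {days} d: ~{n_gpus:,.0f} GPUs, ~{energy_mwh:,.0f} MWh")
```

Stretching the timeline from 90 to 150 days reduces the required number of accelerators proportionally, which is the "training duration as a policy lever" argument in the abstract.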
Intelligent Healthcare Ecosystems: Optimizing the Iron Triangle of Healthcare (Access, Cost, Quality)
The United States spends more on healthcare than any other nation - nearly 17% of GDP as of the early 2020s - yet struggles with uneven access and outcomes [1] [2]. This paradox of high cost, variable quality, and inequitable access is often described by the "Iron Triangle" of healthcare [3], which posits that improvements in one dimension (access, cost, or quality) often come at the expense of the others. This paper explores how an Intelligent Healthcare Ecosystem (iHE) - an integrated system leveraging advanced technologies and data-driven innovation - can "bend" or even break this iron triangle, enabling simultaneous enhancements in access, cost-efficiency, and quality of care. We review historical and current trends in U.S. healthcare spending, including persistent waste and international comparisons, to underscore the need for transformative change. We then propose a conceptual model and strategic framework for iHE, incorporating emerging technologies such as generative AI and large language models (LLMs), federated learning, interoperability standards (FHIR) and nationwide networks (TEFCA), and digital twins. We introduce an updated healthcare value equation that integrates all three corners of the iron triangle, and we hypothesize that an intelligently coordinated ecosystem can maximize this value by delivering high-quality care to more people at lower cost. Methods include a narrative synthesis of recent literature and policy reports, and Results highlight key components and enabling technologies of an iHE. We discuss how such ecosystems can reduce waste, personalize care, enhance interoperability, and support value-based models, all while addressing challenges like privacy, bias, and stakeholder adoption. The paper is formatted per MDPI guidelines, with APA-style numbered references, illustrative figures (U.S. spending trends, waste breakdown, international spending comparison, conceptual models), equations, and a structured layout. Our findings suggest that embracing an Intelligent Healthcare Ecosystem is pivotal for optimizing the long-standing trade-offs in healthcare's iron triangle, moving towards a system that is more accessible, affordable, and of higher quality for all.
- Oceania > New Zealand (0.04)
- Europe > Germany (0.04)
- Asia > South Korea (0.04)
- (2 more...)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Public Health (1.00)
- (6 more...)
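The iHE abstract refers to an updated healthcare value equation spanning all three corners of the iron triangle but does not state its form here. One illustrative way to extend the classic value-equals-quality-over-cost relation, purely as an assumption, is:

```latex
% Illustrative only: not the paper's stated equation.
\[
  V_{\mathrm{iHE}} \;=\; \frac{A \times Q}{C}
\]
% where A is population access (share of people reached), Q is quality of
% outcomes, and C is total cost of care. Any intervention that raises A or Q
% without a proportional rise in C increases system value.
```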
Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing
Nihar Bhadresh Shah, Dengyong Zhou
Crowdsourcing has gained immense popularity in machine learning applications for obtaining large amounts of labeled data. Crowdsourcing is cheap and fast, but suffers from the problem of low-quality data. To address this fundamental challenge in crowdsourcing, we propose a simple payment mechanism to incentivize workers to answer only the questions that they are sure of and skip the rest. We show that, surprisingly, under a mild and natural "no-free-lunch" requirement, this mechanism is the one and only incentive-compatible payment mechanism possible. We also show that among all possible incentive-compatible mechanisms (that may or may not satisfy no-free-lunch), our mechanism makes the smallest possible payment to spammers. Interestingly, this unique mechanism takes a "multiplicative" form. The simplicity of the mechanism is an added benefit. In preliminary experiments involving several hundred workers, we observe a significant reduction in the error rates under our unique mechanism for the same or lower monetary expenditure.
- Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.05)
- North America > United States > California > Alameda County > Berkeley (0.04)
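A toy sketch of a multiplicative skip-or-answer payment rule of the kind the "Double or Nothing" abstract describes: a wrong answer on any attempted gold-standard question drives the payment to zero, while each skip scales it down by a fixed factor. The threshold value and normalization below are assumptions, not a verbatim restatement of the paper's mechanism.

```python
# Multiplicative payment rule (illustrative): answer only when confident,
# because one wrong attempted gold answer zeroes the entire payment.
def payment(gold_results, mu_max=1.0, threshold=0.75):
    """gold_results: list of 'correct', 'wrong', or 'skip' on gold questions."""
    pay = mu_max
    for r in gold_results:
        if r == "wrong":
            return 0.0        # "nothing": any wrong attempt forfeits the payment
        if r == "skip":
            pay *= threshold  # skipping costs a multiplicative discount
    return pay

print(payment(["correct", "skip", "correct"]))   # 0.75
print(payment(["correct", "wrong", "correct"]))  # 0.0
```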
Big tech has spent $155bn on AI this year. It's about to spend hundreds of billions more
The US's largest companies have spent 2025 locked in a competition to spend more money than one another, lavishing $155bn on the development of artificial intelligence, more than the US government has spent on education, training, employment and social services in the 2025 fiscal year so far. Based on the most recent financial disclosures of Silicon Valley's biggest players, the race is about to accelerate to hundreds of billions in a single year. Over the past two weeks, Meta, Microsoft, Amazon, and Alphabet, Google's parent, have shared their quarterly public financial reports. Each disclosed that their year-to-date capital expenditure, a figure that refers to the money companies spend to acquire or upgrade tangible assets, already totals tens of billions. Capex, as the term is abbreviated, is a proxy for technology companies' spending on AI because the technology requires gargantuan investments in physical infrastructure, namely data centers, which require large amounts of power, water and expensive semiconductor chips.
- North America > United States > California (0.25)
- North America > United States > New York > New York County > New York City (0.05)
- Asia > China (0.05)
- Information Technology > Services (0.39)
- Government > Regional Government (0.36)
Zuckerberg claims 'superintelligence is now in sight' as Meta lavishes billions on AI
Whether it's poaching top talent away from competitors, acquiring AI startups or proclaiming that it will build data centers the size of Manhattan, Meta has been on a spending spree to boost its artificial intelligence capabilities for months now. The massive splurge is paying off, according to Meta's chief executive. In a new memo posted on Wednesday ahead of the company's quarterly earnings report, Mark Zuckerberg describes his ambitions for developing what he calls "superintelligence". "Over the last few months we have begun to see glimpses of our AI systems improving themselves," Zuckerberg wrote. "The improvement is slow for now, but undeniable. Developing superintelligence is now in sight."
Chart Question Answering from Real-World Analytical Narratives
Hutchinson, Maeve, Jianu, Radu, Slingsby, Aidan, Wood, Jo, Madhyastha, Pranava
We present a new dataset for chart question answering (CQA) constructed from visualization notebooks. The dataset features real-world, multi-view charts paired with natural language questions grounded in analytical narratives. Unlike prior benchmarks, our data reflects ecologically valid reasoning workflows. Benchmarking state-of-the-art multimodal large language models reveals a significant performance gap, with GPT-4.1 achieving an accuracy of 69.3%, underscoring the challenges posed by this more authentic CQA setting.
- Europe > France (0.06)
- Europe > United Kingdom > Wales (0.06)
- Asia > China > Shanghai > Shanghai (0.04)
- (9 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.38)
Automatic Robustness Stress Testing of LLMs as Mathematical Problem Solvers
Hou, Yutao, Xiao, Zeguan, Yu, Fei, Jiang, Yihan, Wei, Xuetao, Huang, Hailiang, Chen, Yun, Chen, Guanhua
Large language models (LLMs) have achieved impressive performance on various reasoning-intensive tasks. However, LLMs still face robustness issues and can fail unexpectedly on simple reasoning tasks. Previous works evaluate LLM robustness with hand-crafted templates or a limited set of perturbation rules, which leaves them exposed to potential data contamination from pre-training or fine-tuning datasets. In this work, inspired by stress testing in software engineering, we propose a novel framework, Automatic Robustness Checker (AR-Checker), to generate mathematical problem variants that preserve the semantic meaning of the original problem but may cause the LLM to fail. The AR-Checker framework generates mathematical problem variants through multi-round parallel streams of LLM-based rewriting and verification. Our framework can generate benchmark variants dynamically for each LLM, thus minimizing the risk of data contamination. Experiments on GSM8K and MATH-500 demonstrate the strong performance of AR-Checker on mathematical tasks. We also evaluate AR-Checker on benchmarks beyond mathematics, including MMLU, MMLU-Pro, and CommonsenseQA, where it also achieves strong performance, further demonstrating its effectiveness.
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
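A rough sketch of the multi-round, parallel rewrite-and-verify loop the AR-Checker abstract describes. The functions llm_rewrite and llm_verify_equivalent are hypothetical placeholders for actual LLM calls; only the overall control flow is meant to be illustrative.

```python
# Parallel streams of rewrite -> verify -> accept, keeping only variants a
# verifier judges semantically equivalent to the original problem.
import random

def llm_rewrite(problem: str) -> str:
    # Placeholder: a real implementation would prompt an LLM to perturb
    # surface form (names, phrasing, ordering) while preserving the answer.
    return problem + " (rephrased)"

def llm_verify_equivalent(original: str, variant: str) -> bool:
    # Placeholder: a real implementation would prompt a verifier LLM to check
    # that the variant asks for the same quantity under the same constraints.
    return random.random() > 0.2

def generate_variants(problem: str, n_streams: int = 4, n_rounds: int = 3):
    variants = []
    for _ in range(n_streams):
        candidate = problem
        for _ in range(n_rounds):
            rewritten = llm_rewrite(candidate)
            if llm_verify_equivalent(problem, rewritten):
                candidate = rewritten  # accept and keep perturbing
        if candidate != problem:
            variants.append(candidate)
    return variants

print(generate_variants("Alice has 3 apples and buys 5 more. How many now?"))
```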
The Longitudinal Health, Income, and Employment Model (LHIEM): a discrete-time microsimulation model for policy analysis
Propp, Adrienne M., Vardavas, Raffaele, Price, Carter C., Kapinos, Kandice A.
Dynamic microsimulation has long been recognized as a powerful tool for policy analysis, but in fact most major health policy simulations lack path dependency, a critical feature for evaluating policies that depend on accumulated outcomes such as retirement savings, wealth, or debt. We propose the Longitudinal Health, Income and Employment Model (LHIEM), a path-dependent discrete-time microsimulation that predicts annual health care expenditures, family income, and health status for the U.S. population over a multi-year period. LHIEM advances the population from year to year as a Markov chain with modules capturing the particular dynamics of each predictive attribute. LHIEM was designed to assess a health care financing proposal that would allow individuals to borrow from the U.S. government to cover health care costs, requiring careful tracking of medical expenditures and medical debt over time. However, LHIEM is flexible enough to be used for a range of modeling needs related to predicting health care spending and income over time. In this paper, we present the details of the model and all dynamic modules, and include a case study to demonstrate how LHIEM can be used to evaluate proposed policy changes.
- North America > United States > Maryland > Montgomery County > Rockville (0.04)
- North America > United States > Massachusetts (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (7 more...)
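A toy sketch of a path-dependent, discrete-time microsimulation loop in the spirit of the LHIEM abstract: each simulated person is advanced year by year, and accumulated medical debt carries forward, so outcomes depend on the path taken. The transition rules below are made-up placeholders, not LHIEM's estimated modules.

```python
# Year-by-year microsimulation with modules for income, spending, and health,
# plus a carried-forward debt balance that creates path dependency.
import random

def step_year(person):
    # Income module (toy): income evolves with small random growth.
    person["income"] *= 1.0 + random.gauss(0.01, 0.03)
    # Expenditure module (toy): spending depends on current health status.
    spend = random.lognormvariate(7 if person["healthy"] else 9, 0.5)
    # Health module (toy): status transitions as a simple Markov chain.
    person["healthy"] = random.random() < (0.95 if person["healthy"] else 0.6)
    # Path dependency: unaffordable spending accrues as debt (e.g., owed under
    # a government borrowing program) and carries into the next year.
    affordable = 0.1 * person["income"]
    person["debt"] += max(0.0, spend - affordable)
    person["debt"] *= 1.03  # interest on the outstanding balance

population = [{"income": 50_000.0, "healthy": True, "debt": 0.0} for _ in range(1000)]
for year in range(10):
    for person in population:
        step_year(person)

print(sum(p["debt"] for p in population) / len(population))  # mean debt after 10 years
```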