Readability $\ne$ Learnability: Rethinking the Role of Simplicity in Training Small Language Models

Lee, Ivan, Berg-Kirkpatrick, Taylor

arXiv.org Artificial Intelligence

Recent studies suggest that very small language models (SLMs) can generate surprisingly coherent text when trained on simplified, child-directed corpora such as TinyStories. These findings have been interpreted as evidence that readability -- characterized by accessible vocabulary, familiar narrative structure, and simple syntax -- plays a key role in enabling such capabilities to emerge. In this paper, we challenge that interpretation. We construct synthetic datasets with matched structure but varied readability, and find that readability alone does not predict coherence or learning efficiency in SLMs. Models trained on complex, adult-level text perform comparably to those trained on simplified language, and even exhibit faster development of coherence during training. Instead, we show that statistical simplicity, as measured by n-gram diversity, is a stronger predictor of learnability. Our findings caution against the growing trend of anthropomorphizing language model training -- drawing parallels to human cognitive development without empirical basis -- and argue for more precise reasoning about what properties actually support capability emergence in small models.
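The paper's key predictor, statistical simplicity as measured by n-gram diversity, can be illustrated with a minimal sketch (the exact metric and any normalization the authors use are not specified here; the distinct-n-gram ratio below is one common, assumed formulation):

```python
def ngram_diversity(tokens, n=2):
    """Fraction of n-grams in `tokens` that are distinct (distinct / total).

    Lower values indicate more repetitive, statistically simpler text,
    which the paper argues predicts learnability better than readability.
    """
    if len(tokens) < n:
        return 0.0
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams)

# Repetitive, child-directed phrasing yields lower bigram diversity than
# varied adult-level prose.
simple = "the cat sat on the mat and the cat sat on the mat".split()
varied = "quantum decoherence entangles macroscopic observables with environmental modes".split()
assert ngram_diversity(simple, 2) < ngram_diversity(varied, 2)
```

On the paper's account, two corpora matched on this statistic should be comparably learnable even if their readability scores differ sharply.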




Five years later, has sci-fi cult hit Devs aged well?

New Scientist

March 2020 was an inauspicious time, I think we can agree. This may be why Devs, an eight-part sci-fi series by Alex Garland that debuted as the world went into lockdown, didn't attract as large an audience as it could have – we certainly had other things to worry about. I was, I confess, one of the many people who missed it. There are lots of reasons why I have recently rectified that: Garland was on my mind after watching 28 Years Later, for which he wrote the screenplay, and the cold, dark world of Devs was also the perfect antidote to the heatwave this column was written under. But the main reason is that five strange years have passed since the show aired, and I was intrigued to see how it looked, at half a decade's remove.


dKV-Cache: The Cache for Diffusion Language Models

Ma, Xinyin, Yu, Runpeng, Fang, Gongfan, Wang, Xinchao

arXiv.org Artificial Intelligence

Diffusion Language Models (DLMs) have been seen as a promising competitor for autoregressive language models. However, diffusion language models have long been constrained by slow inference. A core challenge is that their non-autoregressive architecture and bidirectional attention preclude the key-value cache that accelerates decoding. We address this bottleneck by proposing a KV-cache-like mechanism, delayed KV-Cache, for the denoising process of DLMs. Our approach is motivated by the observation that different tokens have distinct representation dynamics throughout the diffusion process. Accordingly, we propose a delayed and conditioned caching strategy for key and value states. We design two complementary variants to cache key and value step-by-step: (1) dKV-Cache-Decode, which provides almost lossless acceleration, and even improves performance on long sequences, suggesting that existing DLMs may under-utilise contextual information during inference. (2) dKV-Cache-Greedy, which caches aggressively with a reduced cache lifespan, achieving higher speed-ups with quadratic time complexity at the cost of some performance degradation. In the end, dKV-Cache achieves a 2-10x inference speedup, largely narrowing the gap between autoregressive models and DLMs. We evaluate dKV-Cache on several benchmarks, delivering acceleration across general language understanding, mathematical, and code-generation tasks. Experiments demonstrate that KV caching can also be used in DLMs, even in a training-free manner with current DLMs.
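The "delayed" part of the strategy can be sketched in a toy scheduler (the one-step delay rule, the decode schedule, and `compute_kv` are illustrative assumptions, not the paper's implementation): a token's key/value states are recomputed at every denoising step until one step after the token has been decoded, and are reused from the cache thereafter.

```python
def denoise_with_delayed_cache(seq_len, steps, decode_schedule, compute_kv):
    """Toy sketch of a delayed KV cache for a diffusion LM.

    decode_schedule[t] is the set of token positions decoded at step t.
    A token's K/V are recomputed every step until one step AFTER it is
    decoded (the delay, since its representation still shifts briefly),
    then served from the cache.
    """
    cache = {}       # position -> cached (key, value)
    decoded_at = {}  # position -> step at which it was decoded
    recomputes = 0
    for t in range(steps):
        for pos in range(seq_len):
            # Reuse the cache only once the token is decoded AND one
            # extra step has passed.
            if pos in decoded_at and t > decoded_at[pos] + 1:
                continue
            cache[pos] = compute_kv(pos, t)
            recomputes += 1
        for pos in decode_schedule.get(t, ()):
            decoded_at.setdefault(pos, t)
    return cache, recomputes
```

Without any caching the cost is `seq_len * steps` K/V recomputations; the savings grow as more tokens are decoded early in the schedule.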


Debate Only When Necessary: Adaptive Multiagent Collaboration for Efficient LLM Reasoning

Eo, Sugyeong, Moon, Hyeonseok, Zi, Evelyn Hayoon, Park, Chanjun, Lim, Heuiseok

arXiv.org Artificial Intelligence

Multiagent collaboration has emerged as a promising framework for enhancing the reasoning capabilities of large language models (LLMs). Despite improvements in reasoning, the approach introduces substantial computational overhead resulting from iterative agent interactions. Furthermore, engaging in unnecessary debates increases the risk of generating erroneous responses. To address these challenges, we propose Debate Only When Necessary (DOWN), an adaptive multiagent debate framework that selectively activates debate based on the confidence score of the agent's initial response. Debate is activated only for queries requiring further deliberation, during which agents refine their outputs by referencing peer responses and associated confidence scores. Evaluations on benchmarks show that DOWN improves efficiency by up to six times while matching or even exceeding the performance of existing methods. Further analysis indicates that DOWN effectively mitigates the risk of error propagation stemming from unnecessary debate. These findings demonstrate the effectiveness of our approach in delivering high-performance LLM solutions at a lower computational cost.
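The confidence-gated control flow can be sketched as follows (the agent interface, threshold value, and majority-vote refinement are assumptions for illustration; the paper's protocol may combine confidences differently):

```python
class Stub:
    """Minimal stand-in for an LLM agent returning (answer, confidence)."""
    def __init__(self, ans, conf):
        self.ans, self.conf = ans, conf

    def respond(self, query):
        return (self.ans, self.conf)

    def refine(self, query, peers):
        # Simple stand-in for refinement: majority vote over own answer
        # plus the peer answers received during debate.
        votes = [self.ans] + [a for a, _ in peers]
        return max(set(votes), key=votes.count)


def down_answer(agents, query, threshold=0.9):
    """Toy sketch of Debate Only When Necessary (DOWN): debate is triggered
    only when the initial response's confidence falls below `threshold`."""
    first = agents[0]
    answer, conf = first.respond(query)
    if conf >= threshold:
        return answer  # confident: skip the debate entirely
    # Low confidence: gather peer responses and refine with them as context.
    peers = [a.respond(query) for a in agents[1:]]
    return first.refine(query, peers)
```

The efficiency gain comes from the early return: confident queries cost one agent call instead of a full debate round across all agents.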


Identifying Sparsely Active Circuits Through Local Loss Landscape Decomposition

Chrisman, Brianna, Bushnaq, Lucius, Sharkey, Lee

arXiv.org Artificial Intelligence

Much of mechanistic interpretability has focused on understanding the activation spaces of large neural networks. However, activation space-based approaches reveal little about the underlying circuitry used to compute features. To better understand the circuits employed by models, we introduce a new decomposition method called Local Loss Landscape Decomposition (L3D). L3D identifies a set of low-rank subnetworks: directions in parameter space of which a subset can reconstruct the gradient of the loss between any sample's output and a reference output vector. We design a series of progressively more challenging toy models with well-defined subnetworks and show that L3D can nearly perfectly recover the associated subnetworks. Additionally, we investigate the extent to which perturbing the model in the direction of a given subnetwork affects only the relevant subset of samples. Finally, we apply L3D to a real-world transformer model and a convolutional neural network, demonstrating its potential to identify interpretable and relevant circuits in parameter space.
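The core idea of reconstructing a per-sample gradient from a sparse subset of parameter-space directions can be illustrated with a toy sketch (L3D learns low-rank subnetworks via an optimization the abstract does not detail; the orthonormal directions and greedy selection below are simplifying assumptions):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sparse_reconstruct(grad, directions, k):
    """Toy stand-in for L3D's goal: given candidate orthonormal `directions`
    in parameter space, keep the k with the largest projection onto `grad`
    and rebuild the gradient from that sparse subset alone."""
    order = sorted(range(len(directions)),
                   key=lambda i: -abs(dot(grad, directions[i])))
    chosen = order[:k]
    recon = [0.0] * len(grad)
    for i in chosen:
        c = dot(grad, directions[i])  # projection coefficient
        recon = [r + c * d for r, d in zip(recon, directions[i])]
    return chosen, recon
```

If the gradient of a sample's loss truly lies in the span of a few such directions, the sparse reconstruction is exact, which is the property L3D's toy-model experiments probe.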


Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Xie, Tian, Gao, Zitian, Ren, Qingnan, Luo, Haoming, Hong, Yuqian, Dai, Bryan, Zhou, Joey, Qiu, Kai, Wu, Zhirong, Luo, Chong

arXiv.org Artificial Intelligence

Inspired by the success of DeepSeek-R1, we explore the potential of rule-based reinforcement learning (RL) in large reasoning models. To analyze reasoning dynamics, we use synthetic logic puzzles as training data due to their controllable complexity and straightforward answer verification. We make several key technical contributions that lead to effective and stable RL training: a system prompt that emphasizes the thinking and answering process, a stringent format reward function that penalizes outputs for taking shortcuts, and a straightforward training recipe that achieves stable convergence. Our 7B model develops advanced reasoning skills, such as reflection, verification, and summarization, that are absent from the logic corpus. Remarkably, after training on just 5K logic problems, it generalizes to the challenging math benchmarks AIME and AMC.
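A stringent format reward of this kind might look like the sketch below (the tag names, reward values, and exact checks are assumptions inspired by DeepSeek-R1-style prompting, not the paper's actual reward function): the output must contain exactly one thinking block followed by exactly one answer block, and any shortcut, such as answering without thinking or emitting duplicate blocks, is penalized.

```python
import re

def format_reward(output):
    """Toy format reward: +1.0 for exactly one <think>...</think> block
    followed by exactly one <answer>...</answer> block, -1.0 otherwise."""
    # Duplicated or missing tags are shortcuts and get the penalty.
    if (output.count('<think>') != 1 or output.count('</think>') != 1
            or output.count('<answer>') != 1 or output.count('</answer>') != 1):
        return -1.0
    # The blocks must appear in order, with nothing else around them.
    ok = re.fullmatch(r'\s*<think>.+?</think>\s*<answer>.+?</answer>\s*',
                      output, flags=re.DOTALL)
    return 1.0 if ok else -1.0
```

Making the format check strict matters because RL against a lenient reward tends to be exploited: a model that can skip the thinking block and still score well will learn to do so.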