AITopics | capacity allocation


capacity allocation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Semantics Meet Signals: Dual Codebook Representation Learning for Generative Recommendation

Hui, Zheng, Wei, Xiaokai, Shirkavand, Reza, Wang, Chen, Zhang, Weizhi, Peláez, Alejandro, Gong, Michelle

arXiv.org Artificial Intelligence, Nov-27-2025

Generative recommendation has recently emerged as a powerful paradigm that unifies retrieval and generation, representing items as discrete semantic tokens and enabling flexible sequence modeling with autoregressive models. Despite its success, existing approaches rely on a single, uniform codebook to encode all items, overlooking the inherent imbalance between popular items rich in collaborative signals and long-tail items that depend on semantic understanding. We argue that this uniform treatment limits representational efficiency and hinders generalization. To address this, we introduce FlexCode, a popularity-aware framework that adaptively allocates a fixed token budget between a collaborative filtering (CF) codebook and a semantic codebook. A lightweight MoE dynamically balances CF-specific precision and semantic generalization, while an alignment and smoothness objective maintains coherence across the popularity spectrum. We perform experiments on both public and industrial-scale datasets, showing that FlexCode consistently outperforms strong baselines. FlexCode provides a new mechanism for token representation in generative recommenders, achieving stronger accuracy and tail robustness, and offering a new perspective on balancing memorization and generalization in token-based recommendation models.
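To make the idea of splitting a fixed token budget by popularity concrete, here is a minimal sketch. It is not FlexCode's method: the abstract describes a learned MoE gate, whereas this uses a hand-written logistic rule on log-popularity. The function name `split_token_budget` and the parameters `total_tokens`, `midpoint`, and `steepness` are all hypothetical.

```python
import math


def split_token_budget(interaction_count, total_tokens=8, midpoint=100.0, steepness=1.0):
    """Split a fixed token budget between a CF codebook and a semantic codebook.

    Hypothetical heuristic (an assumption, not the paper's learned gate):
    the CF share follows a logistic curve in log-popularity, so popular items
    get more CF tokens and long-tail items get more semantic tokens.
    Returns (cf_tokens, semantic_tokens), which always sum to total_tokens.
    """
    if interaction_count < 0:
        raise ValueError("interaction_count must be non-negative")
    if total_tokens < 2:
        raise ValueError("total_tokens must be at least 2")

    # Logistic gate centred on the (assumed) popularity midpoint.
    z = steepness * (math.log1p(interaction_count) - math.log1p(midpoint))
    cf_share = 1.0 / (1.0 + math.exp(-z))

    # Round to whole tokens, keeping at least one token in each codebook.
    cf_tokens = round(cf_share * total_tokens)
    cf_tokens = min(max(cf_tokens, 1), total_tokens - 1)
    return cf_tokens, total_tokens - cf_tokens


if __name__ == "__main__":
    for count in (0, 100, 10**6):
        print(count, split_token_budget(count))
```

With these assumed defaults, an item at the midpoint gets an even split, while unseen and very popular items are pushed toward the semantic and CF codebooks respectively.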

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.20673

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.49)


Localist LLMs with Recruitment Learning

Diederich, Joachim

arXiv.org Artificial Intelligence, Oct-21-2025

We present a novel framework for training large language models with continuously adjustable internal representations that span the full spectrum from localist (interpretable, rule-based) to distributed (generalizable, efficient) encodings. The key innovations are (1) a locality dial, a tunable parameter that dynamically controls the degree of localization during both training and inference without requiring model retraining, (2) an information-theoretic recruitment mechanism that adaptively allocates semantic blocks as needed, eliminating the requirement for complete domain knowledge at initialization, and (3) a hierarchical recruitment framework that extends capacity allocation to entire specialized LLMs, enabling multi-granularity architectural adaptation. This is achieved through group sparsity penalties on attention mechanisms, information-theoretic anchor design, dynamic rule injection, and principled recruitment criteria based on penalized likelihood with explicit units. We provide rigorous mathematical results establishing explicit threshold conditions under which attention provably concentrates on semantically relevant blocks at stationary points, with exact bounds on attention entropy and pointer fidelity. The hierarchical recruitment mechanism provides convergence guarantees at both the block level (fine-grained, within-LLM) and the LLM level (coarse-grained, cross-domain), ensuring the system discovers semantic partitions that balance model complexity against data encoding efficiency. This framework enables practitioners to continuously interpolate between interpretable and high-performance modes while adapting architectural capacity at multiple granularities, supporting applications in regulated domains requiring both transparency and capability.
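The two quantities the abstract leans on, a group sparsity penalty and attention entropy over blocks, can be illustrated with a small sketch. The abstract does not give the paper's exact formulas, so this assumes a standard group-lasso penalty (sum of per-block L2 norms) and the Shannon entropy of attention mass aggregated by block. Using the dial as the penalty strength is also an assumption, and the function names are hypothetical.

```python
import math


def block_attention_entropy(attn, block_of):
    """Shannon entropy (in nats) of attention mass aggregated per block.

    attn[i] is the attention weight on position i; block_of[i] is the
    (assumed) semantic block that position belongs to. Entropy near zero
    means attention is concentrated on a single block.
    """
    mass = {}
    for weight, block in zip(attn, block_of):
        mass[block] = mass.get(block, 0.0) + weight
    total = sum(mass.values())
    if total <= 0:
        raise ValueError("attention weights must have positive total mass")
    return -sum((m / total) * math.log(m / total) for m in mass.values() if m > 0)


def group_lasso_penalty(weights, block_of, dial):
    """Standard group-lasso penalty: dial * sum over blocks of the block's L2 norm.

    Treating `dial` as the penalty strength is an illustrative assumption.
    """
    squared = {}
    for w, block in zip(weights, block_of):
        squared[block] = squared.get(block, 0.0) + w * w
    return dial * sum(math.sqrt(s) for s in squared.values())


if __name__ == "__main__":
    blocks = [0, 0, 1, 1]
    print(block_attention_entropy([0.5, 0.5, 0.0, 0.0], blocks))      # concentrated
    print(block_attention_entropy([0.25, 0.25, 0.25, 0.25], blocks))  # spread out
    print(group_lasso_penalty([3.0, 4.0, 0.0, 0.0], blocks, dial=0.1))
```

Under these assumptions, raising the dial makes it costlier to keep many blocks active at once, which is the direction in which block-level attention entropy would fall.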

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.17358

Country: Oceania > Australia (0.28)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)


Capacity-Constrained Continual Learning

Wen, Zheng, Precup, Doina, Van Roy, Benjamin, Singh, Satinder

arXiv.org Machine Learning, Jul-30-2025

Any agents we can possibly build are subject to capacity constraints, as memory and compute resources are inherently finite. However, comparatively little attention has been dedicated to understanding how agents with limited capacity should allocate their resources for optimal performance. The goal of this paper is to shed some light on this question by studying a simple yet relevant continual learning problem: the capacity-constrained linear-quadratic-Gaussian (LQG) sequential prediction problem. We derive a solution to this problem under appropriate technical conditions. Moreover, for problems that can be decomposed into a set of sub-problems, we also demonstrate how to optimally allocate capacity across these sub-problems in the steady state. We view the results of this paper as a first step in the systematic theoretical study of learning under capacity constraints.

artificial intelligence, exp, machine learning, (16 more...)

arXiv.org Machine Learning

2507.21479

Country:

North America > United States > California > Santa Clara County > Mountain View (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.30)


Polysemanticity and Capacity in Neural Networks

Scherlis, Adam, Sachan, Kshitij, Jermyn, Adam S., Benton, Joe, Shlegeris, Buck

arXiv.org Artificial Intelligence, Jul-11-2023

Individual neurons in neural networks often represent a mixture of unrelated features. This phenomenon, called polysemanticity, can make interpreting neural networks more difficult, and so we aim to understand its causes. We propose doing so through the lens of feature capacity, which is the fractional dimension each feature consumes in the embedding space. We show that in a toy model the optimal capacity allocation tends to monosemantically represent the most important features, polysemantically represent less important features (in proportion to their impact on the loss), and entirely ignore the least important features. Polysemanticity is more prevalent when the inputs have higher kurtosis or sparsity and more prevalent in some architectures than others. Given an optimal allocation of capacity, we go on to study the geometry of the embedding space. We find a block-semi-orthogonal structure, with differing block sizes in different models, highlighting the impact of model architecture on the interpretability of its neurons.
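The notion of a feature consuming a "fractional dimension" can be made concrete with a small sketch. The abstract does not state the formula, so the definition below is an assumption: the capacity of feature i is (W_i · W_i)^2 divided by the sum over j of (W_i · W_j)^2, where W_i is the embedding vector of feature i. The function name `feature_capacities` is hypothetical.

```python
def feature_capacities(embeddings):
    """Per-feature capacity under the assumed definition
    C_i = (W_i . W_i)^2 / sum_j (W_i . W_j)^2.

    embeddings[i] is the embedding vector of feature i. A capacity of 1 means
    the feature has a dimension to itself (monosemantic), a value between 0
    and 1 means it shares dimensions with other features (polysemantic), and
    0 means the feature is not represented at all.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    capacities = []
    for w_i in embeddings:
        numerator = dot(w_i, w_i) ** 2
        denominator = sum(dot(w_i, w_j) ** 2 for w_j in embeddings)
        capacities.append(numerator / denominator if denominator > 0 else 0.0)
    return capacities


if __name__ == "__main__":
    # Three features in a 2-D embedding space: the first has its own axis,
    # the other two share the second axis.
    print(feature_capacities([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]))
```

In this example the first feature gets a full dimension and the other two split one dimension between them, so the capacities add up to the embedding dimension of 2.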

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Artificial Intelligence

2210.01892

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


Capacity allocation through neural network layers

Donier, Jonathan

arXiv.org Machine Learning, Feb-27-2019

Capacity analysis was recently introduced as a way to analyze how linear models distribute their modelling capacity across the input space. In this paper, we extend the notion of capacity allocation to the case of neural networks with non-linear layers. We show that under some hypotheses the problem is equivalent to linear capacity allocation, within some extended input space that factors in the non-linearities. We introduce the notion of layer decoupling, which quantifies the degree to which a non-linear activation decouples its outputs, and show that it plays a central role in capacity allocation through layers. In the highly non-linear limit where decoupling is total, we show that the propagation of capacity throughout the layers follows a simple Markovian rule, which turns into a diffusion PDE in the limit of deep networks with residual layers. This allows us to recover some known results about deep neural networks, such as the size of the effective receptive field, or why ResNets avoid the shattering problem.
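The link between a Markovian propagation rule and diffusion can be illustrated with a toy random-walk model. This is not the paper's derivation: it assumes a 1-D network in which each layer passes a unit's capacity uniformly to the `kernel_size` positions beneath it. The function names are hypothetical.

```python
import math


def propagate_capacity(num_layers, kernel_size=3):
    """Push one output unit's capacity down through `num_layers` layers.

    Toy Markov rule (an assumption for illustration): at every layer, the
    capacity at a position is split evenly over the kernel_size positions it
    reads from. Returns a dict mapping input offset -> share of capacity.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    half = kernel_size // 2
    dist = {0: 1.0}
    for _ in range(num_layers):
        nxt = {}
        for pos, mass in dist.items():
            for step in range(-half, half + 1):
                nxt[pos + step] = nxt.get(pos + step, 0.0) + mass / kernel_size
        dist = nxt
    return dist


def spread(dist):
    """Standard deviation of the capacity distribution over input offsets."""
    mean = sum(pos * mass for pos, mass in dist.items())
    var = sum(mass * (pos - mean) ** 2 for pos, mass in dist.items())
    return math.sqrt(var)


if __name__ == "__main__":
    for depth in (6, 24):
        print(depth, spread(propagate_capacity(depth)))
```

In this toy model the variance grows linearly with depth, so quadrupling the number of layers only doubles the width of the capacity profile. That square-root scaling is the diffusive behaviour the abstract connects to the effective receptive field.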

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Machine Learning

1902.08572

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)


Capacity allocation analysis of neural networks: A tool for principled architecture design

Donier, Jonathan

arXiv.org Machine Learning, Feb-12-2019

Since the popularization of deep neural networks in the early 2010s, tailoring neural network architectures to specific tasks has been one of the main sources of activity for both academics and practitioners. Accordingly, a palette of empirical methods has been developed for automating the choice of neural network hyperparameters (a process sometimes called Neural Architecture Search), including, but not limited to, random search [2, 1], genetic algorithms [16, 13], Bayesian methods [24, 12], and reinforcement learning [29]. However, when the computational requirements for training a single model are high, such approaches might be too expensive or result in iteration cycles that are too long to be practically useful, though some work in that direction has been carried out recently [5, 14]. In other cases, when the loss function is only used as a proxy for the task at hand [25, 26, 10] or is not interpretable [8], a further perceptual evaluation is typically necessary to evaluate the quality of a model's outputs, and such systematic approaches at least partially break down. In both cases, an efficient and quantitative method to analyze and compare neural network architectures would be highly desirable, if only to come up with a limited set of plausible candidates to pass on to the more expensive (or manual) methods. In this paper, we introduce the notion of capacity allocation analysis, which is a systematic, quantitative, and computationally efficient way to analyze neural network architectures by quantifying which dependencies between inputs and outputs a parameter or a set of parameters actually models. We develop a quantitative framework for assessing and comparing different architectures for a given task, providing insights that are complementary to the value of the loss function itself.

allocation, capacity allocation, constraint, (14 more...)

arXiv.org Machine Learning

1902.04485

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
