AITopics

Industry: Health & Medicine (0.59)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.52)

Neural Information Processing SystemsAug-16-2025, 08:02:55 GMT

Dynamic allocation of limited memory resources in reinforcement learning

However, the two threads have been largely separate.

agent, allocation, precision, (13 more...)

Country:

Europe > Switzerland > Geneva > Geneva (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)
(3 more...)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Neural Information Processing SystemsMay-31-2025, 18:54:02 GMT

Review for NeurIPS paper: Dynamic allocation of limited memory resources in reinforcement learning

Weaknesses: (This section is being combined with "comments for improvement" section below) A bit of a nitpick regarding the language use around "more" or "less" resources. The authors write about an agent using "more resources", which corresponds to "lower entropy" for the actions in a particular state. I think, though, that technically (and literally, for this agent) the amount of resources used for each memory is exactly the same; it's literally a number to represent the mean and standard deviation. From what I can tell, the authors are arguing that memories with lower standard deviations would *require* more resources to represent in certain implementations (such as in brains). So it's not actually the case that more resources are used for low-sigma memories in the agent, but that more resources might be used in other agents.

agent, algorithm, implementation, (12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.53)

Neural Information Processing SystemsMay-31-2025, 18:53:55 GMT

Review for NeurIPS paper: Dynamic allocation of limited memory resources in reinforcement learning

This paper nicely bridges between neuroscience and RL, and considers the important topic of limited memory resources in RL agents. The topic is well-suited for NeurIPS (R2) as it has broader applicability toward e.g. All reviewers agreed that it is well-motivated and written (R1, R2, R3, R4), although R3 did ask for a bit more explanation on some methodological details. It is also appropriately situated with respect to related work (R1, R2, R3) although R2 suggests a separate related works section, and R4 wanted to see more discussion of work outside of neuroscience, focused on optimizing RL with limited capacity. R1 pointed out that perhaps there's a bit of confusion between memory precision and use of memory resources, as the former is more accurate for agents, the latter perhaps for real brains - ie more precise representations require more resources to encode in the brain, but this seems to be a minor point.

dynamic allocation, memory resource, reinforcement, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)

Xu, Weijie, Futrell, Richard

Strategic resource allocation in memory encoding: An efficiency principle shaping language processing

arXiv.org Artificial IntelligenceMar-18-2025

How is the limited capacity of working memory efficiently used to support human linguistic behaviors? In this paper, we investigate strategic resource allocation as an efficiency principle for memory encoding in sentence processing. The idea is that working memory resources are dynamically and strategically allocated to prioritize novel and unexpected information, enhancing their representations to make them less susceptible to memory decay and interference. Theoretically, from a resource-rational perspective, we argue that this efficiency principle naturally arises from two functional assumptions about working memory, namely, its limited capacity and its noisy representation. Empirically, through naturalistic corpus data, we find converging evidence for strategic resource allocation in the context of dependency locality from both the production and the comprehension side, where non-local dependencies with less predictable antecedents are associated with reduced locality effect. However, our results also reveal considerable cross-linguistic variability, highlighting the need for a closer examination of how strategic resource allocation, as a universal efficiency principle, interacts with language-specific phrase structures.

large language model, machine learning, natural language, (16 more...)

2503.14728

Country:

Asia > Indonesia > Bali (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.66)

arXiv.org Artificial IntelligenceDec-25-2024

Taming the Memory Beast: Strategies for Reliable ML Training on Kubernetes

Ray, Jaideep

Kubernetes offers a powerful orchestration platform for machine learning training, but memory management can be challenging due to specialized needs and resource constraints. This paper outlines how Kubernetes handles memory requests, limits, Quality of Service classes, and eviction policies for ML workloads, with special focus on GPU memory and ephemeral storage. Common pitfalls such as overcommitment, memory leaks, and ephemeral volume exhaustion are examined. We then provide best practices for stable, scalable memory utilization to help ML practitioners prevent out-of-memory events and ensure high-performance ML training pipelines.

artificial intelligence, cloud computing, machine learning, (16 more...)

2412.14701

Genre: Research Report (0.40)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceOct-15-2024

Edge Unlearning is Not "on Edge"! An Adaptive Exact Unlearning System on Resource-Constrained Devices

Xia, Xiaoyu, Wang, Ziqi, Sun, Ruoxi, Liu, Bowen, Khalil, Ibrahim, Xue, Minhui

The right to be forgotten mandates that machine learning models enable the erasure of a data owner's data and information from a trained model. Removing data from the dataset alone is inadequate, as machine learning models can memorize information from the training data, increasing the potential privacy risk to users. To address this, multiple machine unlearning techniques have been developed and deployed. Among them, approximate unlearning is a popular solution, but recent studies report that its unlearning effectiveness is not fully guaranteed. Another approach, exact unlearning, tackles this issue by discarding the data and retraining the model from scratch, but at the cost of considerable computational and memory resources. However, not all devices have the capability to perform such retraining. In numerous machine learning applications, such as edge devices, Internet-of-Things (IoT), mobile devices, and satellites, resources are constrained, posing challenges for deploying existing exact unlearning methods. In this study, we propose a Constraint-aware Adaptive Exact Unlearning System at the network Edge (CAUSE), an approach to enabling exact unlearning on resource-constrained devices. Aiming to minimize the retrain overhead by storing sub-models on the resource-constrained device, CAUSE innovatively applies a Fibonacci-based replacement strategy and updates the number of shards adaptively in the user-based data partition process. To further improve the effectiveness of memory usage, CAUSE leverages the advantage of model pruning to save memory via compression with minimal accuracy sacrifice. The experimental results demonstrate that CAUSE significantly outperforms other representative systems in realizing exact unlearning on the resource-constrained device by 9.23%-80.86%, 66.21%-83.46%, and 5.26%-194.13% in terms of unlearning speed, energy consumption, and accuracy.

accuracy, artificial intelligence, machine learning, (17 more...)

2410.10128

Country:

Europe > United Kingdom (0.14)
Europe > Ukraine (0.04)
Oceania > Australia (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Energy (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Neural Information Processing SystemsOct-11-2024, 07:29:43 GMT

Dynamic allocation of limited memory resources in reinforcement learning

dynamic allocation, memory resource, reinforcement, (3 more...)

Industry: Health & Medicine (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Cognitive Science (0.78)

Toupas, Petros, Yu, Zhewen, Bouganis, Christos-Savvas, Tzovaras, Dimitrios

SMOF: Streaming Modern CNNs on FPGAs with Smart Off-Chip Eviction

arXiv.org Artificial IntelligenceMar-27-2024

Convolutional Neural Networks (CNNs) have demonstrated their effectiveness in numerous vision tasks. However, their high processing requirements necessitate efficient hardware acceleration to meet the application's performance targets. In the space of FPGAs, streaming-based dataflow architectures are often adopted by users, as significant performance gains can be achieved through layer-wise pipelining and reduced off-chip memory access by retaining data on-chip. However, modern topologies, such as the UNet, YOLO, and X3D models, utilise long skip connections, requiring significant on-chip storage and thus limiting the performance achieved by such system architectures. The paper addresses the above limitation by introducing weight and activation eviction mechanisms to off-chip memory along the computational pipeline, taking into account the available compute and memory resources. The proposed mechanism is incorporated into an existing toolflow, expanding the design space by utilising off-chip memory as a buffer. This enables the mapping of such modern CNNs to devices with limited on-chip memory, under the streaming architecture design approach. SMOF has demonstrated the capacity to deliver competitive and, in some cases, state-of-the-art performance across a spectrum of computer vision tasks, achieving up to 10.65 X throughput improvement compared to previous works.

architecture, subgraph, weight fragmentation, (15 more...)

2403.18921

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

arXiv.org Artificial IntelligenceJan-21-2024

LR-CNN: Lightweight Row-centric Convolutional Neural Network Training for Memory Reduction

Wang, Zhigang, Yang, Hangyu, Wang, Ning, Xu, Chuanfei, Nie, Jie, Wei, Zhiqiang, Gu, Yu, Yu, Ge

In the last decade, Convolutional Neural Network with a multi-layer architecture has advanced rapidly. However, training its complex network is very space-consuming, since a lot of intermediate data are preserved across layers, especially when processing high-dimension inputs with a big batch size. That poses great challenges to the limited memory capacity of current accelerators (e.g., GPUs). Existing efforts mitigate such bottleneck by external auxiliary solutions with additional hardware costs, and internal modifications with potential accuracy penalty. Differently, our analysis reveals that computations intra- and inter-layers exhibit the spatial-temporal weak dependency and even complete independency features. That inspires us to break the traditional layer-by-layer (column) dataflow rule. Now operations are novelly re-organized into rows throughout all convolution layers. This lightweight design allows a majority of intermediate data to be removed without any loss of accuracy. We particularly study the weak dependency between two consecutive rows. For the resulting skewed memory consumption, we give two solutions with different favorite scenarios. Evaluations on two representative networks confirm the effectiveness. We also validate that our middle dataflow optimization can be smoothly embraced by existing works for better memory reduction.

artificial intelligence, feature map, machine learning, (17 more...)

2401.11471

Country: Europe (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)