 Memory


AMD Radeon RX 9070 and 9070 XT review: The new 1440p gaming champions

PCWorld

Some software bugs mar the experience, but overall, AMD's 9070 graphics cards offer such a compelling mix of performance, value, and memory capacity that it's worth accepting those quibbles. Nvidia fumbled the ball with its $549 GeForce RTX 5070, and AMD's new Radeon RX 9070 and 9070 XT are primed to seize the advantage. The RTX 5070, hitting store shelves today, is a good 1440p graphics card but a stagnant generational sidegrade at best. Enter the $549 Radeon RX 9070 and $599 Radeon RX 9070 XT, launching tomorrow. Both cards are faster than the RTX 5070, with the 9070 XT going toe-to-toe with the $750 RTX 5070 Ti in many games, and each includes an ample 16GB of VRAM.


Linear-Memory and Decomposition-Invariant Linearly Convergent Conditional Gradient Algorithm for Structured Polytopes

Neural Information Processing Systems

Recently, several works have shown that natural modifications of the classical conditional gradient method (aka the Frank-Wolfe algorithm) for constrained convex optimization provably converge with a linear rate when the feasible set is a polytope and the objective is smooth and strongly convex. However, all of these results suffer from two significant shortcomings: i) a large memory requirement due to the need to store an explicit convex decomposition of the current iterate, and, as a consequence, a large running-time overhead per iteration; and ii) a worst-case convergence rate that depends unfavorably on the dimension. In this work we present a new conditional gradient variant and a corresponding analysis that improves on both of the above shortcomings. In particular, both memory and computation overheads are only linear in the dimension, and in addition, in case the optimal solution is sparse, the new convergence rate replaces a factor which is at least linear in the dimension in previous works with a linear dependence on the number of non-zeros in the optimal solution. At the heart of our method, and the corresponding analysis, is a novel way to compute decomposition-invariant away-steps. While our theoretical guarantees do not apply to any polytope, they apply to several important structured polytopes that capture central concepts such as paths in graphs, perfect matchings in bipartite graphs, marginal distributions that arise in structured prediction tasks, and more. Our theoretical findings are complemented by empirical evidence that shows that our method delivers state-of-the-art performance.
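
For orientation, here is a minimal sketch of the classical conditional gradient (Frank-Wolfe) step that the abstract builds on, written for the probability simplex. It is not the paper's decomposition-invariant variant; the objective, step-size rule, and function names are illustrative assumptions.

```python
import numpy as np

def frank_wolfe_simplex(grad_f, x0, num_iters=100):
    """Classical conditional gradient (Frank-Wolfe) over the probability simplex.

    grad_f: callable returning the gradient of the smooth objective at x.
    x0: feasible starting point (non-negative entries summing to 1).
    """
    x = x0.copy()
    for t in range(num_iters):
        g = grad_f(x)
        # Linear minimization oracle over the simplex: the minimizer of <g, s>
        # is the vertex (standard basis vector) at the smallest gradient entry.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Standard 2/(t+2) step size; note that no explicit convex decomposition
        # of the iterate is stored here.
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * s
    return x

# Illustrative use: minimize ||Ax - b||^2 over the simplex.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
grad = lambda x: 2.0 * A.T @ (A @ x - b)
x_star = frank_wolfe_simplex(grad, np.full(5, 0.2))
```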


on a memory economical calculation, while its vanilla multi-key counterpart is less memory efficient when achieving

Neural Information Processing Systems

Thank you for acknowledging the key contributions of our paper. R1.2 Generalize to video: As suggested, we conducted additional The top-1 accuracy of JCL pre-trained features is 48.6%, which outperforms MoCo v2 (47.3%). Generalization of JCL to other data modalities (sound, language, video) will be included in our future work. Regarding your concerns about the writing quality and typos (e.g., Algorithm 1 The top-1 accuracy on ImageNet100 for vanilla (ResNet-50) is 80.9%, while JCL achieves 82.0%. R2.3 SimCLR: The top-5 accuracy we reported (87.3%) for SimCLR was extracted from the Thus, there is no one-to-one correspondence between the data in Table 1 and Figure 2.


Reviews: Large Memory Layers with Product Keys

Neural Information Processing Systems

UPDATE: The authors answered my questions; I would like to keep my score unchanged and suggest focusing on the clarity of the final version. Perhaps this is a case where I would really be interested in looking at the source code. Originality: the paper borrows the general idea of product keys from the database community; however, the application to fast retrieval in neural memory systems seems quite novel to me. Quality: The core ideas of the paper are sound; however, I would appreciate more rigor in both the conceptual and experimental comparison with other approaches incorporating memory into the Transformer (see e.g. Another suggestion would be to discuss in more depth the issue of potential non-uniformity of the query distribution, which indeed seems quite relevant.
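
For context, here is a rough sketch of how product-key retrieval can work, as I read the idea referenced in this review: the full key set is the Cartesian product of two small sub-key codebooks, so a top-k lookup over the product only needs two small top-k searches plus a merge over the candidate pairs. The array shapes, scoring function, and variable names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def product_key_topk(query, subkeys1, subkeys2, k=4):
    """Sketch of product-key retrieval: the implicit key table has
    n1 * n2 entries, but a top-k search only touches n1 + n2 sub-keys
    plus a k*k candidate merge."""
    d = query.shape[0] // 2
    q1, q2 = query[:d], query[d:]

    # Dot-product scores against each sub-key codebook.
    s1 = subkeys1 @ q1          # shape (n1,)
    s2 = subkeys2 @ q2          # shape (n2,)

    # Top-k candidates per half.
    i1 = np.argsort(-s1)[:k]
    i2 = np.argsort(-s2)[:k]

    # Score of a product key (i, j) is s1[i] + s2[j]; rank the k*k candidates.
    cand_scores = s1[i1][:, None] + s2[i2][None, :]
    flat = np.argsort(-cand_scores, axis=None)[:k]
    rows, cols = np.unravel_index(flat, cand_scores.shape)

    # Flat memory-slot indices into the implicit n1 * n2 key table.
    slots = i1[rows] * subkeys2.shape[0] + i2[cols]
    return slots, cand_scores[rows, cols]

rng = np.random.default_rng(0)
q = rng.normal(size=16)
slots, scores = product_key_topk(q, rng.normal(size=(128, 8)), rng.normal(size=(128, 8)))
```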


corresponding modifications in the revised paper. 32GB of RAM, it takes 65 seconds to estimate the O(|V |

Neural Information Processing Systems

We thank the reviewers for their valuable feedback. R2 and R3 had questions about the time complexity of our method. As noted in Appendix A, this computation can be amortized across many goal-reaching tasks. Lastly, we agree with R2 that the construction of "good" replay buffers is an We will clarify this in Section 2.3. We will clarify this in Alg. 1.


Managed-Retention Memory: A New Class of Memory for the AI Era

arXiv.org Artificial Intelligence

AI clusters today are one of the major uses of High Bandwidth Memory (HBM). However, HBM is suboptimal for AI workloads for several reasons. Analysis shows that HBM is overprovisioned on write performance but underprovisioned on density and read bandwidth, and it also has significant energy-per-bit overheads. It is also expensive, with lower yield than DRAM due to manufacturing complexity. We propose a new memory class: Managed-Retention Memory (MRM), which is better optimized to store key data structures for AI inference workloads. We believe that MRM may finally provide a path to viability for technologies that were originally proposed to support Storage Class Memory (SCM). These technologies traditionally offered long-term persistence (10+ years) but provided poor IO performance and/or endurance. MRM makes different trade-offs, and by understanding the workload IO patterns, MRM forgoes long-term data retention and write performance for better potential performance on the metrics important for these workloads.


Best of CES 2025: The PC and home tech that blew us away

PCWorld

You never know what you're going to get with CES. Of course, we knew we'd hear a lot about AI -- check -- and that there'd be announcements of new CPUs and GPUs -- also check. But you just never know how all the pomp and hoo-ha of this annual mega tech event is going to pay off in the real world, for regular consumers. Does the average PC user have something to be excited about now that the veil has come off of this year's product launches? If the PCWorld staff is any indication, the answer is yes!


Input-Based Ensemble-Learning Method for Dynamic Memory Configuration of Serverless Computing Functions

arXiv.org Artificial Intelligence

In today's Function-as-a-Service offerings, a programmer is usually responsible for configuring function memory for its successful execution, which allocates proportional function resources such as CPU and network. However, right-sizing the function memory forces developers to speculate about performance and make ad-hoc configuration decisions. Recent research has highlighted that a function's input characteristics, such as input size, type and number of inputs, significantly impact its resource demand, run-time performance and costs under fluctuating workloads. This correlation further makes memory configuration a non-trivial task. On that account, an input-aware function memory allocator not only improves developer productivity by completely hiding resource-related decisions but also drives an opportunity to reduce resource wastage and offer a finer-grained, cost-optimised pricing scheme. Therefore, we present MemFigLess, a serverless solution that estimates the memory requirement of a serverless function with input-awareness. The framework executes function profiling in an offline stage and trains a multi-output Random Forest Regression model on the collected metrics to invoke input-aware optimal configurations. We evaluate our work against state-of-the-art approaches on the AWS Lambda service and find that MemFigLess is able to capture the input-aware resource relationships, allocate up to 82% fewer resources and save up to 87% in run-time costs.
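
To make the modeling step concrete, here is a minimal sketch of an input-aware, multi-output Random Forest regressor of the kind the abstract describes, using scikit-learn (which supports multi-output targets natively). The feature schema, synthetic profiling data, candidate memory configurations, and headroom policy are illustrative assumptions rather than MemFigLess's actual design.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Illustrative offline profiling data (assumed schema):
# features = [input_size_mb, num_inputs, input_type_id, configured_memory_mb]
# targets  = [execution_time_ms, peak_memory_mb]
rng = np.random.default_rng(42)
X = np.column_stack([
    rng.uniform(1, 500, 1000),                        # input size (MB)
    rng.integers(1, 50, 1000),                        # number of inputs
    rng.integers(0, 3, 1000),                         # encoded input type
    rng.choice([128, 256, 512, 1024, 2048], 1000),    # memory config (MB)
])
y = np.column_stack([
    50 + 2.0 * X[:, 0] - 0.02 * X[:, 3] + rng.normal(0, 5, 1000),  # runtime (ms)
    30 + 0.8 * X[:, 0] + rng.normal(0, 3, 1000),                   # peak memory (MB)
])

# Multi-output regression: one forest jointly predicts runtime and peak memory.
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

def cheapest_config(input_size_mb, num_inputs, input_type_id,
                    configs=(128, 256, 512, 1024, 2048)):
    """Pick the smallest memory config whose predicted peak memory fits,
    mirroring the 'input-aware optimal configuration' idea in the abstract."""
    for mem in configs:
        runtime_ms, peak_mb = model.predict(
            [[input_size_mb, num_inputs, input_type_id, mem]])[0]
        if peak_mb <= 0.9 * mem:      # keep 10% headroom (assumed policy)
            return mem, runtime_ms
    return configs[-1], runtime_ms

print(cheapest_config(200, 5, 1))
```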


Apple MacBook Pro M4 review: faster, better and cheaper

The Guardian

Apple's upgraded MacBook Pro for 2024 gets a significant power boost with the M4 chip, double the memory as standard, even longer battery life and a price cut, ending the year on a high. The Guardian's journalism is independent. We will earn a commission if you buy something through an affiliate link. The longstanding laptop line now starts at £1,599 (€1,899/$1,599/A$2,499), making it £100 or so cheaper than last year's M3 models. Though still an expensive, premium laptop, it comes with at least 16GB of RAM rather than 8GB, which was an upgrade worth paying extra for on previous models. The outside hasn't changed from its predecessor.


vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention

arXiv.org Artificial Intelligence

Efficient management of GPU memory is essential for high-throughput LLM inference. Prior systems reserved KV-cache memory ahead of time, which resulted in wasted capacity due to internal fragmentation. Inspired by demand paging, vLLM proposed PagedAttention to enable dynamic memory allocation for the KV-cache. This approach eliminates fragmentation and improves serving throughput. However, to be able to allocate physical memory dynamically, PagedAttention changes the layout of the KV-cache from contiguous virtual memory to non-contiguous virtual memory. As a consequence, one needs to rewrite the attention kernels to support paging, and implement a memory manager in the serving framework. This results in both performance and programming overheads, as well as portability challenges in adopting state-of-the-art attention kernels. In this paper, we propose vAttention, a new approach for dynamic KV-cache memory management. In contrast to PagedAttention, vAttention stores the KV-cache in contiguous virtual memory and leverages OS support for on-demand allocation of physical memory. vAttention thus enables one to use state-of-the-art attention kernels out of the box by adding support for dynamic allocation of physical memory, without having to rewrite their code. We implement vAttention in the vLLM serving stack to show that it also helps improve decode throughput by up to 1.99x over vLLM, and end-to-end serving throughput by up to 1.22x and 1.29x, compared to using the state-of-the-art PagedAttention-based kernels of FlashAttention and FlashInfer.
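
As a rough conceptual illustration of the contiguous-virtual-memory idea described above (not the paper's GPU implementation, which relies on lower-level memory-management support inside the serving stack), the sketch below simulates a KV-cache that reserves a fixed contiguous virtual range up front and attaches fixed-size physical pages only as tokens are generated. The page size, class name, and byte accounting are assumptions for illustration.

```python
PAGE_BYTES = 2 * 1024 * 1024  # assumed physical-page granularity (2 MiB)

class GrowableKVCache:
    def __init__(self, max_bytes):
        # Reserve the full contiguous virtual range up front (addresses only,
        # no physical backing yet), sized for the maximum sequence length.
        self.max_bytes = max_bytes
        self.mapped_pages = 0   # physical pages attached so far
        self.used_bytes = 0

    def append_tokens(self, token_bytes):
        """Grow the cache for newly generated tokens, mapping physical pages
        lazily so unused reserved capacity costs no physical memory."""
        needed = self.used_bytes + token_bytes
        if needed > self.max_bytes:
            raise MemoryError("exceeds reserved virtual range")
        while self.mapped_pages * PAGE_BYTES < needed:
            self._map_physical_page()
        self.used_bytes = needed

    def _map_physical_page(self):
        # In a real system this would back the next virtual page with physical
        # memory; the attention kernel still sees one contiguous buffer, so no
        # paging-aware kernel changes are required.
        self.mapped_pages += 1

cache = GrowableKVCache(max_bytes=64 * 1024 * 1024)
cache.append_tokens(300_000)   # decode step: only the pages needed get mapped
print(cache.mapped_pages, "physical page(s) mapped for", cache.used_bytes, "bytes")
```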