SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models
Yang, Jinho, Kim, Ji-Hoon, Kim, Joo-Young
Abstract -- Deep Learning Recommendation Models (DLRMs) play a crucial role in delivering personalized content across web applications such as social networking and video streaming. However, with improvements in performance, the parameter size of DLRMs has grown to terabyte (TB) scales, accompanied by memory bandwidth demands exceeding TB/s levels. Furthermore, the workload intensity within the model varies based on the target mechanism, making it difficult to build an optimized recommendation system. In this paper, we propose SCRec, a scalable computational storage recommendation system that can handle TB-scale industrial DLRMs while guaranteeing high bandwidth requirements. SCRec utilizes a software framework that features a mixed-integer programming (MIP)-based cost model, efficiently fetching data based on data access patterns and adaptively configuring memory-centric and compute-centric cores. Additionally, SCRec integrates hardware acceleration cores to enhance DLRM computations, in particular enabling high-performance reconstruction of approximated embedding vectors from the highly compressed tensor-train (TT) format. By combining its software framework and hardware accelerators, and by eliminating data communication overhead through a single-server implementation, SCRec achieves substantial improvements in DLRM inference performance. It delivers up to 55.77× speedup compared to a CPU-DRAM system with no loss in accuracy and up to 13.35× energy efficiency gains over a multi-GPU system.

INTRODUCTION

Recommendation systems are widely used in social network services and video streaming platforms to provide personalized and preferred content to consumers, as described in Fig. 1.
They are also employed in search engines to offer differentiated search services [1]-[5]. For example, more than 80% of Meta's data center resources are allocated to recommendation system inference, while over 50% are utilized for training these systems [6]. Traditional recommendation systems relied on collaborative filtering techniques, such as content filtering using matrix factorization [7]-[10]. However, with advancements in deep neural networks (DNNs), deep learning recommendation models (DLRMs) that combine embedding tables (EMBs) with DNN layers have emerged. This combination has demonstrated superior recommendation performance, making DLRM the industry standard in recommendation systems. These models are widely adopted in data centers, with recent focus on both software-level and hardware-level optimizations [11]-[17]. (This work was supported by Samsung Electronics Co., Ltd.)
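The abstract's key hardware task is reconstructing an embedding row from its tensor-train (TT) factorization. As a rough illustration of what that reconstruction involves (this is a generic TT-embedding sketch in NumPy, not SCRec's accelerator logic; the function name, core shapes, and index layout are assumptions), a flat row index is split into per-core sub-indices and the selected core slices are chained together:

```python
import numpy as np

def tt_embedding_lookup(cores, row_shape, col_shape, row_idx):
    """Reconstruct one embedding row from TT cores.

    cores[k] has shape (r_{k-1}, row_shape[k], col_shape[k], r_k),
    with boundary ranks r_0 = r_K = 1. The full (never materialized)
    table would have prod(row_shape) rows and prod(col_shape) columns.
    """
    # Decompose the flat row index into one sub-index per TT core
    # (mixed-radix digits, most significant first).
    idx = []
    for n in reversed(row_shape):
        idx.append(row_idx % n)
        row_idx //= n
    idx.reverse()

    # Chain the selected slices; each slice has shape (r_{k-1}, d_k, r_k).
    vec = np.ones((1, 1))  # running shape: (D_so_far, current rank)
    for k, core in enumerate(cores):
        s = core[:, idx[k], :, :]            # (r_{k-1}, d_k, r_k)
        vec = np.einsum('ar,rds->ads', vec, s)
        vec = vec.reshape(-1, s.shape[-1])   # (D_so_far * d_k, r_k)
    return vec.reshape(-1)                    # r_K = 1, so length prod(col_shape)
```

The compression comes from storing only the small cores: a table with N = prod(row_shape) rows never exists in memory, and each lookup costs a few small matrix products, which is what makes a dedicated reconstruction core attractive.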
PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models
Lee, Yunjae, Kim, Hyeseong, Rhu, Minsoo
Training recommendation systems (RecSys) faces several challenges, as it requires a "data preprocessing" stage to transform an ample amount of raw data and feed it to the GPU for training in a seamless manner. To sustain high training throughput, state-of-the-art solutions reserve a large fleet of CPU servers for preprocessing, which incurs substantial deployment cost and power consumption. Our characterization reveals that prior CPU-centric preprocessing is bottlenecked on feature generation and feature normalization operations, as it fails to exploit the abundant inter-/intra-feature parallelism in RecSys preprocessing. PreSto is a storage-centric preprocessing system leveraging In-Storage Processing (ISP), which offloads the bottlenecked preprocessing operations to our ISP units. We show that PreSto outperforms the baseline CPU-centric system with a 9.6× speedup in end-to-end preprocessing time, 4.3× enhancement in cost-efficiency, and 11.3× improvement in energy efficiency on average for production-scale RecSys preprocessing.
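The inter-feature parallelism the abstract refers to comes from each feature column being transformed independently. A minimal sketch of that idea (illustrative only, assuming log-normalization for dense features and bucket hashing for categorical ones, common in DLRM pipelines; this is not PreSto's code, and the function names are made up):

```python
import math
import zlib
from concurrent.futures import ThreadPoolExecutor

def normalize_dense(col):
    # log(x + 1) normalization, a common transform for DLRM dense features
    return [math.log(x + 1) if x > 0 else 0.0 for x in col]

def hash_sparse(col, num_buckets=1000):
    # Map raw categorical values into a fixed-size embedding-table range.
    # crc32 is used instead of hash() so results are deterministic.
    return [zlib.crc32(str(v).encode()) % num_buckets for v in col]

def preprocess(dense_cols, sparse_cols, workers=4):
    # Inter-feature parallelism: every column is independent, so the
    # per-feature transforms can run concurrently (here with threads;
    # PreSto offloads the same kind of work to in-storage units).
    with ThreadPoolExecutor(max_workers=workers) as ex:
        dense_out = list(ex.map(normalize_dense, dense_cols))
        sparse_out = list(ex.map(hash_sparse, sparse_cols))
    return dense_out, sparse_out
```

Because each column's transform touches only its own data, the work partitions cleanly across CPU cores or, as in PreSto, across ISP units near the raw data.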
SNIA Persistent Memory And Computational Storage Summit, Part 1
SNIA held its Persistent Memory and Computational Storage Summit, virtual this year, like last year. Let's explore some of the insights from the first day of that virtual conference. Dr. Yang Seok, VP of the Memory Solutions Lab at Samsung, spoke about the company's SmartSSD. He argued that computational storage devices, which off-load processing from CPUs, may reduce energy consumption and thus provide a green computing alternative. He pointed out that data center energy usage has stayed flat at about 1% since 2010 (in 2020 it was 200-250 TWh per year) due to technology innovations.