Shankar, Devashish
Enhancing Performance and Scalability of Large-Scale Recommendation Systems with Jagged Flash Attention
Xu, Rengan, Yang, Junjie, Xu, Yifan, Li, Hong, Liu, Xing, Shankar, Devashish, Zhang, Haoci, Liu, Meng, Li, Boyang, Hu, Yuxi, Tang, Mingwei, Zhang, Zehua, Zhang, Tunhou, Li, Dai, Chen, Sijia, Musumeci, Gian-Paolo, Zhai, Jiaqi, Zhu, Bill, Yan, Hong, Reddy, Srihari
The integration of hardware accelerators has significantly advanced the capabilities of modern recommendation systems, enabling the exploration of complex ranking paradigms previously deemed impractical. However, the GPU-based computational costs present substantial challenges. In this paper, we demonstrate our development of an efficiency-driven approach to explore these paradigms, moving beyond traditional reliance on native PyTorch modules. We address the specific challenges posed by ranking models' dependence on categorical features, which vary in length and complicate GPU utilization. We introduce Jagged Feature Interaction Kernels, a novel method designed to extract fine-grained insights from long categorical features through efficient handling of dynamically sized tensors. We further enhance the performance of attention mechanisms by integrating Jagged tensors with Flash Attention. Our novel Jagged Flash Attention achieves up to 9x speedup and 22x memory reduction compared to dense attention. Notably, it also outperforms dense flash attention, with up to 3x speedup and 53% more memory efficiency. In production models, we observe 10% QPS improvement and 18% memory savings, enabling us to scale our recommendation systems with longer features and more complex architectures.
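The core idea of a jagged (variable-length) attention layout can be sketched in plain NumPy: all sequences are stored contiguously and an offsets array marks where each one begins and ends, so no padding is materialized. This is an illustrative sketch only; the function name `jagged_attention` and the Python loop are assumptions for clarity, whereas the paper's contribution fuses this access pattern into a Flash-Attention-style GPU kernel.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def jagged_attention(q, k, v, offsets):
    """Attention over variable-length sequences stored contiguously.

    q, k, v: (total_len, d) values for all sequences concatenated.
    offsets: (batch + 1,) prefix sums marking each sequence's slice.

    Hypothetical illustration: the real approach replaces this Python
    loop with a fused kernel that never pads to the longest sequence.
    """
    d = q.shape[1]
    out = np.empty_like(v)
    for start, end in zip(offsets[:-1], offsets[1:]):
        # Scaled dot-product attention restricted to one sequence's slice.
        scores = q[start:end] @ k[start:end].T / np.sqrt(d)
        out[start:end] = softmax(scores) @ v[start:end]
    return out
```

Because each sequence only attends within its own slice, memory scales with the sum of squared true lengths rather than `batch * max_len^2`, which is the source of the memory savings the abstract reports for dense attention.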
ARMDN: Associative and Recurrent Mixture Density Networks for eRetail Demand Forecasting
Mukherjee, Srayanta, Shankar, Devashish, Ghosh, Atin, Tathawadekar, Nilam, Kompalli, Pramod, Sarawagi, Sunita, Chaudhury, Krishnendu
Accurate demand forecasts can help on-line retail organizations better plan their supply-chain processes. The challenge, however, is the large number of associative factors that result in large, non-stationary shifts in demand, which traditional time series and regression approaches fail to model. In this paper, we propose a Neural Network architecture called AR-MDN that simultaneously models associative factors, time-series trends, and the variance in the demand. We first identify several causal features and use a combination of feature embeddings, MLP and LSTM to represent them. We then model the output density as a learned mixture of Gaussian distributions. The AR-MDN can be trained end-to-end without the need for additional supervision. We experiment on a year's worth of data covering tens of thousands of products from Flipkart. The proposed architecture yields a significant improvement in forecasting accuracy when compared with existing alternatives.
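The mixture-density output described above can be made concrete with a short NumPy sketch of the negative log-likelihood of demand under a learned mixture of Gaussians. The function name `mdn_nll` and the array shapes are assumptions for illustration, not the paper's implementation; in AR-MDN the mixture parameters would come from the network's final layer.

```python
import numpy as np

def mdn_nll(y, pi, mu, sigma):
    """Negative log-likelihood of y under a Gaussian mixture.

    y:     (batch,) observed demand values.
    pi:    (batch, K) mixture weights (each row sums to 1).
    mu:    (batch, K) component means.
    sigma: (batch, K) component standard deviations.

    Sketch of the loss a mixture density network minimizes; shapes
    and names are illustrative.
    """
    y = y[:, None]  # broadcast each target against its K components
    # Density of each Gaussian component at y.
    comp = np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    # Mix components, then take the negative log (epsilon for stability).
    return -np.log((pi * comp).sum(axis=1) + 1e-12)
```

With a single component (K = 1, weight 1) this reduces to the ordinary Gaussian negative log-likelihood; using several components lets the model express the multi-modal, high-variance demand shifts the abstract describes.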