SIDE: Semantic ID Embedding for effective learning from sequences
Ramasamy, Dinesh, Kumar, Shakti, Cadonic, Chris, Yang, Jiaxin, Roychowdhury, Sohini, Rhman, Esam Abdel, Reddy, Srihari
Sequence-based recommendation models are driving the state-of-the-art for industrial ad-recommendation systems. Such systems typically deal with user histories whose sequence lengths range on the order of O(10^3) to O(10^4) events. While adding embeddings at this scale is manageable in pre-trained models, incorporating them into real-time prediction models is challenging due to both storage and inference costs. To address this scaling challenge, we propose a novel approach that leverages vector quantization (VQ) to inject a compact Semantic ID (SID) as input to the recommendation models instead of a collection of embeddings. Our method builds on recent work on SIDs by introducing three key innovations: (i) a multi-task VQ-VAE framework, called VQ fusion, that fuses multiple content embeddings and categorical predictions into a single Semantic ID; (ii) a parameter-free, highly granular SID-to-embedding conversion technique, called SIDE, that is validated with two content embedding collections, thereby eliminating the need for a large parameterized lookup table; and (iii) a novel quantization method called Discrete-PCA (DPCA) which generalizes and enhances residual quantization techniques. The proposed enhancements, when applied to a large-scale industrial ads-recommendation system, achieve a 2.4X improvement in normalized entropy (NE) gain and a 3X reduction in data footprint compared to traditional SID methods.
- North America > Canada > Ontario > Toronto (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Data Science (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
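The abstract above says DPCA generalizes residual quantization, the standard way of building a Semantic ID from an embedding. A minimal numpy sketch of plain residual quantization follows; the codebook sizes, the greedy nearest-codeword encoder, and the sum-of-codewords decoder are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Encode x as a tuple of codeword indices (a Semantic ID):
    at each level, pick the nearest codeword, then quantize the residual."""
    residual = x.astype(float)
    sid = []
    for cb in codebooks:  # cb has shape (K, d): K codewords of dimension d
        dists = np.linalg.norm(cb - residual, axis=1)
        idx = int(np.argmin(dists))
        sid.append(idx)
        residual = residual - cb[idx]
    return tuple(sid)

def sid_to_embedding(sid, codebooks):
    """Parameter-free SID-to-embedding conversion: sum the selected
    codewords, with no learned lookup table beyond the codebooks."""
    return sum(cb[i] for cb, i in zip(codebooks, sid))

rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(8, 4)) for _ in range(3)]  # 3 levels, 8 codes each
x = rng.normal(size=4)
sid = residual_quantize(x, codebooks)       # e.g. a 3-token Semantic ID
x_hat = sid_to_embedding(sid, codebooks)    # approximate reconstruction of x
```

The compactness argument in the abstract corresponds to storing the short `sid` tuple per event instead of the full embedding `x`.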
Amazon.com: Grokking Algorithms: A Complete Beginner's Guide for the Effective Learning of Algorithms eBook : Christian, Dylan: Kindle Store
- Learn the different types of algorithms and how they work
- Learn about the practical uses of algorithms
- Get background knowledge about algorithms with concrete examples
- Master selection sort and recursion
- Discover quicksort with real-life examples
- Learn about hash tables and why they are useful
- Discover breadth-first search and Dijkstra's algorithm
- Master dynamic programming
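The listing above names breadth-first search among the covered topics. As a minimal illustration of the idea (the graph and function name are a made-up example, not taken from the book), BFS explores a graph level by level, so the first path that reaches the goal is a shortest one by edge count:

```python
from collections import deque

def bfs_shortest_path(graph, start, goal):
    """Return a shortest path (fewest edges) from start to goal,
    or None if goal is unreachable."""
    queue = deque([[start]])  # queue of partial paths
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(path + [nbr])
    return None

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
# bfs_shortest_path(graph, "a", "d") → ["a", "b", "d"]
```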
Effective Learning of a GMRF Mixture Model
Finder, Shahaf E., Treister, Eran, Freifeld, Oren
Learning a Gaussian Mixture Model (GMM) is hard when the number of parameters is too large given the amount of available data. As a remedy, we propose restricting the GMM to a Gaussian Markov Random Field Mixture Model (GMRF-MM), as well as a new method for estimating the latter's sparse precision (i.e., inverse covariance) matrices. When the sparsity pattern of each matrix is known, we propose an efficient optimization method for the Maximum Likelihood Estimate (MLE) of that matrix. When it is unknown, we utilize the popular Graphical LASSO (GLASSO) to estimate that pattern. However, we show that even for a single Gaussian, when GLASSO is tuned to successfully estimate the sparsity pattern, it does so at the price of a substantial bias of the values of the nonzero entries of the matrix, and we show that this problem only worsens in a mixture setting. To overcome this, we discard the non-zero values estimated by GLASSO, keep only its pattern estimate and use it within the proposed MLE method. This yields an effective two-step procedure that removes the bias. We show that our "debiasing" approach outperforms GLASSO in both the single-GMRF and the GMRF-MM cases. We also show that when learning priors for image patches, our method outperforms GLASSO even if we merely use an educated guess about the sparsity pattern, and that our GMRF-MM outperforms the baseline GMM on real and synthetic high-dimensional datasets. Our code is available at \url{https://github.com/shahaffind/GMRF-MM}.
- Asia > Middle East > Israel (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
- (2 more...)
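The abstract above describes a two-step procedure: keep only GLASSO's sparsity-pattern estimate, discard its biased nonzero values, and re-estimate the values by pattern-restricted MLE. A minimal numpy sketch of the first step; the toy precision matrix stands in for a GLASSO output and is not from the paper, and the MLE refit itself is only indicated in a comment:

```python
import numpy as np

def sparsity_pattern(precision_est, tol=1e-8):
    """Step 1 of the two-step procedure: keep only the support estimate
    (which entries are nonzero), discarding the biased values."""
    return np.abs(precision_est) > tol

# toy GLASSO-style output: tridiagonal support recovered, but the
# off-diagonal values are shrunken (biased) toward zero
glasso_prec = np.array([[1.6, -0.3, 0.0],
                        [-0.3, 1.6, -0.3],
                        [0.0, -0.3, 1.6]])
mask = sparsity_pattern(glasso_prec)
# Step 2 (the paper's MLE method, not shown here): re-estimate only the
# entries selected by `mask` by maximizing the Gaussian log-likelihood
# restricted to that sparsity pattern, removing the GLASSO bias.
```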
Effective Learning Critical to Compelling Experience of Artificial Intelligence, Says Strategy Analytics
Users' desire to complete everyday tasks and functions using AI is high, but current AI solutions often fall short of expectations due to limitations in capability and functionality. "Effective learning is critical to creating a compelling experience in this way," commented Christopher Dodge, Associate Director and report author. "Many AI solutions require too much upfront action from the user; manually inputting data or linking profiles from different apps to ensure inclusion in one service is time-consuming and clumsy." Correctly identifying an individual within one ecosystem also impacts basic tasks, especially if the solution cannot identify the individual requesting the action.
- North America (0.06)
- Europe (0.06)
- Asia (0.06)
- Media > News (0.42)
- Information Technology (0.34)
Machine Learning Models and Algorithms for Big Data Classification: Thinking with Examples for Effective Learning (Integrated Series in Information Systems): Shan Suthaharan: 9781489976406: Amazon.com: Books
I purchased the book in order to learn a bit more about big data and machine learning. This is a nicely written book covering all of the fundamental concepts in big data and machine learning, with every concept clearly explained through examples and graphs, accompanied by R code. An example is the patterns of big data in Section 3.3, where the author explains how different pattern evolutions are used for supervised learning. As the number of selected features increases, class separation via standardization is clearly demonstrated to be an efficient and accurate method.
- Information Technology > Data Science > Data Mining > Big Data (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
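The review above mentions standardization as the preprocessing step behind the class-separation demonstration. A minimal z-score sketch (the toy data is illustrative, not from the book): rescaling each feature to zero mean and unit variance prevents large-magnitude features from dominating distance-based learners:

```python
import numpy as np

def standardize(X):
    """Z-score standardization: rescale each feature (column) to
    zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma

# toy data: feature 0 spans thousands, feature 1 spans single digits
X = np.array([[1000.0, 1.0],
              [2000.0, 2.0],
              [3000.0, 3.0]])
Z = standardize(X)  # both columns now on a comparable scale
```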