PixelBytes: Catching Unified Embedding for Multimodal Generation

Furfaro, Fabien

arXiv.org Artificial Intelligence

This report introduces PixelBytes Embedding, a novel approach for unified multimodal representation learning. Our method captures diverse inputs in a single, cohesive representation, enabling emergent properties for multimodal sequence generation, particularly for text and pixelated images. Inspired by state-of-the-art sequence models such as Image Transformers, PixelCNN, and Mamba-Bytes, PixelBytes aims to address the challenges of integrating different data types. We explore various model architectures, including Recurrent Neural Networks (RNNs), State Space Models (SSMs), and Attention-based models, focusing on bidirectional processing and our innovative PxBy embedding technique. Our experiments, conducted on a specialized PixelBytes Pokémon dataset, demonstrate that bidirectional sequence models with PxBy embedding and convolutional layers can generate coherent multimodal sequences. This work contributes to the advancement of integrated AI models capable of understanding and generating multimodal data in a unified manner.
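
The abstract describes a single embedding space shared by text and pixel tokens. As a rough illustration of that idea (not the paper's actual PxBy implementation; the vocabulary layout, offsets, and class names below are assumptions), byte-level text tokens and quantized pixel values can be mapped into disjoint ranges of one shared table:

```python
import torch
import torch.nn as nn

# Hypothetical unified token space: text bytes (0-255) and quantized
# pixel values (0-255) share one embedding table, offset so the two
# modalities never collide. Offsets and names are illustrative only.
TEXT_OFFSET, PIXEL_OFFSET, VOCAB_SIZE = 0, 256, 512

class UnifiedTokenEmbedding(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.table = nn.Embedding(VOCAB_SIZE, dim)

    def embed_text(self, byte_ids: torch.Tensor) -> torch.Tensor:
        return self.table(byte_ids + TEXT_OFFSET)

    def embed_pixels(self, pixel_ids: torch.Tensor) -> torch.Tensor:
        return self.table(pixel_ids + PIXEL_OFFSET)

emb = UnifiedTokenEmbedding()
text = torch.tensor(list(b"pikachu"), dtype=torch.long)
pixels = torch.randint(0, 256, (16,))  # one flattened row of a sprite
sequence = torch.cat([emb.embed_text(text), emb.embed_pixels(pixels)])
# 'sequence' can now feed an RNN, SSM, or attention model uniformly.
```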


Unified Embedding: Battle-Tested Feature Representations for Web-Scale ML Systems

Coleman, Benjamin, Kang, Wang-Cheng, Fahrbach, Matthew, Wang, Ruoxi, Hong, Lichan, Chi, Ed H., Cheng, Derek Zhiyuan

arXiv.org Artificial Intelligence

Learning high-quality feature embeddings efficiently and effectively is critical for the performance of web-scale machine learning systems. A typical model ingests hundreds of features with vocabularies on the order of millions to billions of tokens. The standard approach is to represent each feature value as a d-dimensional embedding, introducing hundreds of billions of parameters for extremely high-cardinality features. This bottleneck has led to substantial progress in alternative embedding algorithms. Many of these methods, however, make the assumption that each feature uses an independent embedding table. This work introduces a simple yet highly effective framework, Feature Multiplexing, where one single representation space is used across many different categorical features. Our theoretical and empirical analysis reveals that multiplexed embeddings can be decomposed into components from each constituent feature, allowing models to distinguish between features. We show that multiplexed representations lead to Pareto-optimal parameter-accuracy tradeoffs for three public benchmark datasets. Further, we propose a highly practical approach called Unified Embedding with three major benefits: simplified feature configuration, strong adaptation to dynamic data distributions, and compatibility with modern hardware. Unified embedding gives significant improvements in offline and online metrics compared to highly competitive baselines across five web-scale search, ads, and recommender systems, where it serves billions of users across the world in industry-leading products.
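
To make the Feature Multiplexing idea concrete, here is a minimal PyTorch sketch of one embedding table shared across many categorical features, with a per-feature hash salt so collision patterns differ between features. The multiplicative hashing and the names (`UnifiedEmbedding`, `salts`) are illustrative assumptions, not the paper's exact construction:

```python
import torch
import torch.nn as nn

class UnifiedEmbedding(nn.Module):
    """One shared table multiplexed across many categorical features.
    Each feature hashes its raw ids into the same table with its own
    fixed salt, so different features collide in different ways."""

    def __init__(self, num_buckets: int, dim: int, num_features: int):
        super().__init__()
        self.num_buckets = num_buckets
        self.table = nn.Embedding(num_buckets, dim)
        # One fixed random multiplier per feature decorrelates the hashes.
        self.register_buffer(
            "salts", torch.randint(1, 2**31 - 1, (num_features,))
        )

    def forward(self, feature_ids: torch.Tensor) -> torch.Tensor:
        # feature_ids: (batch, num_features) raw categorical ids.
        hashed = (feature_ids * self.salts) % self.num_buckets
        return self.table(hashed)  # (batch, num_features, dim)

emb = UnifiedEmbedding(num_buckets=1_000_000, dim=64, num_features=8)
ids = torch.randint(0, 10**9, (32, 8))  # eight high-cardinality features
vectors = emb(ids)                      # (32, 8, 64)
```

A single shared table like this replaces many per-feature tables, which is where the parameter savings and the simplified feature configuration described in the abstract come from.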


Unified Embedding Based Personalized Retrieval in Etsy Search

Jha, Rishikesh, Subramaniyam, Siddharth, Benjamin, Ethan, Taula, Thrivikrama

arXiv.org Artificial Intelligence

Embedding-based neural retrieval is a prevalent approach to addressing the semantic gap problem that often arises in product search on tail queries. Popular queries, in contrast, typically lack context and have broad intent, where additional context from a user's historical interactions can be helpful. In this paper, we share our novel approach to addressing both: a unified embedding model for the semantic gap problem, extended into an end-to-end trained model for personalized semantic retrieval. We propose learning a unified embedding model that incorporates graph, transformer, and term-based embeddings end to end, and we share our design choices for the optimal tradeoff between performance and efficiency. We share our learnings in feature engineering, hard negative sampling strategy, and the application of a transformer model, including a novel pre-training strategy and other tricks for improving search relevance and deploying such a model at industry scale. Our personalized retrieval model significantly improves the overall search experience, as measured by a 5.58% increase in search purchase rate and a 2.63% increase in site-wide conversion rate, aggregated across multiple A/B tests on live traffic.
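
As a structural sketch of what incorporating graph, transformer, and term-based embeddings into one unified embedding can look like (the sub-encoders below are simple placeholders, not Etsy's architecture), the three representations can be projected, fused, and L2-normalized into a single retrieval vector:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnifiedQueryEncoder(nn.Module):
    """Fuses several representation sources into a single retrieval
    embedding. The linear sub-encoders stand in for the graph,
    transformer, and term-based components named in the abstract."""

    def __init__(self, dim: int = 256):
        super().__init__()
        self.graph_proj = nn.Linear(dim, dim)  # precomputed graph embedding
        self.text_proj = nn.Linear(dim, dim)   # transformer output placeholder
        self.term_proj = nn.Linear(dim, dim)   # term/n-gram features
        self.fuse = nn.Linear(3 * dim, dim)

    def forward(self, graph_e, text_e, term_e):
        fused = self.fuse(torch.cat(
            [self.graph_proj(graph_e),
             self.text_proj(text_e),
             self.term_proj(term_e)], dim=-1))
        # L2-normalize so dot product equals cosine similarity in ANN search.
        return F.normalize(fused, dim=-1)

encoder = UnifiedQueryEncoder()
q = encoder(*(torch.randn(4, 256) for _ in range(3)))  # (4, 256) queries
```

In a setup like this, the encoder would be trained end to end against product embeddings, typically with in-batch and hard negatives as the abstract describes.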


#025 FaceNet: A Unified Embedding for Face Recognition and Clustering in PyTorch - Master Data Science 05.01.2022

#artificialintelligence

Highlights: Face recognition has been an active area of research for more than three decades. This paper, FaceNet, published in 2015, introduced a number of novelties and significantly improved the performance of face recognition, verification, and clustering tasks. Here, we explore this interesting framework, which became popular for introducing 1) a 128-dimensional face embedding vector and 2) the triplet loss function. In addition to the theoretical background, we give an outline of how this network can be implemented in PyTorch. The FaceNet method developed a novel design for the final layer of the CNN to embed the face image. This so-called embedding vector has 128 elements.
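
Since the highlight centers on the 128-dimensional embedding and the triplet loss, here is a minimal PyTorch sketch of that training objective using the built-in nn.TripletMarginLoss; the toy backbone is a placeholder, not the paper's CNN:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Toy stand-in for the FaceNet backbone: a feature extractor followed
    by a projection to the 128-d embedding, L2-normalized so embeddings
    lie on the unit hypersphere as in the paper."""

    def __init__(self, in_features: int = 512, embed_dim: int = 128):
        super().__init__()
        self.backbone = nn.Linear(in_features, 256)  # placeholder backbone
        self.head = nn.Linear(256, embed_dim)

    def forward(self, x):
        return F.normalize(self.head(torch.relu(self.backbone(x))), dim=-1)

net = EmbeddingNet()
loss_fn = nn.TripletMarginLoss(margin=0.2)  # FaceNet used a margin of 0.2
anchor, positive, negative = (torch.randn(8, 512) for _ in range(3))
loss = loss_fn(net(anchor), net(positive), net(negative))
loss.backward()  # pulls same-identity pairs together, pushes others apart
```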