 product id


Learn by Selling: Equipping Large Language Models with Product Knowledge for Context-Driven Recommendations

arXiv.org Artificial Intelligence

The rapid evolution of large language models (LLMs) has opened up new possibilities for applications such as context-driven product recommendations. However, the effectiveness of these models in this context is heavily reliant on their comprehensive understanding of the product inventory. This paper presents a novel approach to equipping LLMs with product knowledge by training them to respond contextually to synthetic search queries that include product ...

In contrast, LLMs, with their ability to understand nuances of language and context, offer a promising solution to overcome these limitations. However, a crucial prerequisite for LLMs to excel in product recommendation is their possession of comprehensive knowledge about the entire inventory of products available for sale. In this paper, we propose a novel approach to equip LLMs with product knowledge by training them to generate contextual responses to synthetic search queries containing product IDs.


Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models

arXiv.org Artificial Intelligence

This paper explores SynTOD, a new synthetic data generation approach for developing end-to-end Task-Oriented Dialogue (TOD) Systems capable of handling complex tasks such as intent classification, slot filling, conversational question-answering, and retrieval-augmented response generation, without relying on crowdsourcing or real-world data. SynTOD utilizes a state transition graph to define the desired behavior of a TOD system and generates diverse, structured conversations through random walks and response simulation using large language models (LLMs). In our experiments, using graph-guided response simulations leads to significant improvements in intent classification, slot filling and response relevance compared to naive single-prompt simulated conversations. We also investigate the end-to-end TOD effectiveness of different base and instruction-tuned LLMs, with and without the constructed synthetic conversations. Finally, we explore how various LLMs can evaluate responses in a TOD system and how well they are correlated with human judgments. Our findings pave the path towards quick development and evaluation of domain-specific TOD systems. We release our datasets, models, and code for research purposes.
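The graph-guided generation step described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the state names, the `GRAPH` dictionary, and the `random_walk` helper are hypothetical and are not taken from SynTOD. Each sampled path would then be handed to an LLM to simulate the actual utterances, which is not shown here.

```python
import random

# Hypothetical state transition graph (not from the paper):
# each state maps to the states that are allowed to follow it.
GRAPH = {
    "start": ["greet"],
    "greet": ["ask_query"],
    "ask_query": ["show_results"],
    "show_results": ["ask_question", "select_item", "ask_query"],
    "ask_question": ["answer_question"],
    "answer_question": ["ask_question", "select_item"],
    "select_item": ["end"],
    "end": [],
}


def random_walk(graph, start="start", end="end", max_steps=20, seed=None):
    """Sample one conversation skeleton as a list of states."""
    rng = random.Random(seed)
    path = [start]
    # Stop when the end state is reached or the step budget is used up.
    while path[-1] != end and len(path) < max_steps:
        successors = graph[path[-1]]
        if not successors:  # dead end: no outgoing transitions
            break
        path.append(rng.choice(successors))
    return path


if __name__ == "__main__":
    print(random_walk(GRAPH, seed=0))
```

Different seeds give different paths, which is one simple way to get variety in the conversation structure before any text is generated.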


Rethinking E-Commerce Search

arXiv.org Artificial Intelligence

E-commerce search and recommendation usually operate on structured data such as product catalogs and taxonomies. However, creating better search and recommendation systems often requires a large variety of unstructured data, including customer reviews and articles on the web. Traditionally, the solution has been to convert unstructured data into structured data through information extraction and to conduct search over the structured data. However, this is a costly approach that often yields low-quality results. In this paper, we envision a solution that does entirely the opposite. Instead of converting unstructured data (web pages, customer reviews, etc.) to structured data, we convert structured data (product inventory, catalogs, taxonomies, etc.) into textual data, which can be easily integrated into the text corpus that trains LLMs. Then, search and recommendation can be performed through a Q/A mechanism via an LLM instead of using traditional information retrieval methods over structured data.
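As a rough illustration of turning structured data into text, the sketch below serializes one catalog record into sentences. The record layout (`name`, `category`, `attributes`) and the `record_to_text` function are assumptions made for this example; the abstract does not specify a schema or a template.

```python
def record_to_text(record):
    """Render a structured product record as plain sentences.

    The field names used here are hypothetical, chosen for illustration.
    """
    parts = [f"{record['name']} is a product in the {record['category']} category."]
    # Each attribute becomes its own short sentence.
    for key, value in record.get("attributes", {}).items():
        parts.append(f"Its {key} is {value}.")
    return " ".join(parts)


if __name__ == "__main__":
    example = {
        "name": "Trail Runner",
        "category": "Shoes",
        "attributes": {"color": "blue", "size": "42"},
    }
    print(record_to_text(example))
```

Text produced this way could be mixed into a training corpus alongside reviews and web pages, which is the direction the abstract describes.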


Generalized Multiple Intent Conditioned Slot Filling

arXiv.org Artificial Intelligence

Natural language understanding includes the tasks of intent detection (identifying a user's objectives) and slot filling (extracting the entities relevant to those objectives). Prior slot filling methods assume that each intent type cannot occur more than once within a message; however, this is often not a valid assumption in real-world settings. In this work, we generalize slot filling by removing the constraint of unique intents in a message. We cast this as a JSON generation task and approach it using a language model. We create a pre-training dataset by combining DBpedia and existing slot filling datasets that we convert for JSON generation. We also generate an in-domain dataset using GPT-3. We train T5 models for this task (with and without exemplars in the prompt) and find that both training datasets improve performance, and that the model is able to generalize to intent types not seen during training.


Order Matters at Fanatics Recommending Sequentially Ordered Products by LSTM Embedded with Word2Vec

arXiv.org Machine Learning

A unique challenge for e-commerce recommendation is that customers are often interested in products that are more advanced than the ones they have already purchased, but not the reverse. The few existing recommender systems that model unidirectional sequences output a limited number of categories or continuous variables. To model the ordered sequence, we design the first recommendation system that both embeds purchased items with Word2Vec and models the sequence with a stateless LSTM RNN. The click-through rate of this recommender system in production outperforms that of its solely Word2Vec-based predecessor. Developed in 2017, it was perhaps the first published real-world application that makes distributed predictions of a single-machine-trained Keras model on Spark slave nodes at a scale of more than 0.4 million columns per row.


Learning Invariant Representations for Sentiment Analysis: The Missing Material is Datasets

arXiv.org Machine Learning

Learning representations that remain invariant to a nuisance factor is of great interest in Domain Adaptation, Transfer Learning, and Fair Machine Learning. Finding such representations becomes highly challenging in NLP tasks since the nuisance factor is entangled in raw text. To our knowledge, a major issue is also that only a few NLP datasets allow assessing the impact of such a factor. In this paper, we introduce two generalization metrics to assess model robustness to a nuisance factor: generalization under target bias and generalization onto unknown. We combine those metrics with a simple data filtering approach to control the impact of the nuisance factor on the data and thus to build experimental biased datasets. We apply our method to standard datasets from the literature (Amazon and Yelp). Our work shows that a simple text classification baseline (i.e., sentiment analysis on reviews) may be badly affected by the product ID (considered as a nuisance factor) when learning the polarity of a review. The proposed method is generic and applicable as soon as the nuisance variable is annotated in the dataset.


Learning and Transferring IDs Representation in E-commerce

arXiv.org Machine Learning

Many machine intelligence techniques are developed in E-commerce, and one of the most essential components is the representation of IDs, including user ID, item ID, product ID, store ID, brand ID, category ID, etc. Classical encoding-based methods (like one-hot encoding) are inefficient in that they suffer from sparsity problems due to their high dimensionality, and they cannot reflect the relationships among IDs, either homogeneous or heterogeneous ones. In this paper, we propose an embedding-based framework to learn and transfer the representation of IDs. As implicit feedback from users, a tremendous amount of item ID sequences can be easily collected from interactive sessions. By jointly using these informative sequences and the structural connections among IDs, all types of IDs can be embedded into one low-dimensional semantic space. Subsequently, the learned representations are utilized and transferred in four scenarios: (i) measuring the similarity between items, (ii) transferring from seen items to unseen items, (iii) transferring across different domains, (iv) transferring across different tasks. We deploy and evaluate the proposed approach in Hema App and the results validate its effectiveness.
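The point about one-hot encoding not reflecting relationships among IDs can be checked with a few lines of arithmetic. The sketch below is an illustration only: the `one_hot` and `cosine` helpers are written for this example, and the two-dimensional "embedding" vectors are made-up numbers, not learned representations from the paper.

```python
import math


def one_hot(index, size):
    """Return a one-hot vector of the given size with a 1 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Any two distinct one-hot IDs are orthogonal, so similarity is always 0.
    print(cosine(one_hot(0, 4), one_hot(2, 4)))

    # Hypothetical dense vectors (invented values) can express graded similarity.
    item_a, item_b, item_c = [0.9, 0.1], [0.8, 0.2], [0.1, 0.9]
    print(cosine(item_a, item_b) > cosine(item_a, item_c))
```

With one-hot vectors every pair of different IDs looks equally unrelated, while dense vectors leave room for some items to be closer than others.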


Building a recommendation engine with AWS Data Pipeline, Elastic MapReduce and Spark

#artificialintelligence

From Google's advertisements to Amazon's product suggestions, recommendation engines are everywhere. As users of smart internet services, we've become so accustomed to seeing things we like. This blog post is an overview of how we built a product recommendation engine for Hubba. I'll start with an explanation of different types of recommenders and how we went about the selection process. Then I'll cover our AWS solution before diving into some implementation details. Content-based recommenders use discrete properties of an item, such as its tags.


How to build a Market Basket Analysis Engine

@machinelearnbot

A market basket analysis or recommendation engine [1] is what is behind all these recommendations we get when we go shopping online or whenever we receive targeted advertising. The underlying engine collects information about people's habits and knows that if people buy pasta and wine, they are usually also interested in pasta sauces. So, the next time you go to the supermarket and buy pasta and wine, be ready to get a recommendation for some pasta sauce! A typical analysis goal when applying market basket analysis is to produce a set of association rules in the following form: IF {pasta, wine, garlic} THEN pasta-sauce. The first part of the rule is called the "antecedent", the second part is called the "consequent". A few measures, such as support, confidence, and lift, define how reliable each rule is.
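To make the three measures concrete, here is a minimal sketch that computes them for the pasta rule using their standard definitions: support is the share of baskets containing both antecedent and consequent, confidence is that count divided by the number of baskets containing the antecedent, and lift is confidence divided by the share of baskets containing the consequent. The `rule_metrics` function and the five toy baskets are invented for illustration and do not come from the post.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    # Count baskets containing the antecedent, the consequent, and both.
    n_ante = sum(1 for t in transactions if antecedent <= set(t))
    n_cons = sum(1 for t in transactions if consequent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / n
    confidence = n_both / n_ante if n_ante else 0.0
    lift = confidence / (n_cons / n) if n_cons else 0.0
    return support, confidence, lift


if __name__ == "__main__":
    # Toy baskets, made up for this example.
    baskets = [
        {"pasta", "wine", "garlic", "pasta-sauce"},
        {"pasta", "wine", "garlic", "pasta-sauce"},
        {"pasta", "wine", "garlic"},
        {"bread", "pasta-sauce"},
        {"bread", "milk"},
    ]
    support, confidence, lift = rule_metrics(
        baskets, {"pasta", "wine", "garlic"}, {"pasta-sauce"}
    )
    print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the consequent shows up more often with the antecedent than it does overall; a lift near 1 suggests the rule adds little.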