milvus
A Generative Caching System for Large Language Models
Iyengar, Arun, Kundu, Ashish, Kompella, Ramana, Mamidi, Sai Nandan
Caching has the potential to be of significant benefit for accessing large language models (LLMs) due to their high latencies, which typically range from a few seconds to well over a minute. Furthermore, many LLMs charge money for queries; caching thus has a clear monetary benefit. This paper presents a new caching system for improving user experiences with LLMs. In addition to reducing both latencies and monetary costs for accessing LLMs, our system also provides important features that go beyond the performance benefits typically associated with caches. A key feature we provide is generative caching, wherein multiple cached responses can be synthesized to provide answers to queries which have never been seen before. Our generative caches function as repositories of valuable information which can be mined and analyzed. We also improve upon past semantic caching techniques by tailoring the caching algorithms to optimally balance cost and latency reduction with the quality of responses provided. Performance tests indicate that our caches are considerably faster than GPTCache.
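The idea of a semantic cache that can also combine several cached responses can be illustrated with a minimal sketch. This is not the paper's algorithm: the bag-of-words `embed` function, the two thresholds, and the `SemanticCache` class are all hypothetical stand-ins, and "synthesis" is reduced to joining the matching responses, where a real system would presumably use an embedding model and a smarter way to merge answers.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (assumed stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical cache: both thresholds are illustrative, not from the paper."""

    def __init__(self, hit_threshold=0.9, synth_threshold=0.5):
        self.hit_threshold = hit_threshold      # close enough to reuse one answer
        self.synth_threshold = synth_threshold  # related enough to contribute
        self.entries = []                       # list of (embedding, response)

    def put(self, query, response):
        self.entries.append((embed(query), response))

    def get(self, query):
        q = embed(query)
        scored = sorted(
            ((cosine(q, e), r) for e, r in self.entries),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Ordinary semantic hit: one cached query is similar enough.
        if scored and scored[0][0] >= self.hit_threshold:
            return "hit", scored[0][1]
        # Generative case: combine several partially matching responses.
        partial = [r for s, r in scored if s >= self.synth_threshold]
        if len(partial) >= 2:
            return "synthesized", " ".join(partial)
        # Otherwise the caller would fall back to the LLM.
        return "miss", None


cache = SemanticCache()
cache.put("milvus index types", "Milvus supports several index types.")
cache.put("milvus distance metrics", "Milvus supports several distance metrics.")

status, answer = cache.get("milvus index types and distance metrics")
miss_status, miss_answer = cache.get("weather in paris")
```

The two thresholds make the cost versus quality trade-off explicit: raising `hit_threshold` sends more queries to the LLM but lowers the risk of returning a cached answer to a different question.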
Scalable Vector Search for AI Apps with Milvus and Databricks
Multi-modal embeddings are all the rage these days. Everyone wants a piece of them because they give you a way to convert unstructured data to representations that are useful for understanding the semantic nature of unstructured assets -- across image, text, audio, video, etc. These representations are vectors that can be used for a variety of purposes across use cases which require models for image similarity, deduplication, anomaly detection, text similarity, audio classification, video understanding, etc. To top that off, you don't have to be a data scientist with deep ML expertise to build these systems, nor do you need to have large amounts of data to start leveraging them. This is fine until you run into actual "hands on the keyboard" work for production.
Building an Intelligent QA System With NLP and Milvus
Question answering (QA) systems are widely used in natural language processing. They answer questions posed in natural language and have a wide range of applications, including intelligent voice interaction, online customer service, knowledge acquisition, and personalized emotional chatting. QA systems can be classified along several axes: generative versus retrieval-based, single-round versus multi-round, and open-domain versus domain-specific. This article mainly deals with a QA system designed for a specific field, which is usually called an intelligent customer service robot.
Building a Deep-Learning-Based Movie Recommender System
With the continuous development of network technology and the ever-expanding scale of e-commerce, the number and variety of goods are growing rapidly, and users need to spend a lot of time finding the goods they want to buy. Recommendation systems emerged to solve this problem. A recommendation system is a kind of information filtering system and can be used in areas such as movies, music, e-commerce, and feed-stream recommendations. It discovers a user's personalized needs and interests by analyzing and mining user behavior, and recommends information or products that may be of interest to that user. Unlike search engines, recommendation systems do not require users to describe their needs accurately; instead, they model users' historical behavior to proactively provide information that meets their interests and needs.
r/MachineLearning - [P] Milvus: A big leap to scalable AI search engine
The explosion in unstructured data, such as images, videos, sound recordings, and text, calls for effective solutions in computer vision, voice recognition, and natural language processing. Extracting value from unstructured data poses a big challenge for many enterprises. AI, especially deep learning, has proven to be an effective solution. Vectorizing data features enables content-based search on unstructured data. For example, you can perform content-based image retrieval, including facial recognition and object detection.
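Content-based search over vectorized features can be sketched as a brute-force nearest-neighbor lookup. The example below does not use the Milvus API: the three-dimensional feature vectors and the item names are made up for illustration, and a real engine would use much higher-dimensional vectors with approximate indexes instead of scanning every item.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, v), key) for key, v in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Hypothetical feature vectors, e.g. produced by an image model.
image_vectors = {
    "cat1": [0.9, 0.1, 0.0],
    "cat2": [0.8, 0.2, 0.1],
    "car1": [0.0, 0.1, 0.9],
}

# A query vector that is closest to the two "cat" images.
result = top_k([1.0, 0.0, 0.0], image_vectors, k=2)
```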