pinecone
RAGtifier: Evaluating RAG Generation Approaches of State-of-the-Art RAG Systems for the SIGIR LiveRAG Competition
Cofala, Tim, Astappiev, Oleh, Xion, William, Teklehaymanot, Hailay
Retrieval-Augmented Generation (RAG) enriches Large Language Models (LLMs) by combining their internal, parametric knowledge with external, non-parametric sources, with the goal of improving factual correctness and minimizing hallucinations. The LiveRAG 2025 challenge explores RAG solutions that maximize accuracy on DataMorgana's QA pairs, which are composed of single-hop and multi-hop questions. The challenge provides access to sparse OpenSearch and dense Pinecone indices of the Fineweb 10BT dataset, restricts model use to LLMs with at most 10B parameters, and requires final answers to be generated with Falcon-3-10B. A judge-LLM assesses the submitted answers alongside human evaluators. After exploring distinct retriever combinations and RAG solutions under the challenge conditions, our final solution combines InstructRAG with a Pinecone retriever and a BGE reranker. It achieved a correctness score of 1.13 and a faithfulness score of 0.55 in the non-human evaluation, placing third overall in the SIGIR 2025 LiveRAG Challenge.
- North America > United States > District of Columbia > Washington (0.05)
- Europe > Germany > Lower Saxony > Hanover (0.05)
- Asia > Singapore (0.04)
- (7 more...)
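The retrieve-rerank-generate pipeline described in the abstract above can be sketched in miniature. Everything here is a hypothetical stand-in: the toy lexical retriever stands in for the Pinecone dense index, the length-normalized overlap score for the BGE reranker, and the string-formatting function for Falcon-3-10B generation.

```python
def retrieve(query, corpus, k=3):
    """Toy retriever: rank documents by raw query-term overlap."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

def rerank(query, docs):
    """Toy reranker: re-score candidates by overlap normalized
    by document length, standing in for a cross-encoder score."""
    q_terms = set(query.lower().split())
    return sorted(docs,
                  key=lambda d: -len(q_terms & set(d.lower().split())) / len(d.split()))

def generate(query, docs):
    """Stand-in for LLM answer generation: prepend the top context."""
    return f"Answer to {query!r} grounded in: {docs[0]}"

corpus = [
    "Pinecone provides a dense vector index over Fineweb.",
    "OpenSearch offers sparse keyword retrieval.",
    "Falcon-3-10B generates the final answer.",
]
docs = rerank("dense vector index", retrieve("dense vector index", corpus))
print(generate("dense vector index", docs))
```

A real system would replace each stage with its production counterpart while keeping the same three-stage control flow.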
PBFT-Backed Semantic Voting for Multi-Agent Memory Pruning
The proliferation of multi-agent systems (MAS) in complex, dynamic environments necessitates robust and efficient mechanisms for managing shared knowledge. A critical challenge is ensuring that distributed memories remain synchronized, relevant, and free from the accumulation of outdated or inconsequential data - a process analogous to biological forgetting. This paper introduces the Co-Forgetting Protocol, a novel, comprehensive framework designed to address this challenge by enabling synchronized memory pruning in MAS. The protocol integrates three key components: (1) context-aware semantic voting, where agents utilize a lightweight DistilBERT model to assess the relevance of memory items based on their content and the current operational context; (2) multi-scale temporal decay functions, which assign diminishing importance to memories based on their age and access frequency across different time horizons; and (3) a Practical Byzantine Fault Tolerance (PBFT)-based consensus mechanism, ensuring that decisions to retain or discard memory items are agreed upon by a qualified and fault-tolerant majority of agents, even in the presence of up to f Byzantine (malicious or faulty) agents in a system of N ≥ 3f + 1 agents. The protocol leverages gRPC for efficient inter-agent communication and Pinecone for scalable vector embedding storage and similarity search, with SQLite managing metadata. Experimental evaluations in a simulated MAS environment with four agents demonstrate the protocol's efficacy, achieving a 52% reduction in memory footprint over 500 epochs, 88% voting accuracy in forgetting decisions against human-annotated benchmarks, a 92% PBFT consensus success rate under simulated Byzantine conditions, and an 82% cache hit rate for memory access.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > Canada > Ontario > Toronto (0.04)
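Two of the quantitative pieces in the abstract above lend themselves to a minimal sketch: the PBFT fault bound for N ≥ 3f + 1 agents, and a multi-scale temporal decay score. The horizon and weight values below are made-up parameters for illustration, not values from the paper.

```python
import math

def max_faulty(n):
    """Largest number of Byzantine agents f tolerable with n >= 3f + 1."""
    return (n - 1) // 3

def quorum(n):
    """Votes needed for a PBFT decision: 2f + 1."""
    return 2 * max_faulty(n) + 1

def decay_score(age_s, access_freq,
                horizons=(3600, 86400, 604800),   # hour, day, week (seconds)
                weights=(0.5, 0.3, 0.2)):
    """Importance of a memory item: access frequency damped by
    exponential decay over several time horizons."""
    return access_freq * sum(w * math.exp(-age_s / tau)
                             for w, tau in zip(weights, horizons))

print(quorum(4))  # 4 agents tolerate f=1, so 3 votes decide
```

The four-agent experimental setup mentioned above is exactly the smallest configuration tolerating one Byzantine agent.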
Comparative Analysis of Retrieval Systems in the Real World
Mozolevskyi, Dmytro, AlShikh, Waseem
This research paper presents a comprehensive analysis of integrating advanced language models with search and retrieval systems in the fields of information retrieval and natural language processing. The objective is to evaluate and compare various state-of-the-art methods based on their performance in terms of accuracy and efficiency. The analysis explores different combinations of technologies, including Azure Cognitive Search Retriever with GPT-4, Pinecone's Canopy framework, Langchain with Pinecone and different language models (OpenAI, Cohere), LlamaIndex with Weaviate Vector Store's hybrid search, Google's RAG implementation on Cloud VertexAI-Search, Amazon SageMaker's RAG, and a novel approach called KG-FID Retrieval. The motivation for this analysis arises from the increasing demand for robust and responsive question-answering systems in various domains. The RobustQA metric is used to evaluate the performance of these systems under diverse paraphrasing of questions. The report aims to provide insights into the strengths and weaknesses of each method, facilitating informed decisions in the deployment and development of AI-driven search and retrieval systems.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.74)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.57)
Everything you need to know about BabyAGI - TechStory
In recent months, we have seen the emergence and proliferation of several artificial intelligence systems worldwide, such as OpenAI's ChatGPT, GPT-4, and Google's Bard. Microsoft's new Bing and Baidu's Ernie Bot have also entered the scene. Joining this group of AI systems is a newcomer known as BabyAGI. BabyAGI is an innovative AI platform designed to train and evaluate various AI agents in a simulated environment. The AI is a pared-down version of the original Task-Driven Autonomous Agent developed and launched by VC and AI expert Yohei Nakajima.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.74)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.43)
Generative Question-Answering with Long-Term Memory
Generative AI sparked several "wow" moments in 2022. It's hardly surprising that Generative AI is experiencing a boom in interest and innovation [1]. Yet this marks just the first year of generative AI's widespread adoption: the early days of a new field poised to disrupt how we interact with machines. One of the most thought-provoking use cases belongs to Generative Question-Answering (GQA).
Global Big Data Conference
Many, if not most, search engines in use today are based on keywords, in which the search engine attempts to find the best match for a word or set of words used as input. It's a tried-and-true method that has been deployed millions of times over decades of use. But new search approaches based on deep learning, including vector search and neural search, have emerged recently, and early backers say they have the potential to shake up the search market. Vector search uses a fundamentally different approach to finding the best fit between a term provided as input to the engine and the result that is presented to the user. Instead of powering the search by doing a direct one-to-one matching of keywords, in vector search, the engine attempts to match the input term to a vector, which is an array of features generated from objects in the catalog.
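The core idea described above, matching an input against feature vectors rather than keywords, can be shown in a few lines. The three-dimensional "embeddings" below are toy values; a real catalog would use vectors from a trained model and an approximate-nearest-neighbor index rather than a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Toy catalog: each object is represented by a small feature vector.
catalog = {
    "running shoes":  (0.9, 0.1, 0.0),
    "trail sneakers": (0.7, 0.3, 0.0),
    "office chair":   (0.0, 0.1, 0.9),
}

# The query is itself a vector; the best match is the nearest vector,
# not a keyword hit.
query = (0.85, 0.15, 0.05)
best = max(catalog, key=lambda name: cosine(query, catalog[name]))
print(best)  # running shoes
```

Note that "trail sneakers" still scores highly despite sharing no keywords with "running shoes" in a literal sense, which is the behavior keyword search cannot provide.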
Pinecone CEO on bringing vector similarity search to dev teams
All the sessions from Transform 2021 are available on-demand now. The traditional way for a database to answer a query is with a list of rows that fit the criteria. If there's any sorting, it's done by one field at a time. Vector similarity search looks for matches by comparing the likeness of objects, as captured by machine learning models. Vector similarity search is particularly useful with real-world data because that data is often unstructured and contains similar yet not identical items.
Movie Recommender System With a Deep Ranking Model (Example)
Let's create a movie recommender based on ratings. In this example we have a collection of movies, a bunch of users, and movie ratings from users that range from 1 to 5. These ratings are sparse because each user rates only a small percentage of the total movies, and they are biased because users' ratings are distributed differently. Our goal is to take any user ID and search for recommended movies for that user. We will use Pinecone to tie everything together and expose the recommender as a real-time service that will take any user ID and return relevant movie recommendations.
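The ranking step described above can be sketched with a toy version: represent each user and movie as a small latent vector (in the real example these come from a trained deep ranking model, with movie vectors stored in Pinecone), predict a score with a dot product, and return the highest-scoring movies the user has not yet rated. All vectors and titles here are made up for illustration.

```python
# Toy latent factors; dimensions might loosely mean (sci-fi, romance).
movies = {
    "Alien":   (0.9, 0.1),
    "Solaris": (0.8, 0.2),
    "Amelie":  (0.1, 0.9),
}

def recommend(user_vec, rated, k=2):
    """Score unrated movies by dot product with the user vector
    and return the top k, highest score first."""
    scores = {title: sum(u * f for u, f in zip(user_vec, vec))
              for title, vec in movies.items() if title not in rated}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# A sci-fi-leaning user who already rated "Alien".
print(recommend((1.0, 0.2), rated={"Alien"}))
```

A production service would replace the dictionary scan with a Pinecone similarity query over millions of movie vectors, but the scoring logic is the same dot product.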