AITopics | Chowdhury, Shihabur Rahman

Collaborating Authors

Chowdhury, Shihabur Rahman

Incremental IVF Index Maintenance for Streaming Vector Search

Mohoney, Jason, Pacaci, Anil, Chowdhury, Shihabur Rahman, Minhas, Umar Farooq, Pound, Jeffrey, Renggli, Cedric, Reyhani, Nima, Ilyas, Ihab F., Rekatsinas, Theodoros, Venkataraman, Shivaram

arXiv.org Artificial Intelligence, Nov-1-2024

The prevalence of vector similarity search in modern machine learning applications and the continuously changing nature of data processed by these applications necessitate efficient and effective index maintenance techniques for vector search indexes. Designed primarily for static workloads, existing vector search indexes degrade in search quality and performance as the underlying data is updated unless costly index reconstruction is performed. To address this, we introduce Ada-IVF, an incremental indexing methodology for Inverted File (IVF) indexes. Ada-IVF consists of 1) an adaptive maintenance policy that decides which index partitions are problematic for performance and should be repartitioned and 2) a local re-clustering mechanism that determines how to repartition them.

IVF indexes out-of-the-box do not have the notion of inserting new vectors or deleting existing vectors once constructed. Indeed, the most common method used by practitioners today is to rebuild the index from scratch to reflect any updates that have accumulated over time. However, depending on the scale of the vector dataset and the volume and frequency of updates, a full index rebuild can be prohibitively expensive. For example, it takes multiple days to rebuild an IVF index from scratch for billion-scale vector datasets [21, 69], making it necessary to revisit how updates can be reflected. Devising such an update mechanism consists of readjusting the partitioning of the high-dimensional space defined by …
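To make the idea of incremental IVF maintenance concrete, here is a minimal sketch in Python. It is not Ada-IVF: the class `TinyIVF`, the size threshold `max_size`, and the 2-means split are all assumptions chosen for illustration. It only shows the general shape of the approach, where an insert goes to the nearest partition and a problematic partition is re-clustered locally instead of rebuilding the whole index.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


class TinyIVF:
    """Toy IVF index: each partition is a centroid plus a list of vectors."""

    def __init__(self, centroids, max_size=4):
        self.centroids = [list(c) for c in centroids]
        self.lists = [[] for _ in centroids]
        # Hypothetical maintenance trigger (partition size); an assumption,
        # not the adaptive policy described in the abstract.
        self.max_size = max_size

    def insert(self, v):
        """Append v to its nearest partition; split that partition if too big."""
        p = min(range(len(self.centroids)),
                key=lambda i: dist2(v, self.centroids[i]))
        self.lists[p].append(list(v))
        if len(self.lists[p]) > self.max_size:
            self.split(p)

    def split(self, p, iters=5):
        """Local re-clustering: run 2-means over one partition only."""
        vecs = self.lists[p]
        c0 = vecs[0]
        c1 = max(vecs, key=lambda v: dist2(v, c0))  # farthest point as 2nd seed
        for _ in range(iters):
            g0 = [v for v in vecs if dist2(v, c0) <= dist2(v, c1)]
            g1 = [v for v in vecs if dist2(v, c0) > dist2(v, c1)]
            if not g0 or not g1:
                return  # degenerate case (e.g. duplicates): leave untouched
            c0, c1 = mean(g0), mean(g1)
        # Replace the old partition with one half and append the other half.
        self.centroids[p], self.lists[p] = c0, g0
        self.centroids.append(c1)
        self.lists.append(g1)

    def search(self, q, k=1, nprobe=1):
        """Scan the nprobe partitions closest to q and return the top-k vectors."""
        order = sorted(range(len(self.centroids)),
                       key=lambda i: dist2(q, self.centroids[i]))
        candidates = [v for i in order[:nprobe] for v in self.lists[i]]
        return sorted(candidates, key=lambda v: dist2(q, v))[:k]
```

Only the partition that crossed the threshold is touched during a split, so the cost of maintenance scales with that partition's size and not with the size of the whole index.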

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2411.0097

Country:

North America > United States > Massachusetts (0.28)
North America > United States > California (0.28)

Genre:

Overview (0.67)
Research Report (0.64)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

High-Throughput Vector Similarity Search in Knowledge Graphs

Mohoney, Jason, Pacaci, Anil, Chowdhury, Shihabur Rahman, Mousavi, Ali, Ilyas, Ihab F., Minhas, Umar Farooq, Pound, Jeffrey, Rekatsinas, Theodoros

arXiv.org Artificial Intelligence, Apr-4-2023

There is an increasing adoption of machine learning for encoding data into vectors to serve online recommendation and search use cases. As a result, recent data management systems propose augmenting query processing with online vector similarity search. In this work, we explore vector similarity search in the context of Knowledge Graphs (KGs). Motivated by the tasks of finding related KG queries and entities for past KG query workloads, we focus on hybrid vector similarity search (hybrid queries for short) where part of the query corresponds to vector similarity search and part of the query corresponds to predicates over relational attributes associated with the underlying data vectors. For example, given past KG queries for a song entity, we want to construct new queries for new song entities whose vector representations are close to the vector representation of the entity in the past KG query. But entities in a KG also have non-vector attributes such as a song associated with an artist, a genre, and a release date. Therefore, suggested entities must also satisfy query predicates over non-vector attributes beyond a vector-based similarity predicate. While these tasks are central to KGs, our contributions are generally applicable to hybrid queries. In contrast to prior works that optimize online queries, we focus on enabling efficient batch processing of past hybrid query workloads. We present our system, HQI, for high-throughput batch processing of hybrid queries. We introduce a workload-aware vector data partitioning scheme to tailor the vector index layout to the given workload and describe a multi-query optimization technique to reduce the overhead of vector similarity computations. We evaluate our methods on industrial workloads and demonstrate that HQI yields a 31x improvement in throughput for finding related KG queries compared to existing hybrid query processing approaches.
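A hybrid query, as described in the abstract, combines a predicate over non-vector attributes with a vector similarity ranking. The Python sketch below illustrates that combination for a batch of queries. It is not HQI: the function `hybrid_batch`, the data layout, and the grouping of queries by identical predicate are assumptions made for illustration, standing in for the much more elaborate partitioning and multi-query optimization the paper proposes.

```python
from collections import defaultdict


def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hybrid_batch(items, queries, k=1):
    """Answer a batch of hybrid queries by brute force.

    items:   list of (vector, attrs) pairs, attrs being a dict.
    queries: list of (vector, predicate) pairs, predicate being a tuple of
             (attribute, required_value) pairs, all of which must match.
    Returns, for each query, the indices of its k nearest matching items.
    """
    # Group queries that share a predicate so each filter runs once per batch.
    groups = defaultdict(list)
    for qi, (_, pred) in enumerate(queries):
        groups[pred].append(qi)

    results = [None] * len(queries)
    for pred, qids in groups.items():
        matching = [i for i, (_, attrs) in enumerate(items)
                    if all(attrs.get(a) == val for a, val in pred)]
        for qi in qids:
            qvec = queries[qi][0]
            ranked = sorted(matching, key=lambda i: dist2(qvec, items[i][0]))
            results[qi] = ranked[:k]
    return results


# Hypothetical song entities with a 2-d embedding and a genre attribute.
songs = [
    ([0.0, 0.0], {"genre": "rock"}),
    ([1.0, 1.0], {"genre": "jazz"}),
    ([5.0, 5.0], {"genre": "rock"}),
]
batch = [
    ([1.0, 1.0], (("genre", "rock"),)),
    ([4.0, 4.0], (("genre", "rock"),)),
    ([0.0, 0.0], (("genre", "jazz"),)),
]
answers = hybrid_batch(songs, batch, k=1)
```

In this batch the first two queries share the predicate `genre = rock`, so the attribute filter is evaluated once for both. The first query's vector is identical to the jazz song's vector, yet that song is excluded because it fails the predicate.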

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2304.01926

Country: North America > United States (0.68)

Genre:

Research Report (0.50)
Overview (0.46)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
