AITopics

Graph databases (GDBs) like Neo4j and TigerGraph excel at handling interconnected data but lack advanced inference capabilities. Neural Graph Databases (NGDBs) address this by integrating Graph Neural Networks (GNNs) for predictive analysis and reasoning over incomplete or noisy data. However, NGDBs rely on predefined queries and lack autonomy and adaptability. This paper introduces Agentic Neural Graph Databases (Agentic NGDBs), which extend NGDBs with three core functionalities: autonomous query construction, neural query execution, and continuous learning. We identify ten key challenges in realizing Agentic NGDBs, including semantic unit representation, abductive reasoning, scalable query execution, and integration with foundation models like large language models (LLMs). By addressing these challenges, Agentic NGDBs can enable intelligent, self-improving systems for modern data-driven applications, paving the way for adaptable and autonomous data management solutions.

large language model, machine learning, natural language, (16 more...)

2501.14224

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(8 more...)

Genre:

Overview (0.46)
Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Nascimento, Eduardo R., Avila, Caio Viktor S., Izquierdo, Yenier T., García, Grettel M., Andrade, Lucas Feijó L., Facina, Michelle S. P., Lemos, Melissa, Casanova, Marco A.

Text-to-SQL based on Large Language Models and Database Keyword Search

Text-to-SQL prompt strategies based on Large Language Models (LLMs) achieve remarkable performance on well-known benchmarks. However, when applied to real-world databases, their performance is significantly lower than on these benchmarks, especially for Natural Language (NL) questions that require complex filters and joins. This paper proposes a strategy to compile NL questions into SQL queries that incorporates a dynamic few-shot examples strategy and leverages the services provided by a database keyword search (KwS) platform. The paper details how the precision and recall of the schema-linking process are improved with the help of the examples provided and the keyword-matching service that the KwS platform offers. Then, it shows how the KwS platform can be used to synthesize a view that captures the joins required to process an input NL question and thereby simplify the SQL query compilation step. The paper includes experiments with a real-world relational database to assess the performance of the proposed strategy. The experiments suggest that the strategy achieves an accuracy on the real-world relational database that surpasses state-of-the-art approaches. The paper concludes by discussing the results obtained.
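As a rough illustration of the keyword-matching idea behind schema linking, the sketch below matches the words of an NL question against table names, column names, and stored values. The toy schema, its values, and the function name `link_schema` are hypothetical; they are not taken from the paper or from any KwS platform.

```python
import re

# Hypothetical toy schema: table -> {column: sample values}. Not from the paper.
SCHEMA = {
    "well": {"name": ["Alpha-1", "Beta-2"], "field_id": ["F1", "F2"]},
    "field": {"id": ["F1", "F2"], "basin": ["Campos", "Santos"]},
}

def link_schema(question, schema):
    """Return the (table, column) pairs whose table name, column name,
    or stored values match a word of the question."""
    words = set(re.findall(r"[a-z0-9\-]+", question.lower()))
    linked = set()
    for table, columns in schema.items():
        for column, values in columns.items():
            # A hit on the table name links every column of that table.
            name_hit = table in words or column in words
            value_hit = any(str(v).lower() in words for v in values)
            if name_hit or value_hit:
                linked.add((table, column))
    return linked

links = link_schema("Which well is in the Santos basin?", SCHEMA)
```

In this toy run the word "well" links both columns of `well`, and "basin" and "Santos" link `field.basin`. A prompt builder could then include only the linked tables and columns instead of the full schema.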

information retrieval, large language model, machine learning, (20 more...)

2501.13594

Country:

South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States (0.04)
Europe > Switzerland (0.04)
(7 more...)

Genre:

Research Report > Experimental Study (0.48)
Research Report > New Finding (0.46)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multi-Level Attention and Contrastive Learning for Enhanced Text Classification with an Optimized Transformer

Gao, Jia, Liu, Guiran, Zhu, Binrong, Zhou, Shicheng, Zheng, Hongye, Liao, Xiaoxuan

This paper studies a text classification algorithm based on an improved Transformer to improve the performance and efficiency of the model in text classification tasks. Aiming at the shortcomings of the traditional Transformer model in capturing deep semantic relationships and optimizing computational complexity, this paper introduces a multi-level attention mechanism and a contrastive learning strategy. The multi-level attention mechanism effectively models the global semantics and local features in the text by combining global attention with local attention; the contrastive learning strategy enhances the model's ability to distinguish between different categories by constructing positive and negative sample pairs while improving the classification effect. In addition, in order to improve the training and inference efficiency of the model on large-scale text data, this paper designs a lightweight module to optimize the feature transformation process and reduce the computational cost. Experimental results on the dataset show that the improved Transformer model outperforms the comparative models such as BiLSTM, CNN, standard Transformer, and BERT in terms of classification accuracy, F1 score, and recall rate, showing stronger semantic representation ability and generalization performance. The method proposed in this paper provides a new idea for algorithm optimization in the field of text classification and has good application potential and practical value. Future work will focus on studying the performance of this model in multi-category imbalanced datasets and cross-domain tasks and explore the integration wi
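The abstract does not state which contrastive objective is used. As a minimal sketch of learning from positive and negative sample pairs, the code below uses one common form, an InfoNCE-style loss over cosine similarities. The function names, the temperature value, and the toy vectors are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: small when the anchor is closer to its
    positive than to every negative, large otherwise."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

anchor = [1.0, 0.0]
# Positive from the same class points the same way; negatives do not.
good = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
# Swapping the roles makes the "positive" dissimilar, so the loss grows.
bad = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

Minimizing a loss of this shape pulls same-class representations together and pushes different-class representations apart, which is the effect the abstract attributes to the contrastive strategy.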

information retrieval, machine learning, natural language, (16 more...)

2501.13467

Country:

North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > New York (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Media (0.46)
Information Technology (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.88)

Ikeda, Taiga, Miyashita, Daisuke, Deguchi, Jun

On Storage Neural Network Augmented Approximate Nearest Neighbor Search

artificial intelligence, augmented approximate nearest neighbor search, machine learning, (2 more...)

Large-scale approximate nearest neighbor (ANN) search has been gaining attention along with recent machine learning research that employs it. If the data is too large to fit in memory, the vectors most similar to a given query vector must be searched for in data held on storage devices rather than in memory. Storage devices such as NAND flash memory have larger capacity than memory devices such as DRAM, but they also have higher read latency. Therefore, ANN methods for storage require completely different approaches from conventional in-memory ANN methods. Since the approximation that search time is determined only by the amount of data fetched from storage holds under reasonable assumptions, our goal is to minimize that amount while maximizing recall. For partitioning-based ANN methods, vectors are partitioned into clusters in the index building phase. In the search phase, some of the clusters are chosen, the vectors in the chosen clusters are fetched from storage, and the nearest vector is retrieved from the fetched vectors. Thus, the key point is to accurately select the clusters containing the ground truth nearest neighbor vectors. We accomplish this by proposing a method to predict the correct clusters by means of a neural network that is gradually refined by alternating supervised learning and duplicated cluster assignment. Compared to state-of-the-art SPANN and an exhaustive method using k-means clustering and linear search, the proposed method achieves 90% recall on SIFT1M with 80% and 58% less data fetched from storage, respectively.
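The two phases of partitioning-based search described above can be sketched as follows. This is a baseline sketch only: clusters are chosen by distance to fixed centroids, whereas the paper predicts them with a neural network. The function names, the `n_probe` parameter, and the toy data are assumptions for illustration.

```python
def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def build_index(vectors, centroids):
    """Index building: assign each vector id to its nearest centroid."""
    clusters = {i: [] for i in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: sq_dist(vec, centroids[c]))
        clusters[nearest].append(vid)
    return clusters

def search(query, vectors, centroids, clusters, n_probe=1):
    """Search: probe the n_probe closest clusters and scan only their
    vectors. Returns (best_id, number_of_vectors_fetched)."""
    order = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))
    candidates = [vid for c in order[:n_probe] for vid in clusters[c]]
    best = min(candidates, key=lambda vid: sq_dist(query, vectors[vid]),
               default=None)
    return best, len(candidates)

vectors = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]]
centroids = [[0.5, 0.0], [10.5, 10.0]]
clusters = build_index(vectors, centroids)
one_probe = search([0.9, 0.1], vectors, centroids, clusters, n_probe=1)
two_probes = search([0.9, 0.1], vectors, centroids, clusters, n_probe=2)
```

Both calls find the same nearest vector, but the second fetches twice as many vectors. The count of fetched vectors stands in for the amount of data read from storage, which is the cost the abstract aims to minimize.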

2501.16375

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.80)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.53)

Dual-Branch HNSW Approach with Skip Bridges and LID-Driven Optimization

Nguyen, Hy, Nguyen, Nguyen Hung, Nguyen, Nguyen Linh Bao, Thudumu, Srikanth, Du, Hung, Vasa, Rajesh, Mouzakis, Kon

The Hierarchical Navigable Small World (HNSW) algorithm is widely used for approximate nearest neighbor (ANN) search, leveraging the principles of navigable small-world graphs. However, it faces some limitations. The first is the local optima problem, which arises from the algorithm's greedy search strategy, selecting neighbors based solely on proximity at each step. This often leads to cluster disconnections. The second limitation is that HNSW frequently fails to achieve logarithmic complexity, particularly in high-dimensional datasets, due to the exhaustive traversal through each layer. To address these limitations, we propose a novel algorithm that mitigates local optima and cluster disconnections while improving construction speed and maintaining inference speed. The first component is a dual-branch HNSW structure with LID-based insertion mechanisms, enabling traversal from multiple directions. This improves outlier node capture, enhances cluster connectivity, accelerates construction, and reduces the risk of local minima. The second component incorporates a bridge-building technique that bypasses redundant intermediate layers, maintaining inference speed and making up for the additional computational overhead introduced by the dual-branch structure. Experiments on various benchmarks and datasets showed that our algorithm outperforms the original HNSW in both accuracy and speed. We evaluated six datasets across Computer Vision (CV) and Natural Language Processing (NLP), showing recall improvements of 18% in NLP and up to 30% in CV tasks while reducing the construction time by up to 20% and maintaining the inference speed. We did not observe any trade-offs in our algorithm. Ablation studies revealed that LID-based insertion had the greatest impact on performance, followed by the dual-branch structure and bridge-building components.
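The local optima problem can be seen in a small example. The sketch below runs a greedy search on a single-layer toy graph, which is a simplification of HNSW rather than the authors' algorithm. The graph, the point coordinates, and the function name `greedy_search` are hypothetical.

```python
import math

def greedy_search(graph, points, query, start):
    """Repeatedly move to the neighbor closest to the query; stop when
    no neighbor is closer than the current node."""
    current = start
    while True:
        best = min(graph[current],
                   key=lambda n: math.dist(points[n], query),
                   default=current)
        if math.dist(points[best], query) >= math.dist(points[current], query):
            return current
        current = best

# Node 3 is the true nearest neighbor of the query, but it is only
# reachable through node 2, which initially looks worse than node 1.
points = {0: (0.0, 0.0), 1: (6.0, 0.0), 2: (3.0, 0.0), 3: (10.0, 0.0)}
graph = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
query = (10.0, 0.0)

stuck = greedy_search(graph, points, query, start=0)
found = greedy_search(graph, points, query, start=2)
```

Starting from node 0, the search steps to node 1 and stops there because node 1 has no closer neighbor. Starting from node 2, it reaches node 3. Entering the graph from more than one direction, as the dual-branch design is described as doing, is one way to make the first outcome less likely.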

information retrieval, machine learning, natural language, (21 more...)

2501.13992

Country:

Oceania > Australia > Victoria (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.54)

WIRED Jan-22-2025, 17:00:08 GMT

This New AI Search Engine Has a Gimmick: Humans Answering Questions

When online search engines first appeared, they seemed miraculous. It is a truth near-universally acknowledged that search is in the dumps, corroded by spam and ads. Big players like Google are insistent that AI is the savior of search, despite many early attempts to integrate AI ending in disaster. Recently, I got an email promoting another new AI search engine--but this one has a notably quirky approach to answering questions. Called Pearl, it's coming out of beta this week.

information retrieval, machine learning, natural language, (11 more...)

WIRED

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

Neural Information Processing Systems Jan-22-2025, 02:00:38 GMT

Reviews: Push-pull Feedback Implements Hierarchical Information Retrieval Efficiently

Update: I apologize for my confusion about the dynamics. I feel more positively now about this work, and have increased my score. There are two issues here to be addressed: a) how realistic is it for the dynamics of the feedforward pass recurrence within layers to run to convergence *before* sending down the top-down feedback? What happens if these are concurrent processes, such that units get both bottom-up and top-down inputs at the same time? Given the time-scale of the recurrent dynamics in cortex, the authors could then ask (in their model) whether this delay is "enough" for their push-pull mechanism to work. If yes, that would strengthen the result a fair bit.

hierarchical rnn, implement hierarchical information retrieval efficiently, natural image

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

Neural Information Processing Systems Jan-22-2025, 02:00:26 GMT

Reviews: Push-pull Feedback Implements Hierarchical Information Retrieval Efficiently

The manuscript studies the role of feedback connections in pattern retrieval networks. This is done in a hierarchical Hopfield-type network. Then, different types of top-down feedback are investigated. All reviewers found the results interesting. Besides its relevance to modelling of biology, it is also of potential interest for technical applications in hierarchical information retrieval.

implement hierarchical information retrieval efficiently, manuscript, potential interest

Industry: Health & Medicine > Therapeutic Area > Neurology (0.30)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.68)

Neural Information Processing Systems Jan-21-2025, 10:31:37 GMT

Reviews: DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

The writing could be improved, but it's in general understandable. However, citation quality can be improved. In particular, it seems to me that NSG and HNSW are actually using the same pruning rule (which results in an approximate relative neighborhood graph). I really like your updated version, which reduces the number of hops (and I haven't seen this pruning variant before)! Detailed comments: Abstract and further: base points sounds like a strange term, do you mean domain points? Please, find a more specific-generic citation that describes this phenomenon.
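The review does not spell out the pruning rule. The sketch below shows the rule as it is commonly described: a candidate is kept only if it is closer to the base point than to every neighbor already kept. The `alpha` parameter is an assumption that models a relaxed variant, where values above 1 keep more long-range edges. The function name and toy points are hypothetical.

```python
import math

def prune(base, candidates, points, max_degree, alpha=1.0):
    """Keep a candidate only if, for every neighbor kept so far,
    alpha * d(kept, candidate) exceeds d(base, candidate)."""
    kept = []
    by_distance = sorted(candidates,
                         key=lambda c: math.dist(points[base], points[c]))
    for c in by_distance:
        if len(kept) >= max_degree:
            break
        d_base = math.dist(points[base], points[c])
        if all(alpha * math.dist(points[k], points[c]) > d_base for k in kept):
            kept.append(c)
    return kept

points = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (0.0, 1.5)}
strict = prune(0, [1, 2, 3], points, max_degree=3)
relaxed = prune(0, [1, 2, 3], points, max_degree=3, alpha=2.5)
```

With `alpha=1.0`, point 2 is dropped because point 1 already covers that direction. With `alpha=2.5`, point 2 survives as a longer edge, which is the kind of shortcut that can reduce the number of hops during search.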

accurate billion-point nearest neighbor search, relative neighborhood graph, single node, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.53)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.44)

Neural Information Processing Systems Jan-21-2025, 10:31:26 GMT

Reviews: DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

In post-rebuttal discussions, reviewers concurred that the paper presents a solid state-of-the-art implementation and very impressive results, which will have good impact for practitioners. This significant impact by itself was worthy of publication.

accurate billion-point nearest neighbor search, diskann, single node

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)