AITopics | gdr

gdr

5b8e9841e87fb8fc590434f5d933c92c-Paper.pdf

Neural Information Processing Systems, Feb-8-2026, 13:16:52 GMT

data mining, machine learning, prediction, (16 more...)

Country:

North America > United States > Illinois (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
North America > Canada (0.04)
(3 more...)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

The Devil is in the Detail: A Framework for Macroscopic Prediction via Microscopic Models

Neural Information Processing Systems, Oct-3-2025, 00:17:29 GMT

Macroscopic data aggregated from microscopic events are pervasive in machine learning, such as country-level COVID-19 infection statistics based on city-level data.

data mining, machine learning, prediction, (16 more...)

Country: North America > United States (0.47)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.88)
Health & Medicine > Therapeutic Area > Immunology (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

Response to Reviewer 1: 3

Neural Information Processing Systems, Oct-3-2025, 00:17:17 GMT

We thank all reviewers for their comments and acknowledgement of our contribution. Below we address each reviewer's comments separately. The reviewer raised a very good point. We will add this clarification in the revised version. Our gradient-based method is much more efficient but only finds a stationary point.

artificial intelligence, machine learning, reviewer, (8 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.37)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.36)

RALLM-POI: Retrieval-Augmented LLM for Zero-shot Next POI Recommendation with Geographical Reranking

Li, Kunrong, Lim, Kwan Hui

arXiv.org Artificial Intelligence, Sep-23-2025

Next point-of-interest (POI) recommendation predicts a user's next destination from historical movements. Traditional models require intensive training, while LLMs offer flexible and generalizable zero-shot solutions but often generate generic or geographically irrelevant results due to missing trajectory and spatial context. To address these issues, we propose RALLM-POI, a framework that couples LLMs with retrieval-augmented generation and self-rectification. We first propose a Historical Trajectory Retriever (HTR) that retrieves relevant past trajectories to serve as contextual references, which are then reranked by a Geographical Distance Reranker (GDR) for prioritizing spatially relevant trajectories. Lastly, an Agentic LLM Rectifier (ALR) is designed to refine outputs through self-reflection. Without additional training, RALLM-POI achieves substantial accuracy gains across three real-world Foursquare datasets, outperforming both conventional and LLM-based baselines.
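
The reranking step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which distance measure GDR uses, so the haversine distance is an assumption, and `haversine_km`, `rerank_by_distance`, and the single anchor coordinate per trajectory are hypothetical names and simplifications introduced here.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km: mean Earth radius

def rerank_by_distance(current_location, retrieved, top_k=3):
    """Order retrieved trajectories by distance from the user's current location.

    `retrieved` is a list of (trajectory_id, (lat, lon)) pairs. Reducing each
    trajectory to one anchor coordinate is an assumption made for this sketch.
    """
    ranked = sorted(retrieved, key=lambda item: haversine_km(current_location, item[1]))
    return [traj_id for traj_id, _ in ranked[:top_k]]

# Hypothetical retrieved trajectories, each reduced to one anchor point.
candidates = [
    ("traj_far", (35.6762, 139.6503)),   # Tokyo
    ("traj_near", (1.3000, 103.8000)),   # a few kilometres from the query point
    ("traj_mid", (3.1390, 101.6869)),    # Kuala Lumpur
]
print(rerank_by_distance((1.3521, 103.8198), candidates, top_k=2))
```

In a full pipeline the reranked trajectories would then be placed in the LLM prompt as context; that part is omitted here.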

large language model, machine learning, trajectory, (16 more...)

2509.17066

Genre: Research Report (0.40)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Generative Data Refinement: Just Ask for Better Data

Jiang, Minqi, Araújo, João G. M., Ellsworth, Will, Gooding, Sian, Grefenstette, Edward

arXiv.org Artificial Intelligence, Sep-12-2025

For a fixed parameter size, the capabilities of large models are primarily determined by the quality and quantity of their training data. Consequently, training datasets now grow faster than the rate at which new data is indexed on the web, leading to projected data exhaustion over the next decade. Much more data exists as user-generated content that is not publicly indexed, but incorporating such data comes with considerable risks, such as leaking private information and other undesirable content. We introduce a framework, Generative Data Refinement (GDR), for using pretrained generative models to transform a dataset with undesirable content into a refined dataset that is more suitable for training. Our experiments show that GDR can outperform industry-grade solutions for dataset anonymization, as well as enable direct detoxification of highly unsafe datasets. Moreover, we show that by generating synthetic data that is conditioned on each example in the real dataset, GDR's refined outputs naturally match the diversity of web scale datasets, and thereby avoid the often challenging task of generating diverse synthetic data via model prompting. The simplicity and effectiveness of GDR make it a powerful tool for scaling up the total stock of training data for frontier models.

large language model, machine learning, natural language, (20 more...)

2509.08653

Genre: Research Report (0.64)

Industry: Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

Bottleneck-Minimal Indexing for Generative Document Retrieval

Du, Xin, Xiu, Lixin, Tanaka-Ishii, Kumiko

arXiv.org Artificial Intelligence, May-20-2024

We apply an information-theoretic perspective to reconsider generative document retrieval (GDR), in which a document $x \in X$ is indexed by $t \in T$, and a neural autoregressive model is trained to map queries $Q$ to $T$. GDR can be considered to involve information transmission from documents $X$ to queries $Q$, with the requirement to transmit more bits via the indexes $T$. By applying Shannon's rate-distortion theory, the optimality of indexing can be analyzed in terms of the mutual information, and the design of the indexes $T$ can then be regarded as a bottleneck in GDR. After reformulating GDR from this perspective, we empirically quantify the bottleneck underlying GDR. Finally, using the NQ320K and MARCO datasets, we evaluate our proposed bottleneck-minimal indexing method in comparison with various previous indexing methods, and we show that it outperforms those methods.
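
For orientation, the generic information bottleneck objective can be written with the abstract's symbols as follows. This is the standard textbook form, offered only as a sketch: the trade-off weight $\beta$ and the stochastic indexing map $p(t \mid x)$ are notation assumed here, and the paper's exact formulation may differ.

$$
\min_{p(t \mid x)} \; I(X; T) \;-\; \beta \, I(T; Q), \qquad \beta > 0
$$

Here $I(X; T)$ measures how many bits the index retains about a document, and $I(T; Q)$ measures how informative the index is about the queries that should retrieve it.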

bottleneck-minimal indexing, indexing method, query, (12 more...)

2405.10974

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Generative Dense Retrieval: Memory Can Be a Burden

Yuan, Peiwen, Wang, Xinglin, Feng, Shaoxiong, Pan, Boyuan, Li, Yiwei, Wang, Heda, Miao, Xupeng, Li, Kan

arXiv.org Artificial Intelligence, Jan-18-2024

Generative Retrieval (GR), autoregressively decoding relevant document identifiers given a query, has been shown to perform well under the setting of small-scale corpora. By memorizing the document corpus with model parameters, GR implicitly achieves deep interaction between query and document. However, such a memorizing mechanism faces three drawbacks: (1) Poor memory accuracy for fine-grained features of documents; (2) Memory confusion gets worse as the corpus size increases; (3) Huge memory update costs for new documents. To alleviate these problems, we propose the Generative Dense Retrieval (GDR) paradigm. Specifically, GDR first uses the limited memory volume to achieve inter-cluster matching from query to relevant document clusters. A memorization-free matching mechanism from Dense Retrieval (DR) is then introduced to conduct fine-grained intra-cluster matching from clusters to relevant documents. The coarse-to-fine process maximizes the advantages of GR's deep interaction and DR's scalability. Besides, we design a cluster identifier constructing strategy to facilitate corpus memory and a cluster-adaptive negative sampling strategy to enhance the intra-cluster mapping ability. Empirical results show that GDR obtains an average of 3.0 R@100 improvement on the NQ dataset under multiple settings and has better scalability.
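
The coarse-to-fine idea in this abstract can be illustrated with a small sketch. It is not the paper's method: the first stage here scores cluster centroids with a dot product as a stand-in for autoregressively decoding cluster identifiers, and `coarse_to_fine_search`, the toy vectors, and the cluster names are all hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def coarse_to_fine_search(query, clusters, n_clusters=1, top_k=2):
    """Two-stage retrieval over `clusters`.

    `clusters` maps cluster_id -> {"centroid": vector, "docs": {doc_id: vector}}.
    """
    # Stage 1 (coarse): keep the clusters whose centroids best match the query.
    # Stand-in for the generative step that would decode cluster identifiers.
    best = sorted(
        clusters,
        key=lambda c: dot(query, clusters[c]["centroid"]),
        reverse=True,
    )[:n_clusters]

    # Stage 2 (fine): dense scoring, restricted to documents in the kept clusters.
    scored = [
        (dot(query, vec), doc_id)
        for c in best
        for doc_id, vec in clusters[c]["docs"].items()
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]

# Toy two-dimensional embeddings, invented for illustration.
clusters = {
    "sports": {"centroid": [1.0, 0.0], "docs": {"d1": [0.9, 0.1], "d2": [0.7, 0.3]}},
    "cooking": {"centroid": [0.0, 1.0], "docs": {"d3": [0.1, 0.9], "d4": [0.2, 0.8]}},
}
print(coarse_to_fine_search([0.8, 0.2], clusters))
```

Because the fine stage only touches documents inside the selected clusters, adding a document means adding one vector to a cluster, which hints at why the abstract credits the dense stage with better scalability.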

gdr, identifier, retrieval, (15 more...)

2401.10487

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria (0.04)
(8 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Synthesising Recursive Functions for First-Order Model Counting: Challenges, Progress, and Conjectures

Dilkas, Paulius, Belle, Vaishak

arXiv.org Artificial Intelligence, Jun-7-2023

First-order model counting (FOMC) is a computational problem that asks to count the models of a sentence in finite-domain first-order logic. In this paper, we argue that the capabilities of FOMC algorithms to date are limited by their inability to express many types of recursive computations. To enable such computations, we relax the restrictions that typically accompany domain recursion and generalise the circuits used to express a solution to an FOMC problem to directed graphs that may contain cycles. To this end, we adapt the most well-established (weighted) FOMC algorithm ForcLift to work with such graphs and introduce new compilation rules that can create cycle-inducing edges that encode recursive function calls. These improvements allow the algorithm to find efficient solutions to counting problems that were previously beyond its reach, including those that cannot be solved efficiently by any other exact FOMC algorithm. We end with a few conjectures on what classes of instances could be domain-liftable as a result.
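
To make the problem statement concrete, here is a brute-force sketch that counts the models of one small sentence, $\forall x.\, P(x) \rightarrow Q(x)$, over a domain of size $n$. The sentence and the function name `count_models` are chosen here for illustration; this is plain enumeration, not ForcLift or any lifted algorithm.

```python
from itertools import product

def count_models(n):
    """Count models of  forall x. P(x) -> Q(x)  over a domain of size n by enumeration."""
    domain = range(n)
    count = 0
    # An interpretation fixes a truth value for P(d) and Q(d) for every element d.
    for p in product([False, True], repeat=n):
        for q in product([False, True], repeat=n):
            # The sentence holds if no element satisfies P without satisfying Q.
            if all((not p[d]) or q[d] for d in domain):
                count += 1
    return count

# Each element independently allows 3 of the 4 (P, Q) combinations,
# so the count equals 3**n.
for n in range(4):
    assert count_models(n) == 3 ** n
```

The enumeration visits $4^n$ interpretations, while the closed form $3^n$ is immediate once the per-element symmetry is noticed; finding such domain-size-independent shortcuts automatically is what lifted model counting aims for.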

formula, machine learning, programming language, (19 more...)

2306.04189

Country:

Asia > Singapore > Central Region > Singapore (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

ActUp: Analyzing and Consolidating tSNE and UMAP

Draganov, Andrew, Jørgensen, Jakob Rødsgaard, Nellemann, Katrine Scheel, Mottin, Davide, Assent, Ira, Berry, Tyrus, Aslay, Cigdem

arXiv.org Artificial Intelligence, May-12-2023

tSNE and UMAP are popular dimensionality reduction algorithms due to their speed and interpretable low-dimensional embeddings. Despite their popularity, however, little work has been done to study their full span of differences. We theoretically and experimentally evaluate the space of parameters in both tSNE and UMAP and observe that a single one -- the normalization -- is responsible for switching between them. This, in turn, implies that a majority of the algorithmic differences can be toggled without affecting the embeddings. We discuss the implications this has on several theoretic claims behind UMAP, as well as how to reconcile them with existing tSNE interpretations. Based on our analysis, we provide a method (\ourmethod) that combines previously incompatible techniques from tSNE and UMAP and can replicate the results of either algorithm. This allows our method to incorporate further improvements, such as an acceleration that obtains either method's outputs faster than UMAP. We release improved versions of tSNE, UMAP, and \ourmethod that are fully plug-and-play with the traditional libraries at https://github.com/Andrew-Draganov/GiDR-DUN

artificial intelligence, machine learning, umap, (17 more...)

2305.0732

Country: