Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering (Appendix)

Neural Information Processing Systems

We chose the Google Search corpus [Luo et al., 2021] for our question-answering system because it provides good coverage of the required knowledge and is publicly available. Even so, it is advised to conduct an ethical review prior to deploying the system in a live service. Table 1 shows the data statistics of the OK-VQA dataset. We build a DPR retriever as a baseline for FLMR; inner-product search (supported by FAISS [Johnson et al., 2019]) is used in both training and retrieval. In answer generation, we use t5-large and Salesforce/blip2-flan-t5-xl. (First authors contributed equally; 37th Conference on Neural Information Processing Systems, NeurIPS 2023.)
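The two retrieval styles in play here, DPR's single-vector inner product (the operation FAISS accelerates) and fine-grained late interaction as in FLMR, can be sketched with toy NumPy embeddings. All dimensions, embeddings, and scores below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 4 documents, 8-dim embeddings (illustrative, not the paper's config).
num_docs, dim, q_tokens, d_tokens = 4, 8, 3, 5
doc_vecs = rng.normal(size=(num_docs, dim))   # DPR: one vector per document
query_vec = rng.normal(size=(dim,))

# DPR-style retrieval: a single inner product per document
# (this is what a FAISS inner-product index computes at scale).
dpr_scores = doc_vecs @ query_vec
dpr_best = int(np.argmax(dpr_scores))

# Late-interaction scoring (ColBERT/FLMR-style): token-level embeddings,
# max similarity over document tokens, summed over query tokens (MaxSim).
query_toks = rng.normal(size=(q_tokens, dim))
doc_toks = rng.normal(size=(num_docs, d_tokens, dim))
sim = np.einsum('qd,ntd->nqt', query_toks, doc_toks)  # (docs, q_tok, d_tok)
li_scores = sim.max(axis=2).sum(axis=1)               # MaxSim, then sum
li_best = int(np.argmax(li_scores))

print(dpr_best, li_best)
```

The contrast is the point: DPR compresses each side to one vector before a single dot product, while late interaction keeps per-token vectors and aggregates token-level similarities.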







Optimal spectral initializers impact on phase retrieval phase transitions -- an RDT view

Stojnic, Mihailo

arXiv.org Machine Learning

We analyze the relation between spectral initializers and the theoretical limits of \emph{descending} phase retrieval algorithms (dPR). In the companion paper [104], a \emph{parametric manifold}, ${\mathcal {PM}}(α)$, is recognized, for any sample complexity ratio $α$, as a critically important structure that generically determines dPR's ability to solve phase retrieval (PR). Moreover, the overlap between the algorithmic solution and the true signal is positioned as a key component of ${\mathcal {PM}}$. We here consider the so-called \emph{overlap optimal} spectral initializers (OptSpins) as dPR's starting points and develop a generic \emph{Random duality theory} (RDT) based program to statistically characterize them. In particular, we determine the functional structure of OptSpins and evaluate the starting overlaps that they provide for the dPRs. Since ${\mathcal {PM}}$'s so-called \emph{flat regions} are highly susceptible to \emph{local jitteriness} and as such are key obstacles on dPR's path towards PR's global optimum, a precise characterization of the starting overlap allows one to determine whether such regions can be successfully circumvented. Through the presented theoretical analysis we observe two key points in that regard: \textbf{\emph{(i)}} dPR's theoretical phase transition (the critical $α$ above which it solves PR) might be difficult to achieve in practice, as the ${\mathcal {PM}}$'s flat regions are large, causing the associated OptSpins to fall exactly within them; and \textbf{\emph{(ii)}} opting for so-called ``\emph{safer compression}'' and slightly increasing $α$ (by, say, $15\%$) shrinks the flat regions, allowing OptSpins to fall outside them and dPRs to ultimately solve PR. Numerical simulations are conducted as well and shown to be in excellent agreement with the theoretical predictions.
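The role of a spectral initializer and its starting overlap can be illustrated with the classical construction: take the leading eigenvector of a measurement-weighted matrix and measure its alignment with the true signal. This toy sketch uses the plain $y_i^2$ weighting; the paper's OptSpins use an optimally tuned preprocessing function, and the dimensions and $α$ below are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy instance: n-dim signal, m = alpha * n phaseless measurements
# (alpha = 4.0 is an arbitrary illustrative sample complexity ratio).
n, alpha = 50, 4.0
m = int(alpha * n)
x = rng.normal(size=n)
x /= np.linalg.norm(x)                     # unit-norm true signal
A = rng.normal(size=(m, n)) / np.sqrt(n)
y = np.abs(A @ x)                          # phaseless measurements y_i = |a_i^T x|

# Classical spectral initializer: leading eigenvector of
# D = (1/m) * sum_i y_i^2 a_i a_i^T.
D = (A.T * (y ** 2)) @ A / m
eigvals, eigvecs = np.linalg.eigh(D)
x0 = eigvecs[:, -1]                        # top eigenvector as starting point

# Starting overlap: |<x0, x>|, the quantity the paper characterizes for OptSpins.
overlap = abs(x0 @ x)
print(round(overlap, 3))
```

A descending algorithm started from `x0` succeeds or stalls depending on whether this overlap lands inside or outside the manifold's flat regions, which is precisely the question the RDT characterization addresses.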


Experimental Study on Automatically Assembling Custom Catering Packages With a 3-DOF Delta Robot Using Deep Learning Methods

Yourdkhani, Reihaneh, Tavoosian, Arash, Khomami, Navid Asadi, Masouleh, Mehdi Tale

arXiv.org Artificial Intelligence

This paper introduces a pioneering experimental study on the automated packing of a catering package using a two-fingered gripper affixed to a 3-degree-of-freedom Delta parallel robot. A distinctive contribution lies in the application of a deep learning approach to this challenge. A custom dataset of 1,500 images is curated for this purpose, a noteworthy initiative as the first dataset focusing on Persian-manufactured products. The study employs the YOLOv5 model for object detection, followed by segmentation using the FastSAM model. The rotation angle is then calculated from the segmentation masks, and a rotated rectangle encapsulating the object is generated. This rectangle forms the basis for calculating two grasp points using a novel geometrical approach involving eigenvectors. An extensive experimental study validates the proposed model, with all pertinent information transmitted to the 3-DOF Delta parallel robot. The proposed algorithm provides real-time detection, calibration, and a fully autonomous packing process for a catering package, achieving an over-80\% success rate in automatic grasping. This study marks a significant stride in advancing the capabilities of robotic systems for practical applications in packaging automation.
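The eigenvector-based grasp-point idea can be sketched as PCA on the segmentation mask's pixel coordinates: the covariance eigenvectors give the object's principal and minor axes, and two grasp points are placed on opposite sides along the minor axis. The synthetic mask and the half-width heuristic below are illustrative assumptions, not the authors' exact pipeline (which takes its masks from FastSAM).

```python
import numpy as np

# Toy binary mask of an elongated object (a 10x30 horizontal rectangle);
# in the paper, the mask would come from the FastSAM segmentation step.
mask = np.zeros((40, 40), dtype=bool)
mask[15:25, 5:35] = True

ys, xs = np.nonzero(mask)
pts = np.stack([xs, ys], axis=1).astype(float)
center = pts.mean(axis=0)

# PCA on pixel coordinates: eigenvectors of the covariance matrix give the
# object's long (major) and short (minor) axes.
cov = np.cov((pts - center).T)
eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues ascending
minor_axis = eigvecs[:, 0]                 # smallest-variance direction
half_width = 2.0 * np.sqrt(eigvals[0])     # ~2 sigma reach (illustrative choice)

# Two grasp points on opposite sides of the object, along the minor axis,
# plus the rotation angle of the major axis for the rotated rectangle.
g1 = center + half_width * minor_axis
g2 = center - half_width * minor_axis
rotation_deg = np.degrees(np.arctan2(eigvecs[1, 1], eigvecs[0, 1]))
print(g1, g2)
```

For this horizontal rectangle the minor axis is vertical, so the two grasp points sit above and below the centroid, which is where a two-fingered gripper would close on the object's short side.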


DPR: Diffusion Preference-based Reward for Offline Reinforcement Learning

Pang, Teng, Wang, Bingzheng, Wu, Guoqiang, Yin, Yilong

arXiv.org Artificial Intelligence

Offline preference-based reinforcement learning (PbRL) mitigates the need for reward definition, aligning with human preferences via preference-driven reward feedback without interacting with the environment. However, the effectiveness of preference-driven reward functions depends on the modeling ability of the learning model, which current MLP-based and Transformer-based methods may fail to adequately provide. To alleviate reward-function failures caused by insufficient modeling, we propose a novel preference-based reward acquisition method: Diffusion Preference-based Reward (DPR). Unlike previous methods that use Bradley-Terry models for trajectory preferences, we use diffusion models to directly model preference distributions over state-action pairs, allowing rewards to be discriminatively obtained from these distributions. In addition, since preference data encode only the relative relationship within each pair of trajectories, we further propose Conditional Diffusion Preference-based Reward (C-DPR), which leverages relative preference information to enhance the construction of the diffusion model. We apply the above methods to existing offline reinforcement learning algorithms, and a series of experimental results demonstrates that the diffusion-based reward acquisition approach outperforms previous MLP-based and Transformer-based methods.
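For context, the Bradley-Terry trajectory model that DPR moves away from scores a preference as a softmax over summed per-step rewards. A minimal sketch, with made-up reward numbers for illustration:

```python
import math

def bt_preference(rewards_tau1, rewards_tau2):
    """P(tau1 preferred over tau2) under the Bradley-Terry model,
    applied to trajectory returns (sums of per-step rewards)."""
    r1, r2 = sum(rewards_tau1), sum(rewards_tau2)
    # Subtract the max before exponentiating for numerical stability.
    m = max(r1, r2)
    e1, e2 = math.exp(r1 - m), math.exp(r2 - m)
    return e1 / (e1 + e2)

# Illustrative per-step rewards for two trajectory segments.
p = bt_preference([1.0, 0.5, 0.2], [0.3, 0.4, 0.1])
print(round(p, 3))   # sigmoid of the return gap, here sigmoid(0.9)
```

Under this model the preference probability depends only on the return gap between the two trajectories; DPR's diffusion modeling of state-action preference distributions is proposed precisely because this scalar-gap formulation can under-use the structure in the preference data.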


From Retrieval to Generation: Comparing Different Approaches

Abdallah, Abdelrahman, Mozafari, Jamshid, Piryani, Bhawna, Ali, Mohammed, Jatowt, Adam

arXiv.org Artificial Intelligence

Knowledge-intensive tasks, particularly open-domain question answering (ODQA), document reranking, and retrieval-augmented language modeling, require a balance between retrieval accuracy and generative flexibility. Traditional retrieval models such as BM25 and Dense Passage Retrieval (DPR) efficiently retrieve from large corpora but often lack semantic depth. Generative models like GPT-4o provide richer contextual understanding but face challenges in maintaining factual consistency. In this work, we conduct a systematic evaluation of retrieval-based, generation-based, and hybrid models, with a primary focus on their performance in ODQA and related retrieval-augmented tasks. Our results show that dense retrievers, particularly DPR, achieve strong performance in ODQA with a top-1 accuracy of 50.17\% on NQ, while hybrid models improve nDCG@10 scores on BEIR from 43.42 (BM25) to 52.59, demonstrating their strength in document reranking. Additionally, we analyze language modeling tasks using WikiText-103, showing that retrieval-based approaches like BM25 achieve lower perplexity compared to generative and hybrid methods, highlighting their utility in retrieval-augmented generation. By providing detailed comparisons and practical insights into the conditions where each approach excels, we aim to facilitate future optimizations in retrieval, reranking, and generative models for ODQA and related knowledge-intensive applications.
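The sparse baseline in this comparison, BM25, can be sketched in a few lines: a term-frequency score with document-length normalization and an inverse-document-frequency weight. The corpus below is invented for illustration (not drawn from NQ or BEIR), and `k1`/`b` are the commonly used defaults.

```python
import math
from collections import Counter

# Toy corpus of tokenized documents (illustrative, not from any benchmark).
docs = [
    "dense passage retrieval for open domain question answering".split(),
    "bm25 is a classic sparse retrieval baseline".split(),
    "language models for retrieval augmented generation".split(),
]

N = len(docs)
avgdl = sum(len(d) for d in docs) / N
df = Counter(t for d in docs for t in set(d))   # document frequencies

def bm25(query, doc, k1=1.5, b=0.75):
    """Okapi BM25 score of one document for a tokenized query."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if term not in tf:
            continue
        idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
        denom = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * tf[term] * (k1 + 1) / denom
    return score

query = "sparse retrieval baseline".split()
ranked = sorted(range(N), key=lambda i: bm25(query, docs[i]), reverse=True)
print(ranked[0])   # index of the top-ranked document
```

Exact-match scoring like this is what makes BM25 fast and robust, and also what limits its semantic depth: a query term absent from a document contributes nothing, which is the gap dense retrievers such as DPR are meant to close.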