Goto

Collaborating Authors

 Information Retrieval


A Comprehensive Study on Fine-Tuning Large Language Models for Medical Question Answering Using Classification Models and Comparative Analysis

arXiv.org Artificial Intelligence

This paper presents the overview of the development and fine-tuning of large language models (LLMs) designed specifically for answering medical questions. We are mainly improving the accuracy and efficiency of providing reliable answers to medical queries. In our approach, we have two stages, prediction of a specific label for the received medical question and then providing a predefined answer for this label. Various models such as RoBERTa and BERT were examined and evaluated based on their ability. The models are trained using the datasets derived from 6,800 samples that were scraped from Healthline. com with additional synthetic data. For evaluation, we conducted a comparative study using 5-fold cross-validation. For accessing performance we used metrics like, accuracy, precision, recall, and F1 score and also recorded the training time. The performance of the models was evaluated using 5-fold cross-validation. The LoRA Roberta-large model achieved an accuracy of 78.47%, precision of 72.91%, recall of 76.95%, and an F1 score of 73.56%. The Roberta-base model demonstrated high performance with an accuracy of 99.87%, precision of 99.81%, recall of 99.86%, and an F1 score of 99.82%. The Bert Uncased model showed strong results with an accuracy of 95.85%, precision of 94.42%, recall of 95.58%, and an F1 score of 94.72%. Lastly, the Bert Large Uncased model achieved the highest performance, with an accuracy, precision, recall, and F1 score of 100%. The results obtained have helped indicate the capability of the models in classifying the medical questions and generating accurate answers in the prescription of improved health-related AI solutions.


Query-based versus resource-based cache strategies in tag-based browsing systems

arXiv.org Artificial Intelligence

Tag-based browsing is a popular interaction model for navigating digital libraries. According to this model, users select descriptive tags to filter resources in the collections. Typical implementations of the model are based on inverted indexes. However, these implementations can require a considerable amount of set operations to update the browsing state. To palliate this inconven-ience, it is possible to adopt suitable cache strategies. In this paper we describe and compare two of these strategies: (i) a query-based strategy, according to which previously computed browsing states are indexed by sets of selected tags; and (ii) a resource-based strategy, according to which browsing states are in-dexed by sets of filtered resources. Our comparison focused on runtime perfor-mance, and was carried out empirically, using a real-world web-based collec-tion in the field of digital humanities. The results obtained show that the re-source-based strategy clearly outperforms the query-based one.


HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory

Neural Information Processing Systems

The state-of-the-art approximate nearest neighbor search (ANNS) algorithms face a fundamental tradeoff between query latency and accuracy, because of small main memory capacity: To store indices in main memory for short query latency, the ANNS algorithms have to limit dataset size or use a quantization scheme which hurts search accuracy. The emergence of heterogeneous memory (HM) brings a solution to significantly increase memory capacity and break the above tradeoff: Using HM, billions of data points can be placed in the main memory on a single machine without using any data compression. However, HM consists of both fast (but small) memory and slow (but large) memory, and using HM inappropriately slows down query significantly. In this work, we present a novel graph-based similarity search algorithm called HM-ANN, which takes both memory and data heterogeneity into consideration and enables billion-scale similarity search on a single node without using compression. On two billion-sized datasets BIGANN and DEEP1B, HM-ANN outperforms state-of-the-art compression-based solutions such as L&C and IMI OPQ in recall-vs-latency by a large margin, obtaining 46% higher recall under the same search latency. We also extend existing graph-based methods such as HNSW and NSG with two strong baseline implementations on HM.


Review for NeurIPS paper: Optimal Query Complexity of Secure Stochastic Convex Optimization

Neural Information Processing Systems

In terms of the presentation, the paper constantly switches in terminology, both referring to "secure" and "private." Given that by now differential privacy has been established as a notion of formal privacy, I believe it is better to refer to this problem to "secure" optimization. For example, in page 2, line 49, classic bounds are referred as "non-private." I think it would be better to exclusively refer to "secure" in this definitions. However, they are clearly vacuous for the classical "non-secure" case.


Review for NeurIPS paper: Optimal Query Complexity of Secure Stochastic Convex Optimization

Neural Information Processing Systems

This paper was carefully reviewed and discussed by our reviewer panel. The consensus was that this is nice work, the rebuttal had some sway, and the paper can be published in NeurIPS this year. But please do take into account the detailed comments of the reviewers when putting together your camera-ready version.


Reviews: Hierarchical Optimal Transport for Document Representation

Neural Information Processing Systems

This paper proposes a distance metric for documents. The proposed solution is to combine latent topics from topic models with the idea of using geometry from word embeddings to compute distances between pairs of documents (as in the WMD metric). First topics are computed, and WMD is performed at the topic level as opposed to the word level. The hypothesis presented is that modeling documents by their representative topics is better for highlighting differences despite the loss in resolution and is similar to how a person would do this task: breaking down each document into concepts, and then comparing the concepts. Since the topics are precomputed for a given corpus, speed up is gained at inference time when computing document similarities.


Chain-of-Retrieval Augmented Generation

arXiv.org Artificial Intelligence

This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Conventional RAG methods usually perform a single retrieval step before the generation process, which limits their effectiveness in addressing complex queries due to imperfect retrieval results. In contrast, our proposed method, CoRAG (Chain-of-Retrieval Augmented Generation), allows the model to dynamically reformulate the query based on the evolving state. To train CoRAG effectively, we utilize rejection sampling to automatically generate intermediate retrieval chains, thereby augmenting existing RAG datasets that only provide the correct final answer. At test time, we propose various decoding strategies to scale the model's test-time compute by controlling the length and number of sampled retrieval chains. Experimental results across multiple benchmarks validate the efficacy of CoRAG, particularly in multi-hop question answering tasks, where we observe more than 10 points improvement in EM score compared to strong baselines. On the KILT benchmark, CoRAG establishes a new state-of-the-art performance across a diverse range of knowledge-intensive tasks. Furthermore, we offer comprehensive analyses to understand the scaling behavior of CoRAG, laying the groundwork for future research aimed at developing factual and grounded foundation models.


Interpretability Analysis of Domain Adapted Dense Retrievers

arXiv.org Artificial Intelligence

Dense retrievers have demonstrated significant potential for neural information retrieval; however, they exhibit a lack of robustness to domain shifts, thereby limiting their efficacy in zero-shot settings across diverse domains. Previous research has investigated unsupervised domain adaptation techniques to adapt dense retrievers to target domains. However, these studies have not focused on explainability analysis to understand how such adaptations alter the model's behavior. In this paper, we propose utilizing the integrated gradients framework to develop an interpretability method that provides both instance-based and ranking-based explanations for dense retrievers. To generate these explanations, we introduce a novel baseline that reveals both query and document attributions. This method is used to analyze the effects of domain adaptation on input attributions for query and document tokens across two datasets: the financial question answering dataset (FIQA) and the biomedical information retrieval dataset (TREC-COVID). Our visualizations reveal that domain-adapted models focus more on in-domain terminology compared to non-adapted models, exemplified by terms such as "hedge," "gold," "corona," and "disease." This research addresses how unsupervised domain adaptation techniques influence the behavior of dense retrievers when adapted to new domains. Additionally, we demonstrate that integrated gradients are a viable choice for explaining and analyzing the internal mechanisms of these opaque neural models.


Reviews: Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

Neural Information Processing Systems

The main problem for me is that the paper promises a very real scenario (Figure 1) of how a user can refine search by using a sequence of refined queries. However, majority of the model design and evaluation (except section 4.2) is performed with dense region captions that have almost no sequential nature. While this is partially a strength as no additional labels are required, the method seems suited especially towards such disconnected queries -- there is space for M disconnected queries and only then updates are required. This would provide a deeper understanding of when the proposed method works better. In Figure 1, the user queries seem very natural, but the simulated queries in Figure 1 are not.


Reviews: Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

Neural Information Processing Systems

This paper investigates the problem of multi-round natural language image retrieval, using annotations from the Visual Genome dataset for training and evaluation. After feedback and reviewer discussion, this paper received final ratings of 6, 6 and 7. Despite some concerns about the use of non-sequential annotation data for a sequential task, the reviewers found the proposed model to be generally sound and the experimental evaluation convincing, and the AC agrees. However, we would encourage the authors to pay close attention to the reviewer feedback when preparing the final paper version. In particular, the author feedback committed to including the additional baselines requested by R1, so these should be included in the final version as promised.