AITopics | irrelevant question

Collaborating Authors

irrelevant question

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Methodological Framework for Quantifying Semantic Test Coverage in RAG Systems

Broestl, Noah, Abdalla, Adel Nasser, Bale, Rajprakash, Gupta, Hersh, Struever, Max

arXiv.org Artificial IntelligenceOct-2-2025

Reliably determining the performance of Retrieval-Augmented Generation (RAG) systems depends on comprehensive test questions. While a proliferation of evaluation frameworks for LLM-powered applications exists, current practices lack a systematic method to ensure these test sets adequately cover the underlying knowledge base, leaving developers with significant blind spots. To address this, we present a novel, applied methodology to quantify the semantic coverage of RAG test questions against their underlying documents. Our approach leverages existing technologies, including vector embeddings and clustering algorithms, to create a practical framework for validating test comprehensiveness. Our methodology embeds document chunks and test questions into a unified vector space, enabling the calculation of multiple coverage metrics: basic proximity, content-weighted coverage, and multi-topic question coverage. Furthermore, we incorporate outlier detection to filter irrelevant questions, allowing for the refinement of test sets. Experimental evidence from two distinct use cases demonstrates that our framework effectively quantifies test coverage, identifies specific content areas with inadequate representation, and provides concrete recommendations for generating new, high-value test questions. This work provides RAG developers with essential tools to build more robust test suites, thereby improving system reliability and extending to applications such as identifying misaligned documents.

data mining, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2510.00001

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

SOrT-ing VQA Models : Contrastive Gradient Learning for Improved Consistency

Dharur, Sameer, Tendulkar, Purva, Batra, Dhruv, Parikh, Devi, Selvaraju, Ramprasaath R.

arXiv.org Artificial IntelligenceOct-20-2020

Recent research in Visual Question Answering (VQA) has revealed state-of-the-art models to be inconsistent in their understanding of the world -- they answer seemingly difficult questions requiring reasoning correctly but get simpler associated sub-questions wrong. These sub-questions pertain to lower level visual concepts in the image that models ideally should understand to be able to answer the higher level question correctly. To address this, we first present a gradient-based interpretability approach to determine the questions most strongly correlated with the reasoning question on an image, and use this to evaluate VQA models on their ability to identify the relevant sub-questions needed to answer a reasoning question. Next, we propose a contrastive gradient learning based approach called Sub-question Oriented Tuning (SOrT) which encourages models to rank relevant sub-questions higher than irrelevant questions for an <$image, reasoning-question$> pair. We show that SOrT improves model consistency by upto 6.5% points over existing baselines, while also improving visual grounding.

artificial intelligence, machine learning, reasoning question, (17 more...)

arXiv.org Artificial Intelligence

2010.10038

Country:

North America > United States > California > San Francisco County > San Francisco (0.05)
Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.04)
North America > United States > New York (0.04)
(8 more...)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A Financial Service Chatbot based on Deep Bidirectional Transformers

Yu, Shi, Chen, Yuxin, Zaidi, Hussain

arXiv.org Machine LearningFeb-17-2020

We develop a chatbot using Deep Bidirectional Transformer models (BERT) to handle client questions in financial investment customer service. The bot can recognize 381 intents, and decides when to say "I don't know" and escalates irrelevant/uncertain questions to human operators. Our main novel contribution is the discussion about uncertainty measure for BERT, where three different approaches are systematically compared on real problems. We investigated two uncertainty metrics, information entropy and variance of dropout sampling in BERT, followed by mixed-integer programming to optimize decision thresholds. Another novel contribution is the usage of BERT as a language model in automatic spelling correction. Inputs with accidental spelling errors can significantly decrease intent classification performance. The proposed approach combines probabilities from masked language model and word edit distances to find the best corrections for misspelled words. The chatbot and the entire conversational AI system are developed using open-source tools, and deployed within our company's intranet. The proposed approach can be useful for industries seeking similar in-house solutions in their specific business domains. We share all our code and a sample chatbot built on a public dataset on Github.

financial service chatbot, irrelevant question, threshold, (11 more...)

arXiv.org Machine Learning

2003.04987

Country:

Oceania > Australia (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.47)
Banking & Finance > Financial Services (0.42)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback