AITopics | newsqa

Collaborating Authors

newsqa

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Small Encoders Can Rival Large Decoders in Detecting Groundedness

Abbes, Istabrak, Prato, Gabriele, Fournier, Quentin, Rodriguez, Fernando, Boukhary, Alaa, Elwood, Adam, Chandar, Sarath

arXiv.org Artificial IntelligenceJun-27-2025

Augmenting large language models (LLMs) with external context significantly improves their performance in natural language processing (NLP) tasks. However, LLMs struggle to answer queries reliably when the provided context lacks information, often resorting to ungrounded speculation or internal knowledge. Groundedness - generating responses strictly supported by the context - is essential for ensuring factual consistency and trustworthiness. This study focuses on detecting whether a given query is grounded in a document provided in context before the costly answer generation by LLMs. Such a detection mechanism can significantly reduce both inference time and resource consumption. We show that lightweight, task specific encoder models such as RoBERTa and NomicBERT, fine-tuned on curated datasets, can achieve accuracy comparable to state-of-the-art LLMs, such as Llama3 8B and GPT4o, in groundedness detection while reducing inference latency by orders of magnitude. The code is available at : https://github.com/chandarlab/Hallucinate-less

information, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2506.21288

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.04)
North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Efficient In-Domain Question Answering for Resource-Constrained Environments

Chung, Isaac, Vo, Phat, Kizilkale, Arman C., Reite, Aaron

arXiv.org Artificial IntelligenceOct-17-2024

Retrieval Augmented Generation (RAG) is a common method for integrating external knowledge into pretrained Large Language Models (LLMs) to enhance accuracy and relevancy in question answering (QA) tasks. However, prompt engineering and resource efficiency remain significant bottlenecks in developing optimal and robust RAG solutions for real-world QA applications. Recent studies have shown success in using fine tuning to address these problems; in particular, Retrieval Augmented Fine Tuning (RAFT) applied to smaller 7B models has demonstrated superior performance compared to RAG setups with much larger models such as GPT-3.5. The combination of RAFT with parameter-efficient fine tuning (PEFT) techniques, such as Low-Rank Adaptation (LoRA), promises an even more efficient solution, yet remains an unexplored area. In this work, we combine RAFT with LoRA to reduce fine tuning and storage requirements and gain faster inference times while maintaining comparable RAG performance. This results in a more compute-efficient RAFT, or CRAFT, which is particularly useful for knowledge-intensive QA tasks in resource-constrained environments where internet access may be restricted and hardware resources limited.

adapter, dataset, lora, (17 more...)

arXiv.org Artificial Intelligence

2409.17648

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Add feedback

The Turking Test: Can Language Models Understand Instructions?

Efrat, Avia, Levy, Omer

arXiv.org Artificial IntelligenceOct-22-2020

Supervised machine learning provides the learner with a set of input-output examples of the target task. Humans, however, can also learn to perform new tasks from instructions in natural language. Can machines learn to understand instructions as well? We present the Turking Test, which examines a model's ability to follow natural language instructions of varying complexity. These range from simple tasks, like retrieving the nth word of a sentence, to ones that require creativity, such as generating examples for SNLI and SQuAD in place of human intelligence workers ("turkers"). Despite our lenient evaluation methodology, we observe that a large pretrained language model performs poorly across all tasks. Analyzing the model's error patterns reveals that the model tends to ignore explicit instructions and often generates outputs that cannot be construed as an attempt to solve the task. While it is not yet clear whether instruction understanding can be captured by traditional language models, the sheer expressivity of instruction understanding makes it an appealing alternative to the rising few-shot inference paradigm.

instruction, newsqa, noun, (16 more...)

arXiv.org Artificial Intelligence

2010.11982

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > United States > Alaska (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.82)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

Adversarial Augmentation Policy Search for Domain and Cross-Lingual Generalization in Reading Comprehension

Maharana, Adyasha, Bansal, Mohit

arXiv.org Artificial IntelligenceMay-1-2020

Reading comprehension models often overfit to nuances of training datasets and fail at adversarial evaluation. Training with adversarially augmented dataset improves robustness against those adversarial attacks but hurts generalization of the models. In this work, we present several effective adversaries and automated data augmentation policy search methods with the goal of making reading comprehension models more robust to adversarial evaluation, but also improving generalization to the source domain as well as new domains and languages. We first propose three new methods for generating QA adversaries, that introduce multiple points of confusion within the context, show dependence on insertion location of the distractor, and reveal the compounding effect of mixing adversarial strategies with syntactic and semantic paraphrasing methods. Next, we find that augmenting the training datasets with uniformly sampled adversaries improves robustness to the adversarial attacks but leads to decline in performance on the original unaugmented dataset. We address this issue via RL and more efficient Bayesian policy search methods for automatically learning the best augmentation policy combinations of the transformation probability for each adversary in a large search space. Using these learned policies, we show that adversarial training can lead to significant improvements in in-domain, out-of-domain, and cross-lingual (German, Russian, Turkish) generalization without any use of training data from the target domain or language.

adversary, dataset, roberta base, (16 more...)

arXiv.org Artificial Intelligence

2004.06076

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(14 more...)

Genre: Research Report (1.00)

Industry:

Education > Assessment & Standards > Student Performance (0.82)
Information Technology > Security & Privacy (0.69)
Government > Military (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.88)

Add feedback

Undersensitivity in Neural Reading Comprehension

Welbl, Johannes, Minervini, Pasquale, Bartolo, Max, Stenetorp, Pontus, Riedel, Sebastian

arXiv.org Machine LearningFeb-15-2020

Current reading comprehension models generalise well to in-distribution test sets, yet perform poorly on adversarially selected inputs. Most prior work on adversarial inputs studies oversensitivity: semantically invariant text perturbations that cause a model's prediction to change when it should not. In this work we focus on the complementary problem: excessive prediction undersensitivity, where input text is meaningfully changed but the model's prediction does not, even though it should. We formulate a noisy adversarial attack which searches among semantic variations of the question for which a model erroneously predicts the same answer, and with even higher probability. Despite comprising unanswerable questions, both SQuAD2.0 and NewsQA models are vulnerable to this attack. This indicates that although accurate, models tend to rely on spurious patterns and do not fully consider the information specified in a question. We experiment with data augmentation and adversarial training as defences, and find that both substantially decrease vulnerability to attacks on held out data, as well as held out attack spaces. Addressing undersensitivity also improves results on AddSent and AddOneSent, and models furthermore generalise better when facing train/evaluation distribution mismatch: they are less prone to overly rely on predictive cues present only in the training set, and outperform a conventional model by as much as 10.9% F1.

computational linguistic, perturbation, squad2, (15 more...)

arXiv.org Machine Learning

2003.04808

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(8 more...)

Genre: Research Report (0.64)

Industry:

Education > Assessment & Standards > Student Performance (0.61)
Government > Military (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback