AITopics | naturalquestion

naturalquestion

Appendix

Neural Information Processing Systems | Feb-11-2026, 18:08:07 GMT

H.3 Winogender Setup. We follow the same setup as in Rae et al. [38].

gopher, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Semantic Caching of Contextual Summaries for Efficient Question-Answering with Language Models

Couturier, Camille, Mastorakis, Spyros, Shen, Haiying, Rajmohan, Saravan, Rühle, Victor

arXiv.org Artificial Intelligence | May-19-2025

Large Language Models (LLMs) are increasingly deployed across edge and cloud platforms for real-time question-answering and retrieval-augmented generation. However, processing lengthy contexts in distributed systems incurs high computational overhead, memory usage, and network bandwidth. This paper introduces a novel semantic caching approach for storing and reusing intermediate contextual summaries, enabling efficient information reuse across similar queries in LLM-based QA workflows. Our method reduces redundant computations by up to 50-60% while maintaining answer accuracy comparable to full document processing, as demonstrated on NaturalQuestions, TriviaQA, and a synthetic ArXiv dataset. This approach balances computational cost and response quality, critical for real-time AI assistants.
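
The caching idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words `embed` function stands in for a real sentence encoder, and the names `SemanticSummaryCache`, `get_summary`, and the 0.8 similarity threshold are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: a real system would use a sentence encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticSummaryCache:
    """Stores (query embedding, summary) pairs and reuses a summary for similar queries."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # hypothetical similarity cutoff
        self.entries = []           # list of (embedding, summary)

    def lookup(self, query):
        """Return the most similar cached summary, or None if nothing clears the threshold."""
        q = embed(query)
        best_score, best_summary = 0.0, None
        for emb, summary in self.entries:
            score = cosine(q, emb)
            if score > best_score:
                best_score, best_summary = score, summary
        return best_summary if best_score >= self.threshold else None

    def store(self, query, summary):
        self.entries.append((embed(query), summary))


def get_summary(query, document, cache, summarize):
    """Return (summary, cache_hit); call the expensive summarizer only on a miss."""
    cached = cache.lookup(query)
    if cached is not None:
        return cached, True
    summary = summarize(query, document)
    cache.store(query, summary)
    return summary, False
```

Here `summarize` is any callable that produces a query-focused summary of the document, for example a wrapper around an LLM call. The threshold trades reuse against the risk of serving a summary that does not fit the new query.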

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.11271

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Perception Compressor: A training-free prompt compression method in long context scenarios

Tang, Jiwei, Xu, Jin, Lu, Tingwei, Zhang, Zhicheng, Zhao, Yiming, Hai, Lin, Zheng, Hai-Tao

arXiv.org Artificial Intelligence | Nov-5-2024

Large Language Models (LLMs) demonstrate exceptional capabilities in various scenarios. However, they suffer from much redundant information and are sensitive to the position of key information (relevant to the input question) in long context scenarios, leading to inferior performance. To address these challenges, we present Perception Compressor, a training-free prompt compression method. It includes a perception retriever that leverages guiding questions and instruction to retrieve the most relevant demonstrations, a dual-slope ratio allocator to dynamically allocate compression ratios and open-book ratios, and a semi-guided iterative compression that retains key information at the token level while removing tokens that distract the LLM. We conduct extensive experiments on long context benchmarks, i.e., NaturalQuestions, LongBench, and MuSiQue. Experiment results show that Perception Compressor outperforms existing methods by a large margin, achieving state-of-the-art performance.

demonstration, input question, perception compressor, (14 more...)

arXiv.org Artificial Intelligence

2409.19272

Country:

North America > United States (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Leisure & Entertainment (1.00)
Media > Television (0.93)
Media > Film (0.93)
Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

UNLEARN: Efficient Removal of Knowledge in Large Language Models

Lizzo, Tyler, Heck, Larry

arXiv.org Artificial Intelligence | Aug-7-2024

Given the prevalence of large language models (LLMs) and the prohibitive cost of training these models from scratch, dynamically forgetting specific knowledge (e.g., private or proprietary) without retraining the model has become an important capability. This paper proposes a novel method, called UNLEARN, to achieve this objective. The approach builds upon subspace methods to identify and specifically target the removal of knowledge without adversely affecting other knowledge in the LLM. Results demonstrate that 96% of targeted knowledge can be forgotten while maintaining performance on other knowledge within 2.5% of the original model, significantly outperforming the discriminatory abilities of the previous state-of-the-art. A dual method called LEARN is also proposed for targeted knowledge addition. Results show LEARN can match the fine-tuning accuracy of Low-Rank Adaptation (LoRA) without adversely affecting similar tasks.

knowledge, matrix, subspace, (16 more...)

arXiv.org Artificial Intelligence

2408.0414

Country:

North America > United States > California (0.04)
Asia > Singapore (0.04)
North America > United States > Virginia (0.04)
(2 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (0.68)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

Li, Minghan, Chen, Xilun, Holtzman, Ari, Chen, Beidi, Lin, Jimmy, Yih, Wen-tau, Lin, Xi Victoria

arXiv.org Artificial Intelligence | May-30-2024

Large language models (LLMs) often hallucinate and lack the ability to provide attribution for their generations. Semi-parametric LMs, such as kNN-LM, approach these limitations by refining the output of an LM for a given prompt using its nearest neighbor matches in a non-parametric data store. However, these models often exhibit slow inference speeds and produce non-fluent texts. In this paper, we introduce Nearest Neighbor Speculative Decoding (NEST), a novel semi-parametric language modeling approach that is capable of incorporating real-world text spans of arbitrary length into the LM generations and providing attribution to their sources. NEST performs token-level retrieval at each inference step to compute a semi-parametric mixture distribution and identify promising span continuations in a corpus. It then uses an approximate speculative decoding procedure that accepts a prefix of the retrieved span or generates a new token. NEST significantly enhances the generation quality and attribution rate of the base LM across a variety of knowledge-intensive tasks, surpassing the conventional kNN-LM method and performing competitively with in-context retrieval augmentation. In addition, NEST substantially improves the generation speed, achieving a 1.8x speedup in inference time when applied to Llama-2-Chat 70B.
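
The "semi-parametric mixture distribution" mentioned above can be sketched in the style of the kNN-LM baseline: a distribution built from retrieved neighbors is interpolated with the base LM's next-token distribution. This is a simplified illustration, not NEST itself. The fixed mixing weight `lam`, the temperature, and the function names are assumptions; the abstract does not say how NEST sets its mixture coefficient.

```python
import math


def knn_distribution(neighbors, temperature=1.0):
    """Turn retrieved (next_token, distance) pairs into a probability distribution.

    Closer neighbors get more weight via exp(-distance / temperature).
    """
    weights = {}
    for token, dist in neighbors:
        weights[token] = weights.get(token, 0.0) + math.exp(-dist / temperature)
    total = sum(weights.values())
    if total == 0.0:
        return {}  # no neighbors retrieved
    return {t: w / total for t, w in weights.items()}


def interpolate(p_lm, p_knn, lam):
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_lm (lam is a hypothetical fixed weight)."""
    vocab = set(p_lm) | set(p_knn)
    return {t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_lm.get(t, 0.0) for t in vocab}
```

If both inputs are proper distributions, the mixture also sums to one. Tokens seen only in the datastore still receive probability mass, which is what lets retrieval override the base LM.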

computational linguistic, language model, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2405.19325

Country:

Europe > Germany > Bremen > Bremen (0.14)
Asia > Singapore (0.04)
Asia > North Korea (0.04)
(22 more...)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment (1.00)
Government > Military (0.93)
Energy (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Label-Efficient Model Selection for Text Generation

Ashury-Tahan, Shir, Sznajder, Benjamin, Choshen, Leshem, Ein-Dor, Liat, Shnarch, Eyal, Gera, Ariel

arXiv.org Artificial Intelligence | Feb-12-2024

Model selection for a given target task can be costly, as it may entail extensive annotation of the quality of outputs of different models. We introduce DiffUse, an efficient method to make an informed decision between candidate text generation models. DiffUse reduces the required amount of preference annotations, thus saving valuable time and resources in performing evaluation. DiffUse intelligently selects instances by clustering embeddings that represent the semantic differences between model outputs. Thus, it is able to identify a subset of examples that are more informative for preference decisions. Our method is model-agnostic, and can be applied to any text generation model. Moreover, we propose a practical iterative approach for dynamically determining how many instances to annotate. In a series of experiments over hundreds of model pairs, we demonstrate that DiffUse can dramatically reduce the required number of annotations -- by up to 75% -- while maintaining high evaluation reliability.

annotated example number, h-ward, naturalquestion, (14 more...)

arXiv.org Artificial Intelligence

2402.07891

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
(7 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Dai, Damai, Deng, Chengqi, Zhao, Chenggang, Xu, R. X., Gao, Huazuo, Chen, Deli, Li, Jiashi, Zeng, Wangding, Yu, Xingkai, Wu, Y., Xie, Zhenda, Li, Y. K., Huang, Panpan, Luo, Fuli, Ruan, Chong, Sui, Zhifang, Liang, Wenfeng

arXiv.org Artificial Intelligence | Jan-11-2024

In the era of large language models, Mixture-of-Experts (MoE) is a promising architecture for managing computational costs when scaling up model parameters. However, conventional MoE architectures like GShard, which activate the top-$K$ out of $N$ experts, face challenges in ensuring expert specialization, i.e. each expert acquires non-overlapping and focused knowledge. In response, we propose the DeepSeekMoE architecture towards ultimate expert specialization. It involves two principal strategies: (1) finely segmenting the experts into $mN$ ones and activating $mK$ from them, allowing for a more flexible combination of activated experts; (2) isolating $K_s$ experts as shared ones, aiming at capturing common knowledge and mitigating redundancy in routed experts. Starting from a modest scale with 2B parameters, we demonstrate that DeepSeekMoE 2B achieves comparable performance with GShard 2.9B, which has 1.5 times the expert parameters and computation. In addition, DeepSeekMoE 2B nearly approaches the performance of its dense counterpart with the same number of total parameters, which set the upper bound of MoE models. Subsequently, we scale up DeepSeekMoE to 16B parameters and show that it achieves comparable performance with LLaMA2 7B, with only about 40% of computations. Further, our preliminary efforts to scale up DeepSeekMoE to 145B parameters consistently validate its substantial advantages over the GShard architecture, and show its performance comparable with DeepSeek 67B, using only 28.5% (maybe even 18.2%) of computations.
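
The two strategies in this abstract can be made concrete with a small sketch. The first function counts how many distinct sets of activated experts are possible when $N$ experts are segmented into $mN$ and $mK$ of them are activated. The second picks experts for one token, with shared experts always on. The function names are hypothetical, and the sketch assumes that the shared experts count toward the total number of activated experts, so only the remainder is chosen by routing; the abstract does not spell this out.

```python
import math


def num_expert_combinations(n_experts, top_k, m=1):
    """Number of ways to choose m*K active experts out of m*N (m=1 is conventional top-K routing)."""
    return math.comb(m * n_experts, m * top_k)


def select_experts(routed_affinities, num_shared, total_active):
    """Pick experts for one token.

    routed_affinities: token-to-expert scores for the routed experts only.
    num_shared: K_s shared experts, always active (indexed 0..K_s-1 in their own pool).
    total_active: total experts activated per token (assumed to include the shared ones).
    Returns (shared_ids, routed_ids), with routed_ids ordered by decreasing affinity.
    """
    num_routed_active = total_active - num_shared
    if not 0 <= num_routed_active <= len(routed_affinities):
        raise ValueError("total_active must lie between num_shared and num_shared + routed experts")
    ranked = sorted(range(len(routed_affinities)),
                    key=lambda i: routed_affinities[i], reverse=True)
    return list(range(num_shared)), ranked[:num_routed_active]
```

For example, with $N = 16$ and $K = 2$ there are 120 possible expert combinations, while segmenting with $m = 4$ gives 4,426,165,368, which is the "more flexible combination of activated experts" the abstract refers to.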

acc, architecture, deepseekmoe, (16 more...)

arXiv.org Artificial Intelligence

2401.06066

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Jordan (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(14 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Evaluating Verifiability in Generative Search Engines

Liu, Nelson F., Zhang, Tianyi, Liang, Percy

arXiv.org Artificial Intelligence | Oct-23-2023

Generative search engines directly generate responses to user queries, along with in-line citations. A prerequisite trait of a trustworthy generative search engine is verifiability, i.e., systems should cite comprehensively (high citation recall; all statements are fully supported by citations) and accurately (high citation precision; every cite supports its associated statement). We conduct human evaluation to audit four popular generative search engines -- Bing Chat, NeevaAI, perplexity.ai, and YouChat -- across a diverse set of queries from a variety of sources (e.g., historical Google user queries, dynamically-collected open-ended questions on Reddit, etc.). We find that responses from existing generative search engines are fluent and appear informative, but frequently contain unsupported statements and inaccurate citations: on average, a mere 51.5% of generated sentences are fully supported by citations and only 74.5% of citations support their associated sentence. We believe that these results are concerningly low for systems that may serve as a primary tool for information-seeking users, especially given their facade of trustworthiness. We hope that our results further motivate the development of trustworthy generative search engines and help researchers and users better understand the shortcomings of existing commercial systems.
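
The two metrics defined in this abstract can be computed from human judgments roughly as follows. This is a simplified sketch under stated assumptions: each statement carries one yes/no judgment of whether its citations fully support it, plus one yes/no judgment per citation. The paper's actual annotation protocol may be more fine-grained, and the class and function names here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JudgedStatement:
    """Human judgments for one generated statement (hypothetical annotation schema)."""
    fully_supported: bool                                         # do its citations fully support it?
    citation_supports: List[bool] = field(default_factory=list)  # one judgment per attached citation


def citation_recall(statements):
    """Fraction of statements that are fully supported by their citations."""
    if not statements:
        return 0.0
    return sum(s.fully_supported for s in statements) / len(statements)


def citation_precision(statements):
    """Fraction of citations that support their associated statement."""
    judgments = [ok for s in statements for ok in s.citation_supports]
    if not judgments:
        return 0.0
    return sum(judgments) / len(judgments)
```

Note that the two denominators differ: recall is computed per statement and precision per citation, so a response can score well on one and poorly on the other.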

precision, query, search engine, (16 more...)

arXiv.org Artificial Intelligence

2304.09848

Country:

Europe > United Kingdom (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Trinidad and Tobago (0.04)
Europe > Spain (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Health & Medicine > Therapeutic Area (0.94)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Merging Generated and Retrieved Knowledge for Open-Domain QA

Zhang, Yunxiang, Khalifa, Muhammad, Logeswaran, Lajanugen, Lee, Moontae, Lee, Honglak, Wang, Lu

arXiv.org Artificial Intelligence | Oct-22-2023

Open-domain question answering (QA) systems are often built with retrieval modules. However, retrieving passages from a given source is known to suffer from insufficient knowledge coverage. Alternatively, prompting large language models (LLMs) to generate contextual passages based on their parametric knowledge has been shown to improve QA performance. Yet, LLMs tend to "hallucinate" content that conflicts with the retrieved knowledge. Based on the intuition that answers supported by both sources are more likely to be correct, we propose COMBO, a Compatibility-Oriented knowledge Merging for Better Open-domain QA framework, to effectively leverage the two sources of information. Concretely, we match LLM-generated passages with retrieved counterparts into compatible pairs, based on discriminators trained with silver compatibility labels. Then a Fusion-in-Decoder-based reader model handles passage pairs to arrive at the final answer. Experiments show that COMBO outperforms competitive baselines on three out of four tested open-domain QA benchmarks. Further analysis reveals that our proposed framework demonstrates greater efficacy in scenarios with a higher degree of knowledge conflicts.

computational linguistic, knowledge, llm-generated passage, (16 more...)

arXiv.org Artificial Intelligence

2310.14393

Country:

North America > United States > Washington > King County > Seattle (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Austria (0.04)
(17 more...)

Genre: Research Report (0.82)

Industry:

Media > Television (0.68)
Leisure & Entertainment > Sports > Football (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Atlas: Few-shot Learning with Retrieval Augmented Language Models

Izacard, Gautier, Lewis, Patrick, Lomeli, Maria, Hosseini, Lucas, Petroni, Fabio, Schick, Timo, Dwivedi-Yu, Jane, Joulin, Armand, Riedel, Sebastian, Grave, Edouard

arXiv.org Artificial Intelligence | Nov-16-2022

Large language models have shown impressive few-shot results on a wide range of tasks. However, when knowledge is key for such results, as is the case for tasks such as question answering and fact checking, massive parameter counts to store knowledge seem to be needed. Retrieval augmented models are known to excel at knowledge intensive tasks without the need for as many parameters, but it is unclear whether they work in few-shot settings. In this work we present Atlas, a carefully designed and pre-trained retrieval augmented language model able to learn knowledge intensive tasks with very few training examples. We perform evaluations on a wide range of tasks, including MMLU, KILT and NaturalQuestions, and study the impact of the content of the document index, showing that it can easily be updated. Notably, Atlas reaches over 42% accuracy on Natural Questions using only 64 examples, outperforming a 540B parameters model by 3% despite having 50x fewer parameters.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2208.03299

Country:

North America > Bermuda (0.04)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England (0.04)
(11 more...)

Genre: Research Report (0.83)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
