AITopics | reading comprehension

Collaborating Authors

reading comprehension

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Densely Connected Attention Propagation for Reading Comprehension

Yi Tay, Anh Tuan Luu, Siu Cheung Hui, Jian Su

Neural Information Processing SystemsFeb-13-2026, 09:01:15 GMT

We propose DECAPROP (Densely Connected Attention Propagation), a new densely connected neural architecture for reading comprehension (RC).

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.05)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)

Add feedback

AraLingBench A Human-Annotated Benchmark for Evaluating Arabic Linguistic Capabilities of Large Language Models

Zbeeb, Mohammad, Hammoud, Hasan Abed Al Kader, Mukalled, Sina, Rizk, Nadine, Karnib, Fatima, Lakkis, Issam, Mohanna, Ammar, Ghanem, Bernard

arXiv.org Artificial IntelligenceDec-10-2025

The benchmark spans five core categories: grammar, morphology, spelling, reading comprehension, and syntax, through 150 expert-designed multiple choice questions that directly assess structural language understanding. Evaluating 35 Arabic and bilingual LLMs reveals that current models demonstrate strong surface level proficiency but struggle with deeper grammatical and syntactic reasoning. AraLingBench highlights a persistent gap between high scores on knowledge-based benchmarks and true linguistic mastery, showing that many models succeed through memorization or pattern recognition rather than authentic comprehension. By isolating and measuring fundamental linguistic skills, AraLingBench provides a diagnostic framework for developing Arabic LLMs. The full evaluation code is publicly available on GitHub.

benchmark, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2511.14295

Country:

Europe (1.00)
North America (0.68)
Asia > Middle East > UAE (0.46)

Genre:

Research Report (0.51)
Questionnaire & Opinion Survey (0.34)

Industry: Education > Assessment & Standards > Student Performance (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Densely Connected Attention Propagation for Reading Comprehension

Yi Tay, Anh Tuan Luu, Siu Cheung Hui, Jian Su

Neural Information Processing SystemsNov-20-2025, 23:33:02 GMT

We conduct extensive experiments on four challenging RC benchmarks.

arxiv preprint arxiv, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Industry: Education > Assessment & Standards > Student Performance (0.42)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

R1-Ranker: Teaching LLM Rankers to Reason

Feng, Tao, Hua, Zhigang, Lei, Zijie, Xie, Yan, Yang, Shuang, Long, Bo, You, Jiaxuan

arXiv.org Artificial IntelligenceOct-17-2025

Large language models (LLMs) have recently shown strong reasoning abilities in domains like mathematics, coding, and scientific problem-solving, yet their potential for ranking tasks, where prime examples include retrieval, recommender systems, and LLM routing, remains underexplored. Ranking requires complex reasoning across heterogeneous candidates, but existing LLM-based rankers are often domain-specific, tied to fixed backbones, and lack iterative refinement, limiting their ability to fully exploit LLMs' reasoning potential. To address these challenges, we propose R1-Ranker, a reasoning-incentive framework built on reinforcement learning, with two complementary designs: DRanker, which generates full rankings in one shot, and IRanker, which decomposes ranking into an iterative elimination process with step-wise rewards to encourage deeper reasoning. We evaluate unified R1-Rankers on nine datasets spanning recommendation, routing, and passage ranking, showing that IRanker-3B consistently achieves state-of-the-art performance, surpasses larger 7B models on some tasks, and yields a 15.7% average relative improvement. Ablation and generalization experiments further confirm the critical role of reinforcement learning and iterative reasoning, with IRanker-3B improving zero-shot performance by over 9% on out-of-domain tasks and reasoning traces boosting other LLMs by up to 22.87%. These results demonstrate that unifying diverse ranking tasks with a single reasoning-driven foundation model is both effective and essential for advancing LLM reasoning in ranking scenarios.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2506.21638

Genre:

Overview (1.00)
Research Report > New Finding (0.87)

Industry:

Media > Music (1.00)
Media > Film (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Add feedback

UniversalCEFR: Enabling Open Multilingual Research on Language Proficiency Assessment

Imperial, Joseph Marvin, Barayan, Abdullah, Stodden, Regina, Wilkens, Rodrigo, Sanchez, Ricardo Munoz, Gao, Lingyun, Torgbi, Melissa, Knight, Dawn, Forey, Gail, Jablonkai, Reka R., Kochmar, Ekaterina, Reynolds, Robert, Ribeiro, Eugénio, Saggion, Horacio, Volodina, Elena, Vajjala, Sowmya, François, Thomas, Alva-Manchego, Fernando, Madabushi, Harish Tayyar

arXiv.org Artificial IntelligenceSep-17-2025

We introduce UniversalCEFR, a large-scale multilingual and multidimensional dataset of texts annotated with CEFR (Common European Framework of Reference) levels in 13 languages. To enable open research in automated readability and language proficiency assessment, UniversalCEFR comprises 505,807 CEFR-labeled texts curated from educational and learner-oriented resources, standardized into a unified data format to support consistent processing, analysis, and modelling across tasks and languages. To demonstrate its utility, we conduct benchmarking experiments using three modelling paradigms: a) linguistic feature-based classification, b) fine-tuning pre-trained LLMs, and c) descriptor-based prompting of instruction-tuned LLMs. Our results support using linguistic features and fine-tuning pretrained models in multilingual CEFR level assessment. Overall, UniversalCEFR aims to establish best practices in data distribution for language proficiency research by standardising dataset formats, and promoting their accessibility to the global research community.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.01419

Country:

Europe > United Kingdom (0.67)
Asia > Middle East > UAE (0.28)
North America > United States > Minnesota (0.27)

Genre: Research Report > New Finding (0.87)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

SCOP: Evaluating the Comprehension Process of Large Language Models from a Cognitive View

Xiao, Yongjie, Liang, Hongru, Qin, Peixin, Zhang, Yao, Lei, Wenqiang

arXiv.org Artificial IntelligenceSep-8-2025

Despite the great potential of large language models(LLMs) in machine comprehension, it is still disturbing to fully count on them in real-world scenarios. This is probably because there is no rational explanation for whether the comprehension process of LLMs is aligned with that of experts. In this paper, we propose SCOP to carefully examine how LLMs perform during the comprehension process from a cognitive view. Specifically, it is equipped with a systematical definition of five requisite skills during the comprehension process, a strict framework to construct testing data for these skills, and a detailed analysis of advanced open-sourced and closed-sourced LLMs using the testing data. With SCOP, we find that it is still challenging for LLMs to perform an expert-level comprehension process. Even so, we notice that LLMs share some similarities with experts, e.g., performing better at comprehending local information than global information. Further analysis reveals that LLMs can be somewhat unreliable -- they might reach correct answers through flawed comprehension processes. Based on SCOP, we suggest that one direction for improving LLMs is to focus more on the comprehension process, ensuring all comprehension skills are thoroughly developed during training.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.05

Country:

Asia (1.00)
Europe (0.67)
North America > United States > Florida > Miami-Dade County (0.46)
North America > United States > South Carolina > Greenville County (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Rethinking Reasoning in LLMs: Neuro-Symbolic Local RetoMaton Beyond ICL and CoT

Mamidala, Rushitha Santhoshi, Chhabra, Anshuman, Mali, Ankur

arXiv.org Artificial IntelligenceAug-28-2025

Prompt-based reasoning strategies such as Chain-of-Thought (CoT) and In-Context Learning (ICL) have become widely used for eliciting reasoning capabilities in large language models (LLMs). However, these methods rely on fragile, implicit mechanisms often yielding inconsistent outputs across seeds, formats, or minor prompt variations making them fundamentally unreliable for tasks requiring stable, interpretable reasoning. In contrast, automata-based neuro-symbolic frameworks like RetoMaton offer a more structured and trustworthy alternative by grounding retrieval in symbolic memory with deterministic transitions. In this work, we extend RetoMaton by replacing its global datastore with a local, task-adaptive Weighted Finite Automaton (WFA), constructed directly from external domain corpora. This local automaton structure promotes robust, context-aware retrieval while preserving symbolic traceability and low inference overhead. Unlike prompting, which entangles context and memory in opaque ways, our approach leverages the explicit structure of WFAs to provide verifiable and modular retrieval behavior, making it better suited for domain transfer and interoperability. We evaluate this local RetoMaton variant on two pretrained LLMs LLaMA-3.2-1B and Gemma-3-1B-PT across three reasoning tasks: TriviaQA (reading comprehension), GSM8K (multi-step math), and MMLU (domain knowledge). Compared to the base model and prompting-based methods, augmenting these setups with local RetoMaton consistently improves performance while enabling transparent and reproducible retrieval dynamics. Our results highlight a promising shift toward trustworthy, symbolic reasoning in modern LLMs via lightweight, automaton-guided memory.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2508.19271

Country: Africa (0.15)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

LoraxBench: A Multitask, Multilingual Benchmark Suite for 20 Indonesian Languages

Aji, Alham Fikri, Cohn, Trevor

arXiv.org Artificial IntelligenceAug-19-2025

As one of the world's most populous countries, with 700 languages spoken, Indonesia is behind in terms of NLP progress. We introduce LoraxBench, a benchmark that focuses on low-resource languages of Indonesia and covers 6 diverse tasks: reading comprehension, open-domain QA, language inference, causal reasoning, translation, and cultural QA. Our dataset covers 20 languages, with the addition of two formality registers for three languages. We evaluate a diverse set of multilingual and region-focused LLMs and found that this benchmark is challenging. We note a visible discrepancy between performance in Indonesian and other languages, especially the low-resource ones. There is no clear lead when using a region-specific model as opposed to the general multilingual model. Lastly, we show that a change in register affects model performance, especially with registers not commonly found in social media, such as high-level politeness `Krama' Javanese.

computational linguistic, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.12459

Country:

North America (1.00)
Europe (1.00)
Asia > Indonesia > Sumatra (0.46)
(3 more...)

Genre: Research Report (0.40)

Industry: Education > Assessment & Standards > Student Performance (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Add feedback

CogL TX: Applying BERT to Long Texts

Neural Information Processing SystemsAug-15-2025, 05:34:45 GMT

BERT is incapable of processing long texts due to its quadratically increasing memory and time consumption. The most natural ways to address this problem, such as slicing the text by a sliding window or simplifying transformers, suffer from insufficient long-range attentions or need customized CUDA kernels.

arxiv preprint arxiv, cogl tx, reasoner, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Industry: Health & Medicine (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Add feedback

CardiffNLP at CLEARS-2025: Prompting Large Language Models for Plain Language and Easy-to-Read Text Rewriting

Ayesh, Mutaz, Gutiérrez-Rolón, Nicolás, Alva-Manchego, Fernando

arXiv.org Artificial IntelligenceAug-6-2025

This paper details the CardiffNLP team's contribution to the CLEARS shared task on Spanish text adaptation, hosted by IberLEF 2025. The shared task contained two subtasks and the team submitted to both. Our team took an LLM-prompting approach with different prompt variations. While we initially experimented with LLaMA-3.2, we adopted Gemma-3 for our final submission, and landed third place in Subtask 1 and second place in Subtask 2. We detail our numerous prompt variations, examples, and experimental results.

large language model, machine learning, simplification, (16 more...)

arXiv.org Artificial Intelligence

2508.0324

Country:

North America > United States (0.28)
Europe > Spain (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback