AITopics | text summarization

Collaborating Authors

text summarization

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Deliberation Networks: Sequence Generation Beyond One-Pass Decoding

Neural Information Processing SystemsMar-17-2026, 17:16:41 GMT

The encoder-decoder framework has achieved promising progress for many sequence generation tasks, including machine translation, text summarization, dialog system, image captioning, etc. Such a framework adopts an one-pass forward process while decoding and generating a sequence, but lacks the deliberation process: A generated sequence is directly used as final output without further polishing. However, deliberation is a common behavior in human's daily life like reading news and writing papers/articles/books. In this work, we introduce the deliberation process into the encoder-decoder framework and propose deliberation networks for sequence generation. A deliberation network has two levels of decoders, where the first-pass decoder generates a raw sequence and the second-pass decoder polishes and refines the raw sentence with deliberation. Since the second-pass deliberation decoder has global information about what the sequence to be generated might be, it has the potential to generate a better sequence by looking into future words in the raw sentence. Experiments on neural machine translation and text summarization demonstrate the effectiveness of the proposed deliberation networks. On the WMT 2014 English-to-French translation task, our model establishes a new state-of-the-art BLEU score of 41.5.

artificial intelligence, natural language, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Large Language Models for the Summarization of Czech Documents: From History to the Present

Tran, Václav, Šmíd, Jakub, Lenc, Ladislav, Salmon, Jean-Pierre, Král, Pavel

arXiv.org Artificial IntelligenceNov-25-2025

Text summarization is the task of automatically condensing longer texts into shorter, coherent summaries while preserving the original meaning and key information. Although this task has been extensively studied in English and other high-resource languages, Czech summarization, particularly in the context of historical documents, remains underexplored. This is largely due to the inherent linguistic complexity of Czech and the lack of high-quality annotated datasets. In this work, we address this gap by leveraging the capabilities of Large Language Models (LLMs), specifically Mistral and mT5, which have demonstrated strong performance across a wide range of natural language processing tasks and multilingual settings. In addition, we also propose a translation-based approach that first translates Czech texts into English, summarizes them using an English-language model, and then translates the summaries back into Czech. Our study makes the following main contributions: We demonstrate that LLMs achieve new state-of-the-art results on the SumeCzech dataset, a benchmark for modern Czech text summarization, showing the effectiveness of multilingual LLMs even for morphologically rich, medium-resource languages like Czech. We introduce a new dataset, Posel od Čerchova, designed for the summarization of historical Czech texts. This dataset is derived from digitized 19th-century publications and annotated for abstractive summarization. We provide initial baselines using modern LLMs to facilitate further research in this underrepresented area. By combining cutting-edge models with both modern and historical Czech datasets, our work lays the foundation for further progress in Czech summarization and contributes valuable resources for future research in Czech historical document processing and low-resource summarization more broadly.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.18848

Country:

Europe (0.94)
North America > United States (0.28)

Genre: Research Report > Promising Solution (0.34)

Industry: Media > News (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deliberation Networks: Sequence Generation Beyond One-Pass Decoding

Neural Information Processing SystemsNov-21-2025, 16:03:12 GMT

deliberation network, name change, sequence generation, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Deliberation Networks: Sequence Generation Beyond One-Pass Decoding

Yingce Xia, Fei Tian, Lijun Wu, Jianxin Lin, Tao Qin, Nenghai Yu, Tie-Yan Liu

Neural Information Processing SystemsNov-21-2025, 12:51:09 GMT

A generated sequence is directly used as final output without further polishing. However, deliberation is a common behavior in human's daily life like reading

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Africa > Middle East > Egypt (0.06)
Europe > Middle East (0.04)
Asia > Middle East (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Faithful Summarization of Consumer Health Queries: A Cross-Lingual Framework with LLMs

Abrar, Ajwad, Oeshy, Nafisa Tabassum, Maheru, Prianka, Tabassum, Farzana, Chowdhury, Tareque Mohmud

arXiv.org Artificial IntelligenceNov-17-2025

Summarizing consumer health questions (CHQs) can ease communication in healthcare, but unfaithful summaries that misrepresent medical details pose serious risks. We propose a framework that combines TextRank-based sentence extraction and medical named entity recognition with large language models (LLMs) to enhance faithfulness in medical text summarization. In our experiments, we fine-tuned the LLaMA-2-7B model on the MeQSum (English) and BanglaCHQ-Summ (Bangla) datasets, achieving consistent improvements across quality (ROUGE, BERTScore, readability) and faithfulness (SummaC, AlignScore) metrics, and outperforming zero-shot baselines and prior systems. Human evaluation further shows that over 80\% of generated summaries preserve critical medical information. These results highlight faithfulness as an essential dimension for reliable medical summarization and demonstrate the potential of our approach for safer deployment of LLMs in healthcare contexts.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2511.10768

Country:

Asia > Bangladesh (0.19)
Europe > Italy (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.35)

Industry: Health & Medicine > Consumer Health (0.73)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Add feedback

BiSparse-AAS: Bilinear Sparse Attention and Adaptive Spans Framework for Scalable and Efficient Text Summarization

Hagos, Desta Haileselassie, Burge, Legand L., Andy, Anietie, Yazidi, Anis, Vlassov, Vladimir

arXiv.org Artificial IntelligenceNov-3-2025

Transformer-based architectures have advanced text summarization, yet their quadratic complexity limits scalability on long documents. This paper introduces BiSparse-AAS (Bilinear Sparse Attention with Adaptive Spans), a novel framework that combines sparse attention, adaptive spans, and bilinear attention to address these limitations. Sparse attention reduces computational costs by focusing on the most relevant parts of the input, while adaptive spans dynamically adjust the attention ranges. Bilinear attention complements both by modeling complex token interactions within this refined context. BiSparse-AAS consistently outperforms state-of-the-art baselines in both extractive and abstractive summarization tasks, achieving average ROUGE improvements of about 68.1% on CNN/DailyMail and 52.6% on XSum, while maintaining strong performance on OpenWebText and Gigaword datasets. By addressing efficiency, scalability, and long-sequence modeling, BiSparse-AAS provides a unified, practical solution for real-world text summarization applications.

large language model, machine learning, mechanism, (19 more...)

arXiv.org Artificial Intelligence

2510.27516

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Balancing Rewards in Text Summarization: Multi-Objective Reinforcement Learning via HyperVolume Optimization

Song, Junjie, Liu, Yiwen, Li, Dapeng, Sun, Yin, Fu, Shukun, Chen, Siqi, Cao, Yuji

arXiv.org Artificial IntelligenceOct-23-2025

Text summarization is a crucial task that requires the simultaneous optimization of multiple objectives, including consistency, coherence, relevance, and fluency, which presents considerable challenges. Although large language models (LLMs) have demonstrated remarkable performance, enhanced by reinforcement learning (RL), few studies have focused on optimizing the multi-objective problem of summarization through RL based on LLMs. In this paper, we introduce hypervolume optimization (HVO), a novel optimization strategy that dynamically adjusts the scores between groups during the reward process in RL by using the hypervolume method. This method guides the model's optimization to progressively approximate the pareto front, thereby generating balanced summaries across multiple objectives. Experimental results on several representative summarization datasets demonstrate that our method outperforms group relative policy optimization (GRPO) in overall scores and shows more balanced performance across different dimensions. Moreover, a 7B foundation model enhanced by HVO performs comparably to GPT-4 in the summarization task, while maintaining a shorter generation length. Our code is publicly available at https://github.com/ai4business-LiAuto/HVO.git

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.19325

Country:

Asia > China (0.28)
Asia > Middle East > UAE (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Add feedback

Automatic Question & Answer Generation Using Generative Large Language Model (LLM)

Ehsan, Md. Alvee, Hasan, A. S. M Mehedi, Shahnoor, Kefaya Benta, Tasneem, Syeda Sumaiya

arXiv.org Artificial IntelligenceSep-30-2025

In the realm of education, student evaluation holds equal significance to imparting knowledge. To be evaluated, students usually need to go through text-based academic assessment methods. Instructors need to make a diverse set of questions that need to be fair for all students to prove their adequacy over a particular topic. This can prove to be quite challenging as they may need to manually go through several different lecture materials. Our objective is to make this whole process much easier by implementing Automatic Question Answer Generation(AQAG), using a fine-tuned generative LLM. For tailoring the instructor's preferred question style (MCQ, conceptual, or factual questions), Prompt Engineering (PE) is being utilized. In this research, we propose to leverage unsupervised learning methods in NLP, primarily focusing on the English language. This approach empowers the base Meta-Llama 2-7B model to integrate the RACE dataset as training data for the fine-tuning process. Creating a customized model that will offer efficient solutions for educators, instructors, and individuals engaged in text-based evaluations. A reliable and efficient tool for generating questions and answers can free up valuable time and resources, thus streamlining their evaluation processes.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.19475

Genre: Research Report > New Finding (0.93)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AdvSumm: Adversarial Training for Bias Mitigation in Text Summarization

Gupta, Mukur, Varimalla, Nikhil Reddy, Deas, Nicholas, Subbiah, Melanie, McKeown, Kathleen

arXiv.org Artificial IntelligenceSep-23-2025

Large Language Models (LLMs) have achieved impressive performance in text summarization and are increasingly deployed in real-world applications. However, these systems often inherit associative and framing biases from pre-training data, leading to inappropriate or unfair outputs in downstream tasks. In this work, we present AdvSumm (Adversarial Summarization), a domain-agnostic training framework designed to mitigate bias in text summarization through improved generalization. Inspired by adversarial robustness, AdvSumm introduces a novel Perturber component that applies gradient-guided perturbations at the embedding level of Sequence-to-Sequence models, enhancing the model's robustness to input variations. We empirically demonstrate that AdvSumm effectively reduces different types of bias in summarization-specifically, name-nationality bias and political framing bias-without compromising summarization quality. Compared to standard transformers and data augmentation techniques like back-translation, AdvSumm achieves stronger bias mitigation performance across benchmark datasets.

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.06273

Country:

Europe (0.93)
North America > United States > Texas (0.14)

Genre: Research Report (1.00)

Industry: Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Efficient Extractive Text Summarization for Online News Articles Using Machine Learning

Biswas, Sajib, Biswas, Milon, Mandal, Arunima, Liza, Fatema Tabassum, Sarker, Joy

arXiv.org Artificial IntelligenceSep-22-2025

In the age of information overload, content management for online news articles relies on efficient summarization to enhance accessibility and user engagement. This article addresses the challenge of extractive text summarization by employing advanced machine learning techniques to generate concise and coherent summaries while preserving the original meaning. Using the Cornell Newsroom dataset, comprising 1.3 million article-summary pairs, we developed a pipeline leveraging BERT embeddings to transform textual data into numerical representations. By framing the task as a binary classification problem, we explored various models, including logistic regression, feed-forward neural networks, and long short-term memory (LSTM) networks. Our findings demonstrate that LSTM networks, with their ability to capture sequential dependencies, outperform baseline methods like Lede-3 and simpler models in F1 score and ROUGE-1 metrics. This study underscores the potential of automated summarization in improving content management systems for online news platforms, enabling more efficient content organization and enhanced user experiences.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2509.15614

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Media > News (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback