AITopics | scientific paper

Collaborating Authors

scientific paper

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers

Neural Information Processing SystemsMar-21-2026, 05:41:05 GMT

An emerging area of research in situated and multimodal interactive conversations (SIMMC) includes interactions in scientific papers. Since scientific papers are primarily composed of text, equations, figures, and tables, SIMMC methods must be developed specifically for each component to support the depth of inquiry and interactions required by research scientists. This work introduces $Conversational Papers$ (cPAPERS), a dataset of conversational question-answer pairs from reviews of academic papers grounded in these paper components and their associated references from scientific documents available on arXiv. We present a data collection strategy to collect these question-answer pairs from OpenReview and associate them with contextual information from $LaTeX$ source files. Additionally, we present a series of baseline approaches utilizing Large Language Models (LLMs) in both zero-shot and fine-tuned configurations to address the cPAPERS dataset.

artificial intelligence, large language model, natural language, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.99)

Add feedback

7a19a9d527ed544d1272f07b0f8f934e-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsFeb-16-2026, 00:22:11 GMT

computational linguistic, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(15 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)

Add feedback

OpenAI's latest product lets you vibe code science

MIT Technology ReviewJan-27-2026, 18:00:43 GMT

OpenAI's latest product lets you vibe code science Prism is a ChatGPT-powered text editor that automates much of the work involved in writing scientific papers. OpenAI just revealed what its new in-house team, OpenAI for Science, has been up to. The firm has released a free LLM-powered tool for scientists called Prism, which embeds ChatGPT in a text editor for writing scientific papers. The idea is to put ChatGPT front and center inside software that scientists use to write up their work in much the same way that chatbots are now embedded into popular programming editors. Kevin Weil, head of OpenAI for Science, pushes that analogy himself. "I think 2026 will be for AI and science what 2025 was for AI in software engineering," he said at a press briefing yesterday.

large language model, machine learning, natural language, (14 more...)

MIT Technology Review

Country:

North America > United States > Massachusetts (0.05)
North America > United States > California > Alameda County > Berkeley (0.05)
Asia > China (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.92)

Add feedback

fastbmRAG: A Fast Graph-Based RAG Framework for Efficient Processing of Large-Scale Biomedical Literature

Meng, Guofeng, Shen, Li, Zhong, Qiuyan, Wang, Wei, Zhang, Haizhou, Wang, Xiaozhen

arXiv.org Artificial IntelligenceNov-14-2025

Large language models (LLMs) are rapidly transforming various domains, including biomedicine and healthcare, and demonstrate remarkable potential from scientific research to new drug discovery. Graph-based retrieval-augmented generation (RAG) systems, as a useful application of LLMs, can improve contextual reasoning through structured entity and relationship identification from long-context knowledge, e.g. biomedical literature. Even though many advantages over naive RAGs, most of graph-based RAGs are computationally intensive, which limits their application to large-scale dataset. To address this issue, we introduce fastbmRAG, an fast graph-based RAG optimized for biomedical literature. Utilizing well organized structure of biomedical papers, fastbmRAG divides the construction of knowledge graph into two stages, first drafting graphs using abstracts; and second, refining them using main texts guided by vector-based entity linking, which minimizes redundancy and computational load. Our evaluations demonstrate that fastbmRAG is over 10x faster than existing graph-RAG tools and achieve superior coverage and accuracy to input knowledge. FastbmRAG provides a fast solution for quickly understanding, summarizing, and answering questions about biomedical literature on a large scale. FastbmRAG is public available in https://github.com/menggf/fastbmRAG.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.10014

Country: Asia (0.28)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.99)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Discourse-Aware Scientific Paper Recommendation via QA-Style Summarization and Multi-Level Contrastive Learning

Wang, Shenghua, Yin, Zhen

arXiv.org Artificial IntelligenceNov-6-2025

The rapid growth of open-access (OA) publications has intensified the challenge of identifying relevant scientific papers. Due to privacy constraints and limited access to user interaction data, recent efforts have shifted toward content-based recommendation, which relies solely on textual information. However, existing models typically treat papers as unstructured text, neglecting their discourse organization and thereby limiting semantic completeness and interpretability. To address these limitations, we propose OMRC-MR, a hierarchical framework that integrates QA-style OMRC (Objective, Method, Result, Conclusion) summarization, multi-level contrastive learning, and structure-aware re-ranking for scholarly recommendation. The QA-style summarization module converts raw papers into structured and discourse-consistent representations, while multi-level contrastive objectives align semantic representations across metadata, section, and document levels. The final re-ranking stage further refines retrieval precision through contextual similarity calibration. Experiments on DBLP, S2ORC, and the newly constructed Sci-OMRC dataset demonstrate that OMRC-MR consistently surpasses state-of-the-art baselines, achieving up to 7.2% and 3.8% improvements in Precision@10 and Recall@10, respectively. Additional evaluations confirm that QA-style summarization produces more coherent and factually complete representations. Overall, OMRC-MR provides a unified and interpretable content-based paradigm for scientific paper recommendation, advancing trustworthy and privacy-aware scholarly information retrieval.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.0333

Country: Asia > China (0.14)

Genre:

Overview (1.00)
Research Report (0.82)

Industry:

Law (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.94)

Add feedback

Chain of Retrieval: Multi-Aspect Iterative Search Expansion and Post-Order Search Aggregation for Full Paper Retrieval

Park, Sangwoo, Baek, Jinheon, Jeong, Soyeong, Hwang, Sung Ju

arXiv.org Artificial IntelligenceNov-4-2025

Scientific paper retrieval, particularly framed as document-to-document retrieval, aims to identify relevant papers in response to a long-form query paper, rather than a short query string. Previous approaches to this task have focused exclusively on abstracts, embedding them into dense vectors as surrogates for full documents and calculating similarity between them. Yet, abstracts offer only sparse and high-level summaries, and such methods primarily optimize one-to-one similarity, overlooking the dynamic relations that emerge among relevant papers during the retrieval process. To address this, we propose Chain of Retrieval(COR), a novel iterative framework for full-paper retrieval. Specifically, CoR decomposes each query paper into multiple aspect-specific views, matches them against segmented candidate papers, and iteratively expands the search by promoting top-ranked results as new queries, thereby forming a tree-structured retrieval process. The resulting retrieval tree is then aggregated in a post-order manner: descendants are first combined at the query level, then recursively merged with their parent nodes, to capture hierarchical relations across iterations. To validate this, we present SCIFULLBENCH, a large-scale benchmark providing both complete and segmented contexts of full papers for queries and candidates, and results show that CoR significantly outperforms existing retrieval baselines. Our code and dataset is available at https://github.com/psw0021/Chain-of-Retrieval.git.

information retrieval, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2507.10057

Country:

Europe (0.92)
North America > United States > Minnesota (0.27)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment (0.93)
Law > Intellectual Property & Technology Law (0.67)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(7 more...)

Add feedback

MatPROV: A Provenance Graph Dataset of Material Synthesis Extracted from Scientific Literature

Tsuruta, Hirofumi, Kumagai, Masaya

arXiv.org Artificial IntelligenceOct-22-2025

Synthesis procedures play a critical role in materials research, as they directly affect material properties. With data-driven approaches increasingly accelerating materials discovery, there is growing interest in extracting synthesis procedures from scientific literature as structured data. However, existing studies often rely on rigid, domain-specific schemas with predefined fields for structuring synthesis procedures or assume that synthesis procedures are linear sequences of operations, which limits their ability to capture the structural complexity of real-world procedures. To address these limitations, we adopt PROV-DM, an international standard for provenance information, which supports flexible, graph-based modeling of procedures. We present MatPROV, a dataset of PROV-DM-compliant synthesis procedures extracted from scientific literature using large language models. MatPROV captures structural complexities and causal relationships among materials, operations, and conditions through visually intuitive directed graphs. This representation enables machine-interpretable synthesis knowledge, opening opportunities for future research such as automated synthesis planning and optimization.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2509.01042

Genre: Research Report > New Finding (0.46)

Industry:

Law (0.93)
Materials > Chemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.95)

Add feedback

GraphMind: Interactive Novelty Assessment System for Accelerating Scientific Discovery

da Silva, Italo Luis, Yan, Hanqi, Gui, Lin, He, Yulan

arXiv.org Artificial IntelligenceOct-20-2025

Large Language Models (LLMs) show strong reasoning and text generation capabilities, prompting their use in scientific literature analysis, including novelty assessment. While evaluating novelty of scientific papers is crucial for peer review, it requires extensive knowledge of related work, something not all reviewers have. While recent work on LLM-assisted scientific literature analysis supports literature comparison, existing approaches offer limited transparency and lack mechanisms for result traceability via an information retrieval module. To address this gap, we introduce $\textbf{GraphMind}$, an easy-to-use interactive web tool designed to assist users in evaluating the novelty of scientific papers or drafted ideas. Specially, $\textbf{GraphMind}$ enables users to capture the main structure of a scientific paper, explore related ideas through various perspectives, and assess novelty via providing verifiable contextual insights. $\textbf{GraphMind}$ enables users to annotate key elements of a paper, explore related papers through various relationships, and assess novelty with contextual insight. This tool integrates external APIs such as arXiv and Semantic Scholar with LLMs to support annotation, extraction, retrieval and classification of papers. This combination provides users with a rich, structured view of a scientific idea's core contributions and its connections to existing work. $\textbf{GraphMind}$ is available at https://oyarsa.github.io/graphmind and a demonstration video at https://youtu.be/wKbjQpSvwJg. The source code is available at https://github.com/oyarsa/graphmind.

information, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2510.15706

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers

Neural Information Processing SystemsOct-10-2025, 18:04:23 GMT

Seeking answers to questions within long scientific research articles is a crucial area of study that aids readers in quickly addressing their inquiries.

dataset, figure and table, l3score, (16 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)
(2 more...)

Add feedback

Filters

Collaborating Authors

scientific paper

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers

d74033a247989e8f6f3bf9e0c9629fb5-Paper-Datasets_and_Benchmarks_Track.pdf

7a19a9d527ed544d1272f07b0f8f934e-Paper-Datasets_and_Benchmarks_Track.pdf

OpenAI's latest product lets you vibe code science

fastbmRAG: A Fast Graph-Based RAG Framework for Efficient Processing of Large-Scale Biomedical Literature

Discourse-Aware Scientific Paper Recommendation via QA-Style Summarization and Multi-Level Contrastive Learning

Chain of Retrieval: Multi-Aspect Iterative Search Expansion and Post-Order Search Aggregation for Full Paper Retrieval

MatPROV: A Provenance Graph Dataset of Material Synthesis Extracted from Scientific Literature

GraphMind: Interactive Novelty Assessment System for Accelerating Scientific Discovery

SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers