entity and relationship
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Software Engineering (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Structured Information Matters: Explainable ICD Coding with Patient-Level Knowledge Graphs
Li, Mingyang, Schlegel, Viktor, Mu, Tingting, Del-Pinto, Warren, Nenadic, Goran
Mapping clinical documents to standardised clinical vocabularies is an important task, as it provides structured data for information retrieval and analysis, which is essential to clinical research, hospital administration and improving patient care. However, manual coding is both difficult and time-consuming, making it impractical at scale. Automated coding can potentially alleviate this burden, improving the availability and accuracy of structured clinical data. The task is difficult to automate, as it requires mapping to high-dimensional and long-tailed target spaces, such as the International Classification of Diseases (ICD). While external knowledge sources have been readily utilised to enhance output code representation, the use of external resources for representing the input documents has been underexplored. In this work, we compute a structured representation of the input documents, making use of document-level knowledge graphs (KGs) that provide a comprehensive structured view of a patient's condition. The resulting knowledge graph efficiently represents the patient-centred input documents with 23\% of the original text while retaining 90\% of the information. We assess the effectiveness of this graph for automated ICD-9 coding by integrating it into the state-of-the-art ICD coding architecture PLM-ICD. Our experiments yield improved Macro-F1 scores by up to 3.20\% on popular benchmarks, while improving training efficiency. We attribute this improvement to different types of entities and relationships in the KG, and demonstrate the improved explainability potential of the approach over the text-only baseline.
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > Greece (0.04)
- Asia > Middle East > Israel (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.86)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.85)
Exposing Privacy Risks in Graph Retrieval-Augmented Generation
Liu, Jiale, Zhang, Jiahao, Wang, Suhang
Retrieval-Augmented Generation (RAG) is a powerful technique for enhancing Large Language Models (LLMs) with external, up-to-date knowledge. Graph RAG has emerged as an advanced paradigm that leverages graph-based knowledge structures to provide more coherent and contextually rich answers. However, the move from plain document retrieval to structured graph traversal introduces new, under-explored privacy risks. This paper investigates the data extraction vulnerabilities of the Graph RAG systems. We design and execute tailored data extraction attacks to probe their susceptibility to leaking both raw text and structured data, such as entities and their relationships. Our findings reveal a critical trade-off: while Graph RAG systems may reduce raw text leakage, they are significantly more vulnerable to the extraction of structured entity and relationship information. We also explore potential defense mechanisms to mitigate these novel attack surfaces. This work provides a foundational analysis of the unique privacy challenges in Graph RAG and offers insights for building more secure systems.
- North America > United States > California (0.14)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- North America > United States > Pennsylvania > Centre County > University Park (0.04)
- (3 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Government (1.00)
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Software Engineering (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
mKG-RAG: Multimodal Knowledge Graph-Enhanced RAG for Visual Question Answering
Yuan, Xu, Ning, Liangbo, Fan, Wenqi, Li, Qing
Recently, Retrieval-Augmented Generation (RAG) has been proposed to expand internal knowledge of Multimodal Large Language Models (MLLMs) by incorporating external knowledge databases into the generation process, which is widely used for knowledge-based Visual Question Answering (VQA) tasks. Despite impressive advancements, vanilla RAG-based VQA methods that rely on unstructured documents and overlook the structural relationships among knowledge elements frequently introduce irrelevant or misleading content, reducing answer accuracy and reliability. To overcome these challenges, a promising solution is to integrate multimodal knowledge graphs (KGs) into RAG-based VQA frameworks to enhance the generation by introducing structured multimodal knowledge. Therefore, in this paper, we propose a novel multimodal knowledge-augmented generation framework (mKG-RAG) based on multimodal KGs for knowledge-intensive VQA tasks. Specifically, our approach leverages MLLM-powered keyword extraction and vision-text matching to distill semantically consistent and modality-aligned entities/relationships from multimodal documents, constructing high-quality multimodal KGs as structured knowledge representations. In addition, a dual-stage retrieval strategy equipped with a question-aware multimodal retriever is introduced to improve retrieval efficiency while refining precision. Comprehensive experiments demonstrate that our approach significantly outperforms existing methods, setting a new state-of-the-art for knowledge-based VQA.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (2 more...)
Knowledge Editing for Multi-Hop Question Answering Using Semantic Analysis
Simon, Dominic, Ewetz, Rickard
Large Language Models (LLMs) require lightweight avenues of updating stored information that has fallen out of date. Knowledge Editing (KE) approaches have been successful in updating model knowledge for simple factual queries but struggle with handling tasks that require compositional reasoning such as multi-hop question answering (MQA). We observe that existing knowledge editors leverage decompositional techniques that result in illogical reasoning processes. In this paper, we propose a knowledge editor for MQA based on semantic analysis called CHECK. Our framework is based on insights from an analogy between compilers and reasoning using LLMs. Similar to how source code is first compiled before being executed, we propose to semantically analyze reasoning chains before executing the chains to answer questions. Reasoning chains with semantic errors are revised to ensure consistency through logic optimization and re-prompting the LLM model at a higher temperature. We evaluate the effectiveness of CHECK against five state-of-the-art frameworks on four datasets and achieve an average 22.8% improved MQA accuracy.
- North America > United States (0.46)
- Asia > Japan (0.05)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (7 more...)
Extracting ORR Catalyst Information for Fuel Cell from Scientific Literature
Htet, Hein, Ibrahim, Amgad Ahmed Ali, Sasaki, Yutaka, Asahi, Ryoji
The oxygen reduction reaction (ORR) catalyst plays a critical role in enhancing fuel cell efficiency, making it a key focus in material science research. However, extracting structured information about ORR catalysts from vast scientific literature remains a significant challenge due to the complexity and diversity of textual data. In this study, we propose a named entity recognition (NER) and relation extraction (RE) approach using DyGIE++ with multiple pre-trained BERT variants, including MatSciBERT and PubMedBERT, to extract ORR catalyst-related information from the scientific literature, which is compiled into a fuel cell corpus for materials informatics (FC-CoMIcs). A comprehensive dataset was constructed manually by identifying 12 critical entities and two relationship types between pairs of the entities. Our methodology involves data annotation, integration, and fine-tuning of transformer-based models to enhance information extraction accuracy. We assess the impact of different BERT variants on extraction performance and investigate the effects of annotation consistency. Experimental evaluations demonstrate that the fine-tuned PubMedBERT model achieves the highest NER F1-score of 82.19% and the MatSciBERT model attains the best RE F1-score of 66.10%. Furthermore, the comparison with human annotators highlights the reliability of fine-tuned models for ORR catalyst extraction, demonstrating their potential for scalable and automated literature analysis. The results indicate that domain-specific BERT models outperform general scientific models like BlueBERT for ORR catalyst extraction.
- Asia > Japan (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States (0.04)
- Europe > Netherlands > South Holland > Dordrecht (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.91)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.87)
XGraphRAG: Interactive Visual Analysis for Graph-based Retrieval-Augmented Generation
Wang, Ke, Pan, Bo, Feng, Yingchaojie, Wu, Yuwei, Chen, Jieyi, Zhu, Minfeng, Chen, Wei
Graph-based Retrieval-Augmented Generation (RAG) has shown great capability in enhancing Large Language Model (LLM)'s answer with an external knowledge base. Compared to traditional RAG, it introduces a graph as an intermediate representation to capture better structured relational knowledge in the corpus, elevating the precision and comprehensiveness of generation results. However, developers usually face challenges in analyzing the effectiveness of GraphRAG on their dataset due to GraphRAG's complex information processing pipeline and the overwhelming amount of LLM invocations involved during graph construction and query, which limits GraphRAG interpretability and accessibility. This research proposes a visual analysis framework that helps RAG developers identify critical recalls of GraphRAG and trace these recalls through the GraphRAG pipeline. Based on this framework, we develop XGraphRAG, a prototype system incorporating a set of interactive visualizations to facilitate users' analysis process, boosting failure cases collection and improvement opportunities identification. Our evaluation demonstrates the effectiveness and usability of our approach. Our work is open-sourced and available at https://github.com/Gk0Wk/XGraphRAG.
- Asia > Middle East > Israel (0.05)
- Asia > China (0.04)
- North America > United States > New York > Kings County > New York City (0.04)
- (2 more...)
- Questionnaire & Opinion Survey (1.00)
- Workflow (0.68)
- Research Report (0.64)
D4R -- Exploring and Querying Relational Graphs Using Natural Language and Large Language Models -- the Case of Historical Documents
Boeglin, Michel, Kahn, David, Mothe, Josiane, Ortiz, Diego, Panzoli, David
D4R is a digital platform designed to assist non-technical users, particularly historians, in exploring textual documents through advanced graphical tools for text analysis and knowledge extraction. By leveraging a large language model, D4R translates natural language questions into Cypher queries, enabling the retrieval of data from a Neo4J database. A user-friendly graphical interface allows for intuitive interaction, enabling users to navigate and analyse complex relational data extracted from unstructured textual documents. Originally designed to bridge the gap between AI technologies and historical research, D4R's capabilities extend to various other domains. A demonstration video and a live software demo are available.
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > Spain (0.04)
- (2 more...)