VANER: Leveraging Large Language Model for Versatile and Adaptive Biomedical Named Entity Recognition
Bian, Junyi, Zhai, Weiqi, Huang, Xiaodi, Zheng, Jiaxuan, Zhu, Shanfeng
The prevalent solution for BioNER involves representation learning techniques coupled with sequence labeling. However, such methods are inherently task-specific, generalize poorly, and often require a dedicated model for each dataset. To leverage the versatile capabilities of recent remarkable large language models (LLMs), several endeavors have explored generative approaches to entity extraction. Yet these approaches often fall short of the effectiveness of previous sequence-labeling approaches. In this paper, we use the open-source LLM LLaMA2 as the backbone model and design specific instructions to distinguish between different types of entities and datasets. By combining the LLM's understanding of instructions with sequence-labeling techniques, we use a mix of datasets to train a single model capable of extracting various types of entities. Given that the backbone LLM lacks specialized medical knowledge, we also integrate external entity knowledge bases and employ instruction tuning to compel the model to densely recognize carefully curated entities. Our model, VANER, trained with a small partition of parameters, significantly outperforms previous LLM-based models and, for the first time for a model based on an LLM, surpasses the majority of conventional state-of-the-art BioNER systems, achieving the highest F1 scores across three datasets.
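A rough illustration of the dataset-aware instruction format such a setup might use (the template, dataset name usage, and wording below are hypothetical, not VANER's actual prompt):

```python
def build_instruction(dataset, entity_type, sentence):
    # Instruction-tuning prompt (illustrative format): the instruction
    # names both the dataset and the entity type, so one model can serve
    # many BioNER datasets with differing annotation guidelines.
    return (
        f"### Instruction: Extract all {entity_type} entities, "
        f"following the annotation guidelines of the {dataset} dataset.\n"
        f"### Input: {sentence}\n"
        f"### Output:"
    )

prompt = build_instruction("BC5CDR", "chemical",
                           "Aspirin reduced the risk of stroke.")
print(prompt)
```

The model's completion after `### Output:` would then be decoded into sequence labels rather than free-form text.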
- Asia > China > Shanghai > Shanghai (0.05)
- Europe > Serbia > Central Serbia > Belgrade (0.04)
- Oceania > Australia > New South Wales > Goulburn County > Albury (0.04)
- North America > United States > Colorado (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.68)
Mitigating Covariate Shift in Misspecified Regression with Applications to Reinforcement Learning
Amortila, Philip, Cao, Tongyi, Krishnamurthy, Akshay
A pervasive phenomenon in machine learning applications is distribution shift, where training and deployment conditions for a machine learning model differ. As distribution shift typically results in a degradation in performance, much attention has been devoted to algorithmic interventions that mitigate these detrimental effects. In this paper, we study the effect of distribution shift in the presence of model misspecification, specifically focusing on $L_{\infty}$-misspecified regression and adversarial covariate shift, where the regression target remains fixed while the covariate distribution changes arbitrarily. We show that empirical risk minimization, or standard least squares regression, can result in undesirable misspecification amplification where the error due to misspecification is amplified by the density ratio between the training and testing distributions. As our main result, we develop a new algorithm -- inspired by robust optimization techniques -- that avoids this undesirable behavior, resulting in no misspecification amplification while still obtaining optimal statistical rates. As applications, we use this regression procedure to obtain new guarantees in offline and online reinforcement learning with misspecification and establish new separations between previously studied structural conditions and notions of coverage.
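The amplification phenomenon can be reproduced in a hypothetical two-point example (a toy construction for illustration, not the paper's setting or algorithm): a one-parameter linear class that is eps-misspecified in the L-infinity sense, a training distribution that barely covers the test point, and plain least squares whose test error ends up far larger than eps:

```python
import numpy as np

eps = 0.1      # L_inf misspecification level of the class
delta = 1e-4   # tiny training mass on the test point
t = 50.0       # feature value at the test point

# Two-point domain: x0 with feature 1, x1 with feature t.
# Target: g(x0) = eps, g(x1) = 0. The class {a * phi(x)} contains a = 0,
# which is eps-accurate everywhere, so the class is eps-misspecified.
phi = np.array([1.0, t])
g = np.array([eps, 0.0])
mu = np.array([1 - delta, delta])   # training distribution
nu = np.array([0.0, 1.0])           # test distribution: all mass on x1

# ERM under mu: weighted least squares over the single coefficient a.
a_erm = np.sum(mu * phi * g) / np.sum(mu * phi ** 2)

test_err = abs(a_erm * phi[1] - g[1])   # error at the test point
print(f"ERM coefficient: {a_erm:.4f}")
print(f"Test error: {test_err:.3f}  (misspecification eps = {eps})")
```

Here ERM chases the eps-sized residual where the training mass sits and pays for it at the rare test point, so the test error is a large multiple of eps even though an eps-accurate predictor (a = 0) is in the class.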
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- Asia > Middle East > Jordan (0.04)
- Instructional Material (0.66)
- Research Report > New Finding (0.46)
On Exploring the Reasoning Capability of Large Language Models with Knowledge Graphs
Lo, Pei-Chi, Tsai, Yi-Hang, Lim, Ee-Peng, Hwang, San-Yih
This paper examines the capacity of LLMs to reason with knowledge graphs using their internal knowledge graph, i.e., the knowledge graph they learned during pre-training. Two research questions are formulated to investigate the accuracy of LLMs in recalling information from pre-training knowledge graphs and their ability to infer knowledge graph relations from context. To address these questions, we employ LLMs to perform four distinct knowledge graph reasoning tasks. Furthermore, we identify two types of hallucinations that may occur during knowledge reasoning with LLMs: content and ontology hallucination. Our experimental results demonstrate that LLMs can successfully tackle both simple and complex knowledge graph reasoning tasks from their own memory, as well as infer from input context.
- Asia > Taiwan > Taiwan Province > Taipei (0.05)
- Asia > Singapore (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- (4 more...)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)
Validating ChatGPT Facts through RDF Knowledge Graphs and Sentence Similarity
Mountantonakis, Michalis, Tzitzikas, Yannis
Since ChatGPT offers detailed responses without justifications, and produces erroneous facts even for popular persons, events, and places, in this paper we present a novel pipeline that retrieves the response of ChatGPT in RDF and tries to validate the ChatGPT facts using one or more RDF Knowledge Graphs (KGs). To this end we leverage DBpedia and LODsyndesis (an aggregated Knowledge Graph that contains 2 billion triples from 400 RDF KGs of many domains) together with short-sentence embeddings, and introduce an algorithm that returns the most relevant triple(s) accompanied by their provenance and a confidence score. This enables the validation of ChatGPT responses and their enrichment with justifications and provenance. To evaluate this service (and such services in general), we create an evaluation benchmark of 2,000 ChatGPT facts: 1,000 facts about famous Greek persons, 500 about popular Greek places, and 500 about events related to Greece. The facts were manually labelled (approximately 73% of the ChatGPT facts were correct and 27% erroneous). The results are promising; over the whole benchmark, we managed to verify 85.3% of the correct ChatGPT facts and to find the correct answer for 58% of the erroneous ones.
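A sketch of the triple-ranking step, using a bag-of-words cosine as a stand-in for the paper's short-sentence embeddings (the fact, triples, and provenance labels below are illustrative examples, not from the paper's benchmark):

```python
from collections import Counter
import math

def cosine(a, b):
    # Bag-of-words cosine similarity between two sentences.
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def validate(fact, triples):
    # Verbalize each (subject, predicate, object, provenance) candidate
    # and return the most relevant triple with its score and provenance.
    scored = []
    for s, p, o, prov in triples:
        scored.append((cosine(fact, f"{s} {p} {o}"), (s, p, o), prov))
    return max(scored)

fact = "Nikos Kazantzakis was born in Heraklion"
kg = [
    ("Nikos Kazantzakis", "birth place", "Heraklion", "DBpedia"),
    ("Nikos Kazantzakis", "death place", "Freiburg", "DBpedia"),
    ("Odysseas Elytis", "birth place", "Heraklion", "LODsyndesis"),
]
score, triple, prov = validate(fact, kg)
```

In the actual pipeline the similarity would come from a sentence-embedding model rather than token overlap, and the score would double as the confidence reported alongside the provenance.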
- Europe > Italy > Lazio > Rome (0.04)
- Europe > Greece > Central Macedonia > Thessaloniki (0.04)
- Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Fuzzy Attention Neural Network to Tackle Discontinuity in Airway Segmentation
Nan, Yang, Del Ser, Javier, Tang, Zeyu, Tang, Peng, Xing, Xiaodan, Fang, Yingying, Herrera, Francisco, Pedrycz, Witold, Walsh, Simon, Yang, Guang
Airway segmentation is crucial for the examination, diagnosis, and prognosis of lung diseases, while its manual delineation is unduly burdensome. To alleviate this time-consuming and potentially subjective manual procedure, researchers have proposed methods to automatically segment airways from computed tomography (CT) images. However, some small-sized airway branches (e.g., bronchi and terminal bronchioles) significantly aggravate the difficulty of automatic segmentation by machine learning models. In particular, the variance of voxel values and the severe data imbalance in airway branches make the computational module prone to discontinuous and false-negative predictions, especially for cohorts with different lung diseases. Attention mechanisms have shown the capacity to segment complex structures, while fuzzy logic can reduce the uncertainty in feature representations. Therefore, integrating deep attention networks with fuzzy theory, via a fuzzy attention layer, is a promising route to better generalization and robustness. This paper presents an efficient method for airway segmentation, comprising a novel fuzzy attention neural network and a comprehensive loss function to enhance the spatial continuity of airway segmentation. The deep fuzzy set is formulated by a set of voxels in the feature map and a learnable Gaussian membership function. Different from existing attention mechanisms, the proposed channel-specific fuzzy attention addresses the issue of heterogeneous features in different channels. Furthermore, a novel evaluation metric is proposed to assess both the continuity and completeness of airway structures. The efficiency, generalization, and robustness of the proposed method are demonstrated by training on normal lungs while testing on datasets of lung cancer, COVID-19, and pulmonary fibrosis.
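A minimal sketch of what a channel-specific Gaussian membership grade could look like (a simplification under the assumption that the membership grade re-weights the feature map voxel-wise; parameter shapes and the re-weighting form are illustrative, not the paper's exact layer):

```python
import numpy as np

rng = np.random.default_rng(0)

def fuzzy_attention(x, mu, sigma):
    """Channel-specific fuzzy attention (sketch).

    x:  feature map of shape (channels, voxels)
    mu, sigma: per-channel Gaussian membership parameters, shape (channels, 1);
               learnable in a real network.
    Each voxel receives a membership grade in (0, 1] that re-weights it,
    so every channel applies its own fuzzy set to its own features.
    """
    grade = np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
    return x * grade

x = rng.standard_normal((4, 10))   # 4 channels, 10 voxels
mu = np.zeros((4, 1))
sigma = np.ones((4, 1))
out = fuzzy_attention(x, mu, sigma)
```

Because `mu` and `sigma` are per-channel, heterogeneous channels each learn their own membership function, which is the distinction from channel-agnostic attention that the abstract highlights.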
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia > Middle East > Saudi Arabia > Mecca Province > Jeddah (0.04)
- North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
- (6 more...)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.68)
- Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Dual Behavior Regularized Reinforcement Learning
Siu, Chapman, Traish, Jason, Xu, Richard Yi Da
Reinforcement learning has been shown to perform a range of complex tasks through interaction with an environment or by leveraging collected experience. However, many of these approaches presume optimal or near-optimal experience or the presence of a consistent environment. In this work we propose a dual, advantage-based behavior policy based on counterfactual regret minimization. We demonstrate the flexibility of this approach and how it can be adapted both to online contexts, where the environment is available for collecting experience, and to a variety of other settings. We demonstrate that this new algorithm can outperform several strong baseline models across a range of continuous environments. Additional ablations provide insight into how our dual behavior regularized reinforcement learning approach is designed compared with other plausible modifications, and demonstrate its ability to generalize.
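The regret-matching rule at the core of counterfactual regret minimization can be sketched as follows (this is the textbook rule; how the paper actually combines it with advantage estimates is not specified in the abstract, and the use of advantages as instantaneous regrets below is an assumption):

```python
import numpy as np

def regret_matching(regrets):
    # Policy proportional to the positive part of the cumulative regrets;
    # falls back to uniform when no action has positive regret.
    pos = np.maximum(regrets, 0.0)
    total = pos.sum()
    if total == 0.0:
        return np.full_like(regrets, 1.0 / len(regrets))
    return pos / total

# Advantage estimates A(s, a) standing in for instantaneous regrets.
advantages = np.array([0.5, -0.2, 0.1])
policy = regret_matching(advantages)
```

Actions with negative advantage get zero probability, which is what makes the resulting behavior policy conservative about clearly dominated actions.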
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- (2 more...)
Knowledge Graph Question Answering via SPARQL Silhouette Generation
Purkayastha, Sukannya, Dana, Saswati, Garg, Dinesh, Khandelwal, Dinesh, Bhargav, G P Shrivatsa
Knowledge Graph Question Answering (KGQA) has become a prominent area in natural language processing due to the emergence of large-scale Knowledge Graphs (KGs). Recently, Neural Machine Translation based approaches that translate natural language queries into structured query languages have been gaining momentum for the KGQA task. However, most of these methods struggle with out-of-vocabulary words, where test entities and relations are not seen during training. In this work, we propose a modular two-stage neural architecture for the KGQA task. The first stage generates a sketch of the target SPARQL, called a SPARQL silhouette, for the input question. It comprises (1) a noise simulator to handle out-of-vocabulary words and reduce vocabulary size, and (2) a seq2seq model for text-to-SPARQL-silhouette generation. The second stage is a Neural Graph Search Module: the SPARQL silhouette generated in the first stage is distilled by substituting the precise relations into the predicted structure. We simulate ideal and realistic scenarios by designing the noise simulator accordingly. Experimental results show that the quality of the generated SPARQL silhouette is outstanding in the ideal scenario, but in realistic scenarios (i.e., with a noisy linker) it drops drastically; our neural graph search module, however, recovers it considerably. Our method improves the state of the art by a margin of 3.72% F1 on the LC-QuAD-1 dataset. We believe our proposed approach is novel and will lead to dynamic KGQA solutions suited for practical applications.
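The vocabulary-reduction idea behind masking entity mentions can be sketched as follows (the placeholder scheme and function name are illustrative, not the paper's exact format):

```python
def make_silhouette(question, entity_spans):
    # Replace linked entity mentions with numbered placeholders, so the
    # seq2seq model never has to emit raw KB identifiers and the output
    # vocabulary stays small even for unseen entities.
    masked = question
    for i, span in enumerate(entity_spans):
        masked = masked.replace(span, f"<ent_{i}>")
    return masked

q = "Who is the mayor of Berlin?"
sil = make_silhouette(q, ["Berlin"])
# sil == "Who is the mayor of <ent_0>?"
```

A noisy linker would supply imperfect `entity_spans`, which is exactly the realistic scenario whose degradation the second-stage graph search must recover.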
- Europe > United Kingdom (0.04)
- Antarctica > French Southern and Antarctic Lands (0.04)
- Africa > French Southern and Antarctic Lands (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.88)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.81)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Question Answering over Knowledge Bases by Leveraging Semantic Parsing and Neuro-Symbolic Reasoning
Kapanipathi, Pavan, Abdelaziz, Ibrahim, Ravishankar, Srinivas, Roukos, Salim, Gray, Alexander, Astudillo, Ramon, Chang, Maria, Cornelio, Cristina, Dana, Saswati, Fokoue, Achille, Garg, Dinesh, Gliozzo, Alfio, Gurajada, Sairam, Karanam, Hima, Khan, Naweed, Khandelwal, Dinesh, Lee, Young-Suk, Li, Yunyao, Luus, Francois, Makondo, Ndivhuwo, Mihindukulasooriya, Nandana, Naseem, Tahira, Neelam, Sumit, Popa, Lucian, Reddy, Revanth, Riegel, Ryan, Rossiello, Gaetano, Sharma, Udit, Bhargav, G P Shrivatsa, Yu, Mo
Knowledge base question answering (KBQA) is an important task in Natural Language Processing. Existing approaches face significant challenges, including complex question understanding, the necessity for reasoning, and the lack of large training datasets. In this work, we propose a semantic parsing and reasoning-based Neuro-Symbolic Question Answering (NSQA) system that leverages (1) Abstract Meaning Representation (AMR) parses for task-independent question understanding; (2) a novel path-based approach to transform AMR parses into candidate logical queries that are aligned to the KB; (3) a neuro-symbolic reasoner called Logical Neural Network (LNN) that executes logical queries and reasons over KB facts to provide an answer; and (4) a system-of-systems approach, which integrates multiple, reusable modules that are trained specifically for their individual tasks (e.g. semantic parsing, entity linking, and relationship linking) and do not require end-to-end training data. NSQA achieves state-of-the-art performance on QALD-9 and LC-QuAD 1.0. NSQA's novelty lies in its modular neuro-symbolic architecture and its task-general approach to interpreting natural language questions.
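The system-of-systems design can be sketched as a plain composition of independently trained modules (the function names and signatures below are hypothetical stand-ins, not the actual NSQA APIs):

```python
def nsqa_pipeline(question, parse_amr, amr_to_logic, reason):
    # Each stage is a separately trained, reusable module passed in as a
    # function, so no end-to-end training data is needed to wire them up.
    amr = parse_amr(question)              # task-independent understanding
    candidate_queries = amr_to_logic(amr)  # path-based AMR -> KB-aligned logic
    return reason(candidate_queries)       # LNN-style reasoning over KB facts

# Wiring with trivial stand-in modules, just to show the composition:
answer = nsqa_pipeline(
    "Who wrote Hamlet?",
    parse_amr=lambda q: ("amr-graph", q),
    amr_to_logic=lambda amr: [("query", amr)],
    reason=lambda queries: queries[0],
)
```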
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada (0.06)
- Europe > Spain > Galicia > Madrid (0.05)
- (4 more...)
Uncovering the Semantics of Wikipedia Categories
Heist, Nicolas, Paulheim, Heiko
Two of the most prominent public knowledge graphs, DBpedia [16] and YAGO [18], build rich taxonomies using Wikipedia's infoboxes and category graph, respectively. They describe more than five million entities and contain multiple hundred millions of triples [27]. When it comes to relation assertions (RAs), however, we observe - even for basic properties - a rather low coverage: more than 50% of the 1.35 million persons in DBpedia have no birthplace assigned; even more than 80% of birthplaces are missing in YAGO. At the same time, type assertions (TAs) are likewise missing for many instances - for example, there are about half a million persons in DBpedia not explicitly typed as such [23]. Missing knowledge in Wikipedia-based knowledge graphs can be attributed to absent information in Wikipedia, but also to the extraction procedures of the knowledge graphs. DBpedia uses infobox mappings to extract RAs for individual instances, but it does not explicate any information implicitly encoded in categories. YAGO uses manually defined patterns to assign RAs to entities of matching categories. For example, they extract a person's year of birth by
- Asia > Middle East > Iran > East Azerbaijan Province > Tabriz (0.04)
- Asia > India (0.04)
- North America > United States > Pennsylvania > Chester County (0.04)
- (5 more...)
- Leisure & Entertainment (1.00)
- Media > Music (0.47)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.48)
Seq2RDF: An end-to-end application for deriving Triples from Natural Language Text
Liu, Yue, Zhang, Tongtao, Liang, Zhicheng, Ji, Heng, McGuinness, Deborah L.
We present an end-to-end approach that takes unstructured textual input and generates structured output compliant with a given vocabulary. Inspired by recent successes in neural machine translation, we treat the triples within a given knowledge graph as an independent graph language and propose an encoder-decoder framework with an attention mechanism that leverages knowledge graph embeddings. Our model learns the mapping from natural language text to triple representation in the form of subject-predicate-object using the selected knowledge graph vocabulary. Experiments on three different data sets show that we achieve competitive F1-Measures over the baselines using our simple yet effective approach. A demo video is included.
- Europe > Germany (0.05)
- Oceania > Australia > New South Wales (0.05)
- North America > United States > New York (0.05)
- North America > United States > California > Santa Clara County > Palo Alto (0.05)