Mihindukulasooriya, Nandana
Scholarly Wikidata: Population and Exploration of Conference Data in Wikidata using LLMs
Mihindukulasooriya, Nandana, Tiwari, Sanju, Dobriy, Daniil, Nielsen, Finn Årup, Chhetri, Tek Raj, Polleres, Axel
Several initiatives have been undertaken to conceptually model the domain of scholarly data using ontologies and to create respective Knowledge Graphs. Yet, their full potential remains untapped: automated means for populating these ontologies are lacking, and the respective initiatives from the Semantic Web community are not necessarily connected. We propose to make scholarly data more sustainably accessible by leveraging Wikidata's infrastructure and by automating its population through LLMs that tap into unstructured sources such as conference websites and proceedings texts, as well as already existing structured conference datasets. While an initial analysis shows that Semantic Web conferences are only minimally represented in Wikidata, we argue that our methodology can help the community populate, evolve, and maintain scholarly data within Wikidata. Our main contributions include (a) an analysis of ontologies for representing scholarly data to identify gaps and relevant entities/properties in Wikidata, and (b) semi-automated extraction -- requiring (minimal) manual validation -- of conference metadata (e.g., acceptance rates, organizer roles, programme committee members, best paper awards, keynotes, and sponsors) from websites and proceedings texts using LLMs. Finally, we discuss (c) extensions to visualization tools in the Wikidata context for exploring the generated scholarly data. Our study covers data from 105 Semantic Web-related conferences and extends or adds more than 6,000 entities in Wikidata. Notably, the method is applicable well beyond Semantic Web-related conferences, enhancing Wikidata's utility as a comprehensive scholarly resource. Source Repository: https://github.com/scholarly-wikidata/ DOI: https://doi.org/10.5281/zenodo.10989709 License: Creative Commons CC0 (Data), MIT (Code)
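The exploration step can start with a plain SPARQL query against the public Wikidata endpoint. The following is a minimal sketch, not the paper's pipeline; the item ID Q2020153 ("academic conference") and property P31 ("instance of") are my assumptions and should be verified on wikidata.org.

```python
# Minimal sketch: list conferences currently modeled in Wikidata.
# Assumes P31 = "instance of" and Q2020153 = "academic conference"
# (verify both IDs on wikidata.org before relying on them).
import requests

ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?conf ?confLabel WHERE {
  ?conf wdt:P31 wd:Q2020153 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 50
"""

resp = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "scholarly-wikidata-demo/0.1"},  # endpoint requires a UA
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["conf"]["value"], "-", row["confLabel"]["value"])
```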
Research Trends for the Interplay between Large Language Models and Knowledge Graphs
Khorashadizadeh, Hanieh, Amara, Fatima Zahra, Ezzabady, Morteza, Ieng, Frédéric, Tiwari, Sanju, Mihindukulasooriya, Nandana, Groppe, Jinghua, Sahri, Soror, Benamara, Farah, Groppe, Sven
This survey investigates the synergistic relationship between Large Language Models (LLMs) and Knowledge Graphs (KGs), which is crucial for advancing AI's capabilities in understanding, reasoning, and language processing. It aims to address gaps in current research by exploring areas such as KG Question Answering, ontology generation, KG validation, and the enhancement of KG accuracy and consistency through LLMs. The paper further examines the roles of LLMs in generating descriptive texts and natural language queries for KGs. Through a structured analysis that includes categorizing LLM-KG interactions, examining methodologies, and investigating collaborative uses and potential biases, this study seeks to provide new insights into the combined potential of LLMs and KGs. It highlights the importance of their interaction for improving AI applications and outlines future research directions.
Matching Table Metadata with Business Glossaries Using Large Language Models
Lobo, Elita, Hassanzadeh, Oktie, Pham, Nhan, Mihindukulasooriya, Nandana, Subramanian, Dharmashankar, Samulowitz, Horst
Enterprises often own large collections of structured data in the form of large databases or an enterprise data lake. Such data collections come with limited metadata and strict access policies that can limit access to the data contents and, therefore, limit the application of classic retrieval and analysis solutions. As a result, there is a need for solutions that can effectively utilize the available metadata. In this paper, we study the problem of matching table metadata to a business glossary containing data labels and descriptions. The resulting matching enables the use of an available or curated business glossary for retrieval and analysis without, or before, requesting access to the data contents. One solution to this problem is to use manually defined rules or similarity measures on column names and glossary descriptions (or their vector embeddings) to find the closest match. However, such approaches need to be tuned through manual labeling and cannot handle the many business glossaries that mix simple descriptions with complex, lengthy ones. In this work, we leverage the power of large language models (LLMs) to design generic matching methods that require no manual tuning and can identify complex relations between column names and glossaries. We propose methods that utilize LLMs in two ways: (a) by generating additional context for column names that can aid with matching, and (b) by using LLMs to directly infer whether a relation exists between column names and glossary descriptions. Our preliminary experimental results show the effectiveness of the proposed methods.
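A hedged sketch of the two LLM uses described above; all function names are illustrative, not the authors' code, and the LLM call is left as a placeholder to be swapped for any chat/completions client.

```python
# (a) Enrich a bare column name with LLM-generated context.
def context_prompt(column_name: str) -> str:
    return (f"Describe, in one sentence, the kind of data a database "
            f"column named '{column_name}' typically contains.")

# (b) Ask the LLM directly whether a glossary entry describes the column.
def match_prompt(column_name: str, context: str, glossary_entry: str) -> str:
    return (f"Column: {column_name}\n"
            f"Column context: {context}\n"
            f"Glossary entry: {glossary_entry}\n"
            f"Does the glossary entry describe this column? Answer Yes or No.")

def call_llm(prompt: str) -> str:
    # Placeholder: substitute any LLM client (local or hosted) here.
    raise NotImplementedError

def matches(column_name: str, glossary_entry: str) -> bool:
    context = call_llm(context_prompt(column_name))          # method (a)
    answer = call_llm(match_prompt(column_name, context,
                                   glossary_entry))          # method (b)
    return answer.strip().lower().startswith("yes")
```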
Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text
Mihindukulasooriya, Nandana, Tiwari, Sanju, Enguix, Carlos F., Lata, Kusum
The recent advances in large language models (LLMs) and foundation models with emergent capabilities have been shown to improve the performance of many NLP tasks. LLMs and Knowledge Graphs (KGs) can complement each other: LLMs can be used for KG construction or completion, while existing KGs can be used for tasks such as making LLM outputs explainable or fact-checking them in a neuro-symbolic manner. In this paper, we present Text2KGBench, a benchmark to evaluate the capabilities of language models to generate KGs from natural language text guided by an ontology. Given an input ontology and a set of sentences, the task is to extract facts from the text while complying with the given ontology (concepts, relations, domain/range constraints) and remaining faithful to the input sentences. We provide two datasets: (i) Wikidata-TekGen, with 10 ontologies and 13,474 sentences, and (ii) DBpedia-WebNLG, with 19 ontologies and 4,860 sentences. We define seven evaluation metrics to measure fact-extraction performance, ontology conformance, and hallucinations by LLMs. Furthermore, we provide results for two baseline models, Vicuna-13B and Alpaca-LoRA-13B, using automatic prompt generation from test cases. The baseline results show that there is room for improvement using both Semantic Web and Natural Language Processing techniques.
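An illustrative sketch of the task setup and one conformance-style check; this is a toy reconstruction, not the official benchmark code, and the ontology and triples below are invented for exposition.

```python
# Toy ontology: the set of relations the extractor is allowed to use.
ONTOLOGY_RELATIONS = {"author", "publicationDate", "genre"}

def build_prompt(relations: set[str], sentence: str) -> str:
    # Prompt the LLM with the ontology constraint plus the input sentence.
    return (f"Extract (subject, relation, object) triples from the sentence.\n"
            f"Use only these relations: {', '.join(sorted(relations))}.\n"
            f"Sentence: {sentence}")

def ontology_conformance(triples: list[tuple[str, str, str]],
                         relations: set[str]) -> float:
    # Share of extracted triples whose relation is defined in the ontology.
    if not triples:
        return 0.0
    return sum(1 for _, rel, _ in triples if rel in relations) / len(triples)

extracted = [("Dracula", "author", "Bram Stoker"),
             ("Dracula", "publishedBy", "Archibald Constable")]
print(ontology_conformance(extracted, ONTOLOGY_RELATIONS))  # 0.5
```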
Finspector: A Human-Centered Visual Inspection Tool for Exploring and Comparing Biases among Foundation Models
Kwon, Bum Chul, Mihindukulasooriya, Nandana
Pre-trained transformer-based language models are becoming increasingly popular due to their exceptional performance on various benchmarks. However, concerns persist regarding the presence of hidden biases within these models, which can lead to discriminatory outcomes and reinforce harmful stereotypes. To address this issue, we propose Finspector, a human-centered visual inspection tool designed to detect biases in different categories through log-likelihood scores generated by language models. The goal of the tool is to enable researchers to easily identify potential biases using visual analytics, ultimately contributing to a fairer and more just deployment of these models in both academic and industrial settings. Finspector is available at https://github.com/IBM/finspector.
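The scores Finspector visualizes can be reproduced with a pseudo-log-likelihood computation over a masked language model: mask each token in turn and sum the log-probability the model assigns to the true token. A minimal sketch, assuming a BERT-style model; comparing paired sentences can surface biased preferences.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pseudo_log_likelihood(sentence: str) -> float:
    ids = tok(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):          # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tok.mask_token_id         # mask one token at a time
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

# A large score gap between such pairs hints at a stereotyped preference.
print(pseudo_log_likelihood("He is a doctor."))
print(pseudo_log_likelihood("She is a doctor."))
```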
Exploring In-Context Learning Capabilities of Foundation Models for Generating Knowledge Graphs from Text
Khorashadizadeh, Hanieh, Mihindukulasooriya, Nandana, Tiwari, Sanju, Groppe, Jinghua, Groppe, Sven
Knowledge graphs represent information about the real world using entities and their relations in a structured and semantically rich manner, and they enable a variety of downstream applications such as question answering, recommendation systems, semantic search, and advanced analytics. However, building a knowledge graph currently involves substantial manual effort, which hinders adoption in some settings; automating the process would especially benefit small organizations. Automatically generating structured knowledge graphs from large volumes of natural language remains a challenging task, and research on sub-tasks such as named entity extraction, relation extraction, entity and relation linking, and knowledge graph construction aims to advance the state of the art in automatic construction and completion of knowledge graphs from text. Recently, foundation models with billions of parameters, trained in a self-supervised manner on large volumes of data and adaptable to a variety of downstream tasks, have demonstrated high performance on a wide range of Natural Language Processing (NLP) tasks. In this context, one emerging paradigm is in-context learning, where a language model is used as-is with a prompt that provides instructions and a few examples to perform a task, without changing the model's parameters through traditional approaches such as fine-tuning. This way, no computing resources are needed for re-training or fine-tuning the models, and the engineering effort is minimal. Thus, it would be beneficial to utilize such capabilities for generating knowledge graphs from text.
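A sketch of what such an in-context-learning prompt can look like; the wording and the worked example are illustrative, not the paper's exact prompt. The model's parameters stay frozen, so the only engineering is prompt construction and parsing the completion.

```python
# Few-shot prompt: instructions + one worked example + the new input.
FEW_SHOT_PROMPT = """Extract (subject, relation, object) triples.

Sentence: Marie Curie was born in Warsaw.
Triples: (Marie Curie, birthPlace, Warsaw)

Sentence: {sentence}
Triples:"""

def build_prompt(sentence: str) -> str:
    return FEW_SHOT_PROMPT.format(sentence=sentence)

print(build_prompt("Alan Turing was born in London."))
# Send the prompt to any foundation model; parse the completion into triples.
```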
Learning to Transpile AMR into SPARQL
Bornea, Mihaela, Astudillo, Ramon Fernandez, Naseem, Tahira, Mihindukulasooriya, Nandana, Abdelaziz, Ibrahim, Kapanipathi, Pavan, Florian, Radu, Roukos, Salim
We propose a transition-based system that transpiles Abstract Meaning Representation (AMR) into SPARQL for Knowledge Base Question Answering (KBQA). This allows us to delegate part of the semantic representation to a strongly pre-trained semantic parser while learning the transpiling step from a small amount of paired data. We build on recent work relating AMR and SPARQL constructs, but rather than applying a set of rules, we teach a BART model to selectively use these relations. Further, we avoid explicitly encoding the AMR and instead encode the parser state in the attention mechanism of BART, following recent semantic parsing work. The resulting model is simple, provides supporting text for its decisions, and outperforms recent approaches to KBQA across two knowledge bases: DBpedia (LC-QuAD 1.0, QALD-9) and Wikidata (WebQSP, SWQ-WD).
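To make the task concrete, here is a hand-written, simplified AMR/SPARQL pair of the kind the transpiler learns to relate; the AMR is abbreviated and the DBpedia property choice is illustrative, neither is taken from the paper.

```python
QUESTION = "Who is the mayor of Paris?"

# Simplified AMR: the amr-unknown concept marks what the question asks for.
AMR = """(m / mayor
   :location (c / city :name (n / name :op1 "Paris"))
   :domain (a / amr-unknown))"""

# AMR concepts and roles map onto SPARQL graph patterns; the variable
# corresponding to amr-unknown becomes the SELECT target.
SPARQL = """SELECT ?mayor WHERE {
  dbr:Paris dbo:mayor ?mayor .
}"""
```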
KnowGL: Knowledge Generation and Linking from Text
Rossiello, Gaetano, Chowdhury, Md Faisal Mahbub, Mihindukulasooriya, Nandana, Cornec, Owen, Gliozzo, Alfio Massimiliano
We propose KnowGL, a tool that converts text into structured relational data represented as a set of ABox assertions compliant with the TBox of a given Knowledge Graph (KG), such as Wikidata. We address this problem as a sequence generation task by leveraging pre-trained sequence-to-sequence language models, e.g., BART. Given a sentence, we fine-tune such models to detect pairs of entity mentions and jointly generate a set of facts consisting of the full set of semantic annotations for a KG, such as entity labels, entity types, and their relationships. To showcase the capabilities of our tool, we build a web application consisting of a set of UI widgets that help users navigate the semantic data extracted from a given input text. We make the KnowGL model available at https://huggingface.co/ibm/knowgl-large.
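Since the model is released on the Hugging Face Hub, it can be exercised with standard transformers calls; the sentence below and the generation settings are my own choices, and the exact linearized output format is documented on the model card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("ibm/knowgl-large")
model = AutoModelForSeq2SeqLM.from_pretrained("ibm/knowgl-large")

text = "Leonardo da Vinci painted the Mona Lisa."
inputs = tok(text, return_tensors="pt")
out = model.generate(**inputs, max_length=256, num_beams=4)

# The decoded sequence linearizes facts (mentions, labels, types, relations)
# that can be parsed into ABox assertions aligned with Wikidata.
print(tok.decode(out[0], skip_special_tokens=True))
```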
A Benchmark for Generalizable and Interpretable Temporal Question Answering over Knowledge Bases
Neelam, Sumit, Sharma, Udit, Karanam, Hima, Ikbal, Shajith, Kapanipathi, Pavan, Abdelaziz, Ibrahim, Mihindukulasooriya, Nandana, Lee, Young-Suk, Srivastava, Santosh, Pendus, Cezar, Dana, Saswati, Garg, Dinesh, Fokoue, Achille, Bhargav, G P Shrivatsa, Khandelwal, Dinesh, Ravishankar, Srinivas, Gurajada, Sairam, Chang, Maria, Uceda-Sosa, Rosario, Roukos, Salim, Gray, Alexander, Lima, Guilherme, Riegel, Ryan, Luus, Francois, Subramaniam, L Venkata
Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most existing KBQA datasets focus primarily on generic multi-hop reasoning over explicit facts, largely ignoring other reasoning types such as temporal, spatial, and taxonomic reasoning. In this paper, we present TempQA-WD, a benchmark dataset for temporal reasoning, to encourage research on extending present approaches to a more challenging set of complex reasoning tasks. Specifically, our benchmark is a temporal question answering dataset with the following advantages: (a) it is based on Wikidata, the most frequently curated, openly available knowledge base; (b) it includes intermediate SPARQL queries to facilitate the evaluation of semantic-parsing-based approaches to KBQA; and (c) it generalizes to multiple knowledge bases: Freebase and Wikidata. The TempQA-WD dataset is available at https://github.com/IBM/tempqa-wd.
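An illustrative temporal question paired with a Wikidata-style SPARQL query of the kind the benchmark attaches to each question; this example is hand-written, not taken from the dataset, and the IDs (Q11696 = President of the United States, P39 = position held, P580/P582 = start/end time) should be verified on wikidata.org.

```python
QUESTION = "Who was the President of the United States in 1963?"

# Temporal reasoning happens in the FILTER over the statement qualifiers.
SPARQL = """
SELECT ?person WHERE {
  ?person p:P39 ?stmt .
  ?stmt ps:P39 wd:Q11696 ;
        pq:P580 ?start .
  OPTIONAL { ?stmt pq:P582 ?end . }
  FILTER(YEAR(?start) <= 1963 && (!BOUND(?end) || YEAR(?end) >= 1963))
}
"""
```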
Applying a Generic Sequence-to-Sequence Model for Simple and Effective Keyphrase Generation
Chowdhury, Md Faisal Mahbub, Rossiello, Gaetano, Glass, Michael, Mihindukulasooriya, Nandana, Gliozzo, Alfio
In recent years, a number of keyphrase generation (KPG) approaches have been proposed, consisting of complex model architectures, dedicated training paradigms, and decoding strategies. In this work, we opt for simplicity and show how a commonly used seq2seq language model, BART, can be easily adapted to generate keyphrases from text in a single batch computation using a simple training procedure. Empirical results on five benchmarks show that our approach matches the existing state-of-the-art KPG systems while using a much simpler, easier-to-deploy framework.
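A hedged sketch of the one-seq2seq-model recipe (not the authors' code): train BART to emit all keyphrases as a single delimiter-separated sequence, so one generate() call per document yields the full keyphrase set. The delimiter and base checkpoint are my assumptions.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

doc = "We study sequence-to-sequence models for keyphrase generation ..."
target = "keyphrase generation ; sequence-to-sequence models ; BART"  # ';'-joined

# Standard seq2seq fine-tuning step on (document, joined-keyphrases) pairs.
batch = tok(doc, return_tensors="pt", truncation=True)
labels = tok(text_target=target, return_tensors="pt").input_ids
loss = model(**batch, labels=labels).loss
loss.backward()  # plug into any optimizer/training loop

# At inference time, split the generated string on the delimiter.
out = model.generate(**batch, num_beams=4, max_length=64)
keyphrases = [k.strip()
              for k in tok.decode(out[0], skip_special_tokens=True).split(";")]
print(keyphrases)
```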