Goto

Collaborating Authors

 Overview


ChatGPT is not Enough: Enhancing Large Language Models with Knowledge Graphs for Fact-aware Language Modeling

arXiv.org Artificial Intelligence

Recently, ChatGPT, a representative large language model (LLM), has gained considerable attention due to its powerful emergent abilities. Some researchers suggest that LLMs could potentially replace structured knowledge bases like knowledge graphs (KGs) and function as parameterized knowledge bases. However, while LLMs are proficient at learning probabilistic language patterns based on large corpus and engaging in conversations with humans, they, like previous smaller pre-trained language models (PLMs), still have difficulty in recalling facts while generating knowledge-grounded contents. To overcome these limitations, researchers have proposed enhancing data-driven PLMs with knowledge-based KGs to incorporate explicit factual knowledge into PLMs, thus improving their performance to generate texts requiring factual knowledge and providing more informed responses to user queries. This paper reviews the studies on enhancing PLMs with KGs, detailing existing knowledge graph enhanced pre-trained language models (KGPLMs) as well as their applications. Inspired by existing studies on KGPLM, this paper proposes to enhance LLMs with KGs by developing knowledge graph-enhanced large language models (KGLLMs). KGLLM provides a solution to enhance LLMs' factual reasoning ability, opening up new avenues for LLM research.


Why can neural language models solve next-word prediction? A mathematical perspective

arXiv.org Artificial Intelligence

Recently, deep learning has revolutionized the field of natural language processing, with neural language models proving to be very effective for next-word prediction. However, a rigorous theoretical explanation for their success in the context of formal language theory has not yet been developed, as it is unclear why neural language models can learn the combinatorial rules that govern the next-word prediction task. In this paper, we study a class of formal languages that can be used to model real-world examples of English sentences. We construct neural language models can solve the next-word prediction task in this context with zero error. Our proof highlights the different roles of the embedding layer and the fully connected component within the neural language model.


Balanced Mixture of SuperNets for Learning the CNN Pooling Architecture

arXiv.org Artificial Intelligence

Downsampling layers, including pooling and strided convolutions, are crucial components of the convolutional neural network architecture that determine both the granularity/scale of image feature analysis as well as the receptive field size of a given layer. To fully understand this problem, we analyse the performance of models independently trained with each pooling configurations on CIFAR10, using a ResNet20 network, and show that the position of the downsampling layers can highly influence the performance of a network and predefined downsampling configurations are not optimal. Network Architecture Search (NAS) might be used to optimize downsampling configurations as an hyperparameter. However, we find that common one-shot NAS based on a single SuperNet does not work for this problem. We argue that this is because a SuperNet trained for finding the optimal pooling configuration fully shares its parameters among all pooling configurations. This makes its training hard, because learning some configurations can harm the performance of others. Therefore, we propose a balanced mixture of SuperNets that automatically associates pooling configurations to different weight models and helps to reduce the weight-sharing and inter-influence of pooling configurations on the SuperNet parameters. We evaluate our proposed approach on CIFAR10, CIFAR100, as well as Food101 and show that in all cases, our model outperforms other approaches and improves over the default pooling configurations.


Designing Explainable Predictive Machine Learning Artifacts: Methodology and Practical Demonstration

arXiv.org Artificial Intelligence

Machine learning (ML) is a focal element of digitization that affects many areas of modern society: besides driving a plethora of physical and virtual products already woven into our daily lives, such as smartphones and social media platforms, ML techniques can be leveraged to power a wide range of business applications [1, 2]. Although ML as an umbrella term comprises various techniques, some of which are aimed at different purposes, most ML algorithms are designed to calculate empirical predictions based on given data [2]. This prediction-oriented approach to ML is widely referred to as supervised learning, predictive analytics, or predictive modeling, and initially requires at least two data sets: one for model training and one for testing [2, 3]. While the former allows a given ML algorithm to "learn" patterns that connect the model input and output, the latter serves to evaluate the predictive accuracy of a trained model. In practice, if a corresponding ML model is attributed to possess a sufficient degree of predictive power, it may be deployed in a productive environment to compute real-world predictions, e.g., to support managerial decision making. The application of supervised learning in business contexts is highly relevant as it may drive applications in the fields of predictive maintenance, financial fraud detection, personalized product recommendation, and more. Consequently, the global ML market size was valued at US$ 34.56 billion in 2021 and is expected to grow to US$ 74.99 billion by 2028 at a compound annual growth rate of 25.7% [4]. Given the enormous business potential of ML, a considerable number of companies have already begun to launch data analytics initiatives to automate their processes or support their decision making over the last years.


Advancing Biomedicine with Graph Representation Learning: Recent Progress, Challenges, and Future Directions

arXiv.org Artificial Intelligence

Objectives: Graph representation learning (GRL) has emerged as a pivotal field that has contributed significantly to breakthroughs in various fields, including biomedicine. The objective of this survey is to review the latest advancements in GRL methods and their applications in the biomedical field. We also highlight key challenges currently faced by GRL and outline potential directions for future research. Methods: We conducted a comprehensive search of multiple databases, including PubMed, Web of Science, IEEE Xplore, and Google Scholar, to collect relevant publications from the past two years (2021-2022). The studies selected for review were based on their relevance to the topic and the publication quality.


Survey of Malware Analysis through Control Flow Graph using Machine Learning

arXiv.org Artificial Intelligence

Malware is a significant threat to the security of computer systems and networks which requires sophisticated techniques to analyze the behavior and functionality for detection. Traditional signature-based malware detection methods have become ineffective in detecting new and unknown malware due to their rapid evolution. One of the most promising techniques that can overcome the limitations of signature-based detection is to use control flow graphs (CFGs). CFGs leverage the structural information of a program to represent the possible paths of execution as a graph, where nodes represent instructions and edges represent control flow dependencies. Machine learning (ML) algorithms are being used to extract these features from CFGs and classify them as malicious or benign. In this survey, we aim to review some state-of-the-art methods for malware detection through CFGs using ML, focusing on the different ways of extracting, representing, and classifying. Specifically, we present a comprehensive overview of different types of CFG features that have been used as well as different ML algorithms that have been applied to CFG-based malware detection. We provide an in-depth analysis of the challenges and limitations of these approaches, as well as suggest potential solutions to address some open problems and promising future directions for research in this field.


A Survey on Safety-Critical Driving Scenario Generation -- A Methodological Perspective

arXiv.org Artificial Intelligence

Autonomous driving systems have witnessed a significant development during the past years thanks to the advance in machine learning-enabled sensing and decision-making algorithms. One critical challenge for their massive deployment in the real world is their safety evaluation. Most existing driving systems are still trained and evaluated on naturalistic scenarios collected from daily life or heuristically-generated adversarial ones. However, the large population of cars, in general, leads to an extremely low collision rate, indicating that the safety-critical scenarios are rare in the collected real-world data. Thus, methods to artificially generate scenarios become crucial to measure the risk and reduce the cost. In this survey, we focus on the algorithms of safety-critical scenario generation in autonomous driving. We first provide a comprehensive taxonomy of existing algorithms by dividing them into three categories: data-driven generation, adversarial generation, and knowledge-based generation. Then, we discuss useful tools for scenario generation, including simulation platforms and packages. Finally, we extend our discussion to five main challenges of current works -- fidelity, efficiency, diversity, transferability, controllability -- and research opportunities lighted up by these challenges.


Recent Advances in Direct Speech-to-text Translation

arXiv.org Artificial Intelligence

Recently, speech-to-text translation has attracted more and more attention and many studies have emerged rapidly. In this paper, we present a comprehensive survey on direct speech translation aiming to summarize the current state-of-the-art techniques. First, we categorize the existing research work into three directions based on the main challenges -- modeling burden, data scarcity, and application issues. To tackle the problem of modeling burden, two main structures have been proposed, encoder-decoder framework (Transformer and the variants) and multitask frameworks. For the challenge of data scarcity, recent work resorts to many sophisticated techniques, such as data augmentation, pre-training, knowledge distillation, and multilingual modeling. We analyze and summarize the application issues, which include real-time, segmentation, named entity, gender bias, and code-switching. Finally, we discuss some promising directions for future work.


The Cultivated Practices of Text-to-Image Generation

arXiv.org Artificial Intelligence

Humankind is entering a novel creative era in which anybody can synthesize digital information using generative artificial intelligence (AI). Text-to-image generation, in particular, has become vastly popular and millions of practitioners produce AI-generated images and AI art online. This chapter first gives an overview of the key developments that enabled a healthy co-creative online ecosystem around text-to-image generation to rapidly emerge, followed by a high-level description of key elements in this ecosystem. A particular focus is placed on prompt engineering, a creative practice that has been embraced by the AI art community. It is then argued that the emerging co-creative ecosystem constitutes an intelligent system on its own - a system that both supports human creativity, but also potentially entraps future generations and limits future development efforts in AI. The chapter discusses the potential risks and dangers of cultivating this co-creative ecosystem, such as the bias inherent in today's training data, potential quality degradation in future image generation systems due to synthetic data becoming common place, and the potential long-term effects of text-to-image generation on people's imagination, ambitions, and development.


Unifying Large Language Models and Knowledge Graphs: A Roadmap

arXiv.org Artificial Intelligence

Large language models (LLMs), such as ChatGPT and GPT4, are making new waves in the field of natural language processing and artificial intelligence, due to their emergent ability and generalizability. However, LLMs are black-box models, which often fall short of capturing and accessing factual knowledge. In contrast, Knowledge Graphs (KGs), Wikipedia and Huapu for example, are structured knowledge models that explicitly store rich factual knowledge. KGs can enhance LLMs by providing external knowledge for inference and interpretability. Meanwhile, KGs are difficult to construct and evolving by nature, which challenges the existing methods in KGs to generate new facts and represent unseen knowledge. Therefore, it is complementary to unify LLMs and KGs together and simultaneously leverage their advantages. In this article, we present a forward-looking roadmap for the unification of LLMs and KGs. Our roadmap consists of three general frameworks, namely, 1) KG-enhanced LLMs, which incorporate KGs during the pre-training and inference phases of LLMs, or for the purpose of enhancing understanding of the knowledge learned by LLMs; 2) LLM-augmented KGs, that leverage LLMs for different KG tasks such as embedding, completion, construction, graph-to-text generation, and question answering; and 3) Synergized LLMs + KGs, in which LLMs and KGs play equal roles and work in a mutually beneficial way to enhance both LLMs and KGs for bidirectional reasoning driven by both data and knowledge. We review and summarize existing efforts within these three frameworks in our roadmap and pinpoint their future research directions.