Goto

Collaborating Authors

 Overview


A survey on cutting-edge relation extraction techniques based on language models

arXiv.org Artificial Intelligence

This comprehensive survey delves into the latest advancements in Relation Extraction (RE), a pivotal task in natural language processing essential for applications across biomedical, financial, and legal sectors. This study highlights the evolution and current state of RE techniques by analyzing 137 papers presented at the Association for Computational Linguistics (ACL) conferences over the past four years, focusing on models that leverage language models. Our findings underscore the dominance of BERT-based methods in achieving state-of-the-art results for RE while also noting the promising capabilities of emerging large language models (LLMs) like T5, especially in few-shot relation extraction scenarios where they excel in identifying previously unseen relations.


Online Knowledge Integration for 3D Semantic Mapping: A Survey

arXiv.org Artificial Intelligence

Semantic mapping is a key component of robots operating in and interacting with objects in structured environments. Traditionally, geometric and knowledge representations within a semantic map have only been loosely integrated. However, recent advances in deep learning now allow full integration of prior knowledge, represented as knowledge graphs or language concepts, into sensor data processing and semantic mapping pipelines. Semantic scene graphs and language models enable modern semantic mapping approaches to incorporate graph-based prior knowledge or to leverage the rich information in human language both during and after the mapping process. This has sparked substantial advances in semantic mapping, leading to previously impossible novel applications. This survey reviews these recent developments comprehensively, with a focus on online integration of knowledge into semantic mapping. We specifically focus on methods using semantic scene graphs for integrating symbolic prior knowledge and language models for respective capture of implicit common-sense knowledge and natural language concepts


PersonaCraft: Personalized Full-Body Image Synthesis for Multiple Identities from Single References Using 3D-Model-Conditioned Diffusion

arXiv.org Artificial Intelligence

Personalized image generation has been significantly advanced, enabling the creation of highly realistic and customized images. However, existing methods often struggle with generating images of multiple people due to occlusions and fail to accurately personalize full-body shapes. In this paper, we propose PersonaCraft, a novel approach that combines diffusion models with 3D human modeling to address these limitations. Our method effectively manages occlusions by incorporating 3D-aware pose conditioning with SMPLx-ControlNet and accurately personalizes human full-body shapes through SMPLx fitting. Additionally, PersonaCraft enables user-defined body shape adjustments, adding flexibility for individual body customization. Experimental results demonstrate the superior performance of PersonaCraft in generating high-quality, realistic images of multiple individuals while resolving occlusion issues, thus establishing a new standard for multi-person personalized image synthesis. Project page: https://gwang-kim.github.io/persona_craft


A Novel Word Pair-based Gaussian Sentence Similarity Algorithm For Bengali Extractive Text Summarization

arXiv.org Artificial Intelligence

Extractive Text Summarization is the process of selecting the most representative parts of a larger text without losing any key information. Recent attempts at extractive text summarization in Bengali, either relied on statistical techniques like TF-IDF or used naive sentence similarity measures like the word averaging technique. All of these strategies suffer from expressing semantic relationships correctly. Here, we propose a novel Word pair-based Gaussian Sentence Similarity (WGSS) algorithm for calculating the semantic relation between two sentences. WGSS takes the geometric means of individual Gaussian similarity values of word embedding vectors to get the semantic relationship between sentences. It compares two sentences on a word-to-word basis which rectifies the sentence representation problem faced by the word averaging method. The summarization process extracts key sentences by grouping semantically similar sentences into clusters using the Spectral Clustering algorithm. After clustering, we use TF-IDF ranking to pick the best sentence from each cluster. The proposed method is validated using four different datasets, and it outperformed other recent models by 43.2% on average ROUGE scores (ranging from 2.5% to 95.4%). It is also experimented on other low-resource languages i.e. Turkish, Marathi, and Hindi language, where we find that the proposed method performs as similar as Bengali for these languages. In addition, a new high-quality Bengali dataset is curated which contains 250 articles and a pair of summaries for each of them. We believe this research is a crucial addition to Bengali Natural Language Processing (NLP) research and it can easily be extended into other low-resource languages. We made the implementation of the proposed model and data public on https://github.com/FMOpee/WGSS.


AI in cancer research & care: perspectives of three KU Leuven institutes

AIHub

In 2021, cancer was the second leading cause of death in the European Union. Notably, while Europe constitutes only a tenth of the global population, it accounts for almost a quarter of the world's cancer cases, bearing an economic impact of approximately 100 billion annually. Belgian statistics further highlight that every step forward in cancer treatment and care could significantly alleviate the immense personal and societal burden. Globally, extensive efforts are made through a variety of innovative approaches with the ultimate objectives of better prevention, earlier detection, and improved patient outcomes and care. Personalized medicine is considered the holy grail of cancer care in most of these initiatives. Tailoring interventions to the unique characteristics of individual patients promises to revolutionize cancer care. Artificial intelligence (AI) plays a pivotal role in this transformative journey towards precision medicine, aiding researchers and healthcare professionals in accurately predicting cancer risks, enabling earlier diagnoses, and customizing treatment plans to meet individual needs.


Time-Series Forecasting in Smart Manufacturing Systems: An Experimental Evaluation of the State-of-the-art Algorithms

arXiv.org Artificial Intelligence

TSF is growing in various domains including manufacturing. Although numerous TSF algorithms have been developed recently, the validation and evaluation of algorithms hold substantial value for researchers and practitioners and are missing. This study aims to fill this gap by evaluating the SoTA TSF algorithms on thirteen manufacturing datasets, focusing on their applicability in manufacturing. Each algorithm was selected based on its TSF category to ensure a representative set of algorithms. The evaluation includes different scenarios to evaluate the models using two problem categories and two forecasting horizons. To evaluate the performance, the WAPE was calculated, and additional post hoc analyses were conducted to assess the significance of observed differences. Only algorithms with codes from open-source libraries were utilized, and no hyperparameter tuning was done. This allowed us to evaluate the algorithms as "out-of-the-box" solutions that can be easily implemented, ensuring their usability within the manufacturing by practitioners with limited technical knowledge. This aligns to facilitate the adoption of these techniques in smart manufacturing systems. Based on the results, transformer and MLP-based architectures demonstrated the best performance with MLP-based architecture winning the most scenarios. For univariate TSF, PatchTST emerged as the most robust, particularly for long-term horizons, while for multivariate problems, MLP-based architectures like N-HITS and TiDE showed superior results. The study revealed that simpler algorithms like XGBoost could outperform complex algorithms in certain tasks. These findings challenge the assumption that more sophisticated models produce better results. Additionally, the research highlighted the importance of computational resource considerations, showing variations in runtime and memory usage across different algorithms.


New Faithfulness-Centric Interpretability Paradigms for Natural Language Processing

arXiv.org Artificial Intelligence

As machine learning becomes more widespread and is used in more critical applications, it's important to provide explanations for these models, to prevent unintended behavior. Unfortunately, many current interpretability methods struggle with faithfulness. Therefore, this Ph.D. thesis investigates the question "How to provide and ensure faithful explanations for complex general-purpose neural NLP models?" The main thesis is that we should develop new paradigms in interpretability. This is achieved by first developing solid faithfulness metrics and then applying the lessons learned from this investigation to develop new paradigms. The two new paradigms explored are faithfulness measurable models (FMMs) and self-explanations. The idea in self-explanations is to have large language models explain themselves, we identify that current models are not capable of doing this consistently. However, we suggest how this could be achieved. The idea of FMMs is to create models that are designed such that measuring faithfulness is cheap and precise. This makes it possible to optimize an explanation towards maximum faithfulness, which makes FMMs designed to be explained. We find that FMMs yield explanations that are near theoretical optimal in terms of faithfulness. Overall, from all investigations of faithfulness, results show that post-hoc and intrinsic explanations are by default model and task-dependent. However, this was not the case when using FMMs, even with the same post-hoc explanation methods. This shows, that even simple modifications to the model, such as randomly masking the training dataset, as was done in FMMs, can drastically change the situation and result in consistently faithful explanations. This answers the question of how to provide and ensure faithful explanations.


Basic Research, Lethal Effects: Military AI Research Funding as Enlistment

arXiv.org Artificial Intelligence

In the context of unprecedented U.S. Department of Defense (DoD) budgets, this paper examines the recent history of DoD funding for academic research in algorithmically based warfighting. We draw from a corpus of DoD grant solicitations from 2007 to 2023, focusing on those addressed to researchers in the field of artificial intelligence (AI). Considering the implications of DoD funding for academic research, the paper proceeds through three analytic sections. In the first, we offer a critical examination of the distinction between basic and applied research, showing how funding calls framed as basic research nonetheless enlist researchers in a war fighting agenda. In the second, we offer a diachronic analysis of the corpus, showing how a 'one small problem' caveat, in which affirmation of progress in military technologies is qualified by acknowledgement of outstanding problems, becomes justification for additional investments in research. We close with an analysis of DoD aspirations based on a subset of Defense Advanced Research Projects Agency (DARPA) grant solicitations for the use of AI in battlefield applications. Taken together, we argue that grant solicitations work as a vehicle for the mutual enlistment of DoD funding agencies and the academic AI research community in setting research agendas. The trope of basic research in this context offers shelter from significant moral questions that military applications of one's research would raise, by obscuring the connections that implicate researchers in U.S. militarism.


Can bidirectional encoder become the ultimate winner for downstream applications of foundation models?

arXiv.org Artificial Intelligence

Over the past few decades, Artificial Intelligence(AI) has progressed from the initial machine learning stage to the deep learning stage, and now to the stage of foundational models. Foundational models have the characteristics of pre-training, transfer learning, and self-supervised learning, and pre-trained models can be fine-tuned and applied to various downstream tasks. Under the framework of foundational models, models such as Bidirectional Encoder Representations from Transformers(BERT) and Generative Pre-trained Transformer(GPT) have greatly advanced the development of natural language processing(NLP), especially the emergence of many models based on BERT. BERT broke through the limitation of only using one-way methods for language modeling in pre-training by using a masked language model. It can capture bidirectional context information to predict the masked words in the sequence, this can improve the feature extraction ability of the model. This makes the model very useful for downstream tasks, especially for specialized applications. The model using the bidirectional encoder can better understand the domain knowledge and be better applied to these downstream tasks. So we hope to help understand how this technology has evolved and improved model performance in various natural language processing tasks under the background of foundational models and reveal its importance in capturing context information and improving the model's performance on downstream tasks. This article analyzes one-way and bidirectional models based on GPT and BERT and compares their differences based on the purpose of the model. It also briefly analyzes BERT and the improvements of some models based on BERT. The model's performance on the Stanford Question Answering Dataset(SQuAD) and General Language Understanding Evaluation(GLUE) was compared.


VLM-HOI: Vision Language Models for Interpretable Human-Object Interaction Analysis

arXiv.org Artificial Intelligence

The Large Vision Language Model (VLM) has recently addressed remarkable progress in bridging two fundamental modalities. VLM, trained by a sufficiently large dataset, exhibits a comprehensive understanding of both visual and linguistic to perform diverse tasks. To distill this knowledge accurately, in this paper, we introduce a novel approach that explicitly utilizes VLM as an objective function form for the Human-Object Interaction (HOI) detection task (\textbf{VLM-HOI}). Specifically, we propose a method that quantifies the similarity of the predicted HOI triplet using the Image-Text matching technique. We represent HOI triplets linguistically to fully utilize the language comprehension of VLMs, which are more suitable than CLIP models due to their localization and object-centric nature. This matching score is used as an objective for contrastive optimization. To our knowledge, this is the first utilization of VLM language abilities for HOI detection. Experiments demonstrate the effectiveness of our method, achieving state-of-the-art HOI detection accuracy on benchmarks. We believe integrating VLMs into HOI detection represents important progress towards more advanced and interpretable analysis of human-object interactions.