Nguyen, Le-Minh
Improving the Robustness of Representation Misdirection for Large Language Model Unlearning
Huu-Tien, Dang, Thanh-Tung, Hoang, Nguyen, Le-Minh, Inoue, Naoya
Representation Misdirection (RM) and variants are established large language model (LLM) unlearning methods with state-of-the-art performance. In this paper, we show that RM methods inherently reduce models' robustness, causing them to misbehave even when a single non-adversarial forget-token is in the retain-query. Toward understanding the underlying causes, we reframe the unlearning process as backdoor attacks and defenses: forget-tokens act as backdoor triggers that, when activated in retain-queries, cause disruptions in RM models' behaviors, similar to successful backdoor attacks. To mitigate this vulnerability, we propose Random Noise Augmentation (RNA), a model- and method-agnostic approach with theoretical guarantees for improving the robustness of RM methods. Extensive experiments demonstrate that RNA significantly improves the robustness of RM models while enhancing unlearning performance.
ZeFaV: Boosting Large Language Models for Zero-shot Fact Verification
Luu, Son T., Nguyen, Hiep, Vo, Trung, Nguyen, Le-Minh
In this paper, we propose ZeFaV, a zero-shot fact verification framework that enhances the performance of large language models on the fact verification task. ZeFaV leverages the in-context learning ability of large language models to extract the relations among the entities within a claim, reorganize the information from the evidence in a relationally logical form, and combine this information with the original evidence to generate the context from which our fact-checking model provides verdicts for the input claims. We conducted empirical experiments to evaluate our approach on two multi-hop fact-checking datasets, HoVer and FEVEROUS, and achieved promising results comparable to other state-of-the-art fact verification methods.
VLSP 2023 -- LTER: A Summary of the Challenge on Legal Textual Entailment Recognition
Tran, Vu, Nguyen, Ha-Thanh, Vo, Trung, Luu, Son T., Dang, Hoang-Anh, Le, Ngoc-Cam, Le, Thi-Thuy, Nguyen, Minh-Tien, Nguyen, Truong-Son, Nguyen, Le-Minh
In this new era of rapid AI development, especially in language processing, the demand for AI in the legal domain is increasingly critical. While research in other languages such as English, Japanese, and Chinese is well established, we introduce the first fundamental research for the Vietnamese language in the legal domain: legal textual entailment recognition through the Vietnamese Language and Speech Processing workshop. In analyzing participants' results, we discuss certain linguistic aspects critical in the legal domain that pose challenges that need to be addressed.
Employing Label Models on ChatGPT Answers Improves Legal Text Entailment Performance
Nguyen, Chau, Nguyen, Le-Minh
The objective of legal text entailment is to ascertain whether the assertions in a legal query logically follow from the information provided in one or multiple legal articles. ChatGPT, a large language model, is robust in many natural language processing tasks, including legal text entailment: when we set the temperature = 0 (the ChatGPT answers are deterministic) and prompt the model, it achieves 70.64% accuracy on the COLIEE 2022 dataset, which outperforms the previous SOTA of 67.89%. On the other hand, if the temperature is larger than zero, ChatGPT answers are not deterministic, leading to inconsistent answers and fluctuating results. We propose to leverage label models (a fundamental component of weak supervision techniques) to integrate the provisional answers by ChatGPT into consolidated labels. In this way, we treat ChatGPT provisional answers as noisy predictions which can be consolidated by label models. The experimental results demonstrate that this approach can attain an accuracy of 76.15%, marking a significant improvement of 8.26 percentage points over the prior state-of-the-art benchmark. Additionally, we perform an analysis of the instances where ChatGPT produces incorrect answers and classify the errors, offering insights that could guide potential enhancements for future research endeavors.
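As an illustration of consolidating noisy answers, the sketch below uses plain majority voting, which is only the simplest possible stand-in for a label model and is not claimed to be the label model used in the paper. The function names, the "Y"/"N" labels, and the sample data are all hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Consolidate noisy labels for one query; return None (abstain) on a tie."""
    counts = Counter(answers).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no unique winner, so abstain
    return counts[0][0]


def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical data: three sampled answers per query (temperature > 0).
sampled = [["Y", "Y", "N"], ["N", "N", "N"], ["Y", "N", "N"]]
gold = ["Y", "N", "Y"]

consolidated = [majority_vote(a) for a in sampled]
print(consolidated, accuracy(consolidated, gold))
```

A learned label model would typically weight each noisy source by its estimated reliability instead of counting every vote equally.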
CAPTAIN at COLIEE 2023: Efficient Methods for Legal Information Retrieval and Entailment Tasks
Nguyen, Chau, Nguyen, Phuong, Tran, Thanh, Nguyen, Dat, Trieu, An, Pham, Tin, Dang, Anh, Nguyen, Le-Minh
The Competition on Legal Information Extraction/Entailment (COLIEE) is held annually to encourage advancements in the automatic processing of legal texts. Processing legal documents is challenging due to the intricate structure and meaning of legal language. In this paper, we outline our strategies for tackling Task 2, Task 3, and Task 4 in the COLIEE 2023 competition. Our approach involved utilizing appropriate state-of-the-art deep learning methods, designing methods based on domain characteristics observation, and applying meticulous engineering practices and methodologies to the competition. As a result, our performance in these tasks has been outstanding, with first places in Task 2 and Task 3, and promising results in Task 4. Our source code is available at https://github.com/Nguyen2015/CAPTAIN-COLIEE2023/tree/coliee2023.
DA-TransUNet: Integrating Spatial and Channel Dual Attention with Transformer U-Net for Medical Image Segmentation
Sun, Guanqun, Pan, Yizhi, Kong, Weikun, Xu, Zichang, Ma, Jianhua, Racharak, Teeradaj, Nguyen, Le-Minh, Xin, Junyi
Accurate medical image segmentation is critical for disease quantification and treatment evaluation. While traditional U-Net architectures and their transformer-integrated variants excel in automated segmentation tasks, they lack the ability to harness the intrinsic position and channel features of images. Existing models also struggle with parameter efficiency and computational complexity, often due to the extensive use of Transformers. To address these issues, this study proposes a novel deep medical image segmentation framework, called DA-TransUNet, aiming to integrate the Transformer and dual attention block (DA-Block) into the traditional U-shaped architecture. Unlike earlier transformer-based U-Net models, DA-TransUNet utilizes Transformers and DA-Block to integrate not only global and local features, but also image-specific positional and channel features, improving the performance of medical image segmentation. By incorporating a DA-Block at the embedding layer and within each skip connection layer, we substantially enhance feature extraction capabilities and improve the efficiency of the encoder-decoder structure. DA-TransUNet demonstrates superior performance in medical image segmentation tasks, consistently outperforming state-of-the-art techniques across multiple datasets. In summary, DA-TransUNet offers a significant advancement in medical image segmentation, providing an effective and powerful alternative to existing techniques. Our architecture stands out for its ability to improve segmentation accuracy, thereby advancing the field of automated medical image diagnostics. The code and parameters of our model will be publicly available at https://github.com/SUN-1024/DA-TransUnet.
Improving Vietnamese Legal Question--Answering System based on Automatic Data Enrichment
Vuong, Thi-Hai-Yen, Nguyen, Ha-Thanh, Nguyen, Quang-Huy, Nguyen, Le-Minh, Phan, Xuan-Hieu
Question answering (QA) in law is a challenging problem because legal documents are much more complicated than normal texts in terms of terminology, structure, and temporal and logical relationships. It is even more difficult to perform legal QA for low-resource languages like Vietnamese where labeled data are rare and pre-trained language models are still limited. In this paper, we try to overcome these limitations by implementing a Vietnamese article-level retrieval-based legal QA system and introducing a novel method to improve the performance of language models by improving data quality through weak labeling. Our hypothesis is that in contexts where labeled data are limited, efficient data enrichment can help increase overall performance. Our experiments are designed to test multiple aspects, which demonstrate the effectiveness of the proposed technique.
Law to Binary Tree -- An Formal Interpretation of Legal Natural Language
Nguyen, Ha-Thanh, Tran, Vu, Le, Ngoc-Cam, Le, Thi-Thuy, Nguyen, Quang-Huy, Nguyen, Le-Minh, Satoh, Ken
Knowledge representation and reasoning in law are essential to facilitate the automation of legal analysis and decision-making tasks. In this paper, we propose a new approach based on legal science, specifically legal taxonomy, for representing and reasoning with legal documents. Our approach interprets the regulations in legal documents as binary trees, which facilitates legal reasoning systems to make decisions and resolve logical contradictions. The advantages of this approach are twofold. First, legal reasoning can be performed on the basis of the binary tree representation of the regulations. Second, the binary tree representation of the regulations is more understandable than the existing sentence-based representations. We provide an example of how our approach can be used to interpret the regulations in a legal document.
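To make the idea of reasoning over a binary tree concrete, here is a minimal sketch. It assumes internal nodes are logical connectives ("AND"/"OR") and leaves are named conditions; this encoding, the `Node` class, and the example rule are illustrative assumptions, not the representation defined in the paper.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    label: str                      # "AND", "OR", or a condition name (leaf)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def evaluate(node: Node, facts: Dict[str, bool]) -> bool:
    """Evaluate a rule tree against known facts; unknown conditions count as False."""
    if node.left is None and node.right is None:
        return facts.get(node.label, False)
    left_value = evaluate(node.left, facts)
    right_value = evaluate(node.right, facts)
    if node.label == "AND":
        return left_value and right_value
    if node.label == "OR":
        return left_value or right_value
    raise ValueError(f"unknown connective: {node.label}")


# Hypothetical rule: is_adult AND (has_license OR has_permit)
rule = Node("AND", Node("is_adult"),
            Node("OR", Node("has_license"), Node("has_permit")))

print(evaluate(rule, {"is_adult": True, "has_permit": True}))
```

Because every internal node has exactly two children, a decision can be traced along a single path through the tree, which is one way such a representation could stay readable.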
Attentive Deep Neural Networks for Legal Document Retrieval
Nguyen, Ha-Thanh, Phi, Manh-Kien, Ngo, Xuan-Bach, Tran, Vu, Nguyen, Le-Minh, Tu, Minh-Phuong
Legal text retrieval serves as a key component in a wide range of legal text processing tasks such as legal question answering, legal case entailment, and statute law retrieval. The performance of legal text retrieval depends, to a large extent, on the representation of text, both query and legal documents. Based on good representations, a legal text retrieval model can effectively match the query to its relevant documents. Because legal documents often contain long articles and only some parts are relevant to queries, it is quite a challenge for existing models to represent such documents. In this paper, we study the use of attentive neural network-based text representation for statute law document retrieval. We propose a general approach using deep neural networks with attention mechanisms. Based on it, we develop two hierarchical architectures with sparse attention to represent long sentences and articles, and we name them Attentive CNN and Paraformer. The methods are evaluated on datasets of different sizes and characteristics in English, Japanese, and Vietnamese. Experimental results show that: i) Attentive neural methods substantially outperform non-neural methods in terms of retrieval performance across datasets and languages; ii) Pretrained transformer-based models achieve better accuracy on small datasets at the cost of high computational complexity while the lighter-weight Attentive CNN achieves better accuracy on large datasets; and iii) Our proposed Paraformer outperforms state-of-the-art methods on the COLIEE dataset, achieving the highest recall and F2 scores in the top-N retrieval task.
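For readers unfamiliar with the F2 score mentioned above, the sketch below computes the standard F-beta measure with beta = 2, which weights recall more heavily than precision. The function name and article identifiers are hypothetical, and how the paper averages scores across queries is not specified here.

```python
def f_beta(retrieved, relevant, beta=2.0):
    """F-beta for one query: (1 + b^2) * P * R / (b^2 * P + R)."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    if true_positives == 0:
        return 0.0  # also covers empty retrieved or relevant sets
    precision = true_positives / len(retrieved)
    recall = true_positives / len(relevant)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical query: three articles retrieved, two actually relevant.
score = f_beta(["a1", "a2", "a3"], ["a1", "a4"])
print(round(score, 4))  # P = 1/3, R = 1/2, so F2 = 5/11
```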
Transformer-based Approaches for Legal Text Processing
Nguyen, Ha-Thanh, Nguyen, Minh-Phuong, Vuong, Thi-Hai-Yen, Bui, Minh-Quan, Nguyen, Minh-Chau, Dang, Tran-Binh, Tran, Vu, Nguyen, Le-Minh, Satoh, Ken
In this paper, we introduce our approaches using Transformer-based models for different problems of the COLIEE 2021 automatic legal text processing competition. Automated processing of legal documents is a challenging task because of the characteristics of legal documents as well as the limited amount of data. With our detailed experiments, we found that Transformer-based pretrained language models can perform well on automated legal text processing problems with appropriate approaches. We describe in detail the processing steps for each task, such as problem formulation, data processing and augmentation, pretraining, and finetuning. In addition, we introduce to the community two pretrained models that take advantage of parallel translations in the legal domain, NFSP and NMSP. Of these, NFSP achieves the state-of-the-art result in Task 5 of the competition. Although the paper focuses on technical reporting, the novelty of its approaches can also be a useful reference in automated legal document processing using Transformer-based models.