legal entity
AugAbEx : Way Forward for Extractive Case Summarization
Bindal, Purnima, Kumar, Vikas, Rathore, Sagar, Bhatnagar, Vasudha
Summarization of legal judgments poses a heavy cognitive burden on law practitioners due to the complexity of the language, context-sensitive legal jargon, and the length of the document. Therefore, the automatic summarization of legal documents has attracted serious attention from natural language processing researchers. Since the abstractive summaries of legal documents generated by deep neural methods remain prone to the risk of misrepresenting nuanced legal jargon or overlooking key contextual details, we envisage a rising trend toward the use of extractive case summarizers. Given the high cost of human annotation for gold standard extractive summaries, we engineer a light and transparent pipeline that leverages existing abstractive gold standard summaries to create the corresponding extractive gold standard versions. The approach ensures that the experts' opinions ensconced in the original gold standard abstractive summaries are carried over to the transformed extractive summaries. We aim to augment seven existing case summarization datasets, which include abstractive summaries, by incorporating corresponding extractive summaries, and to create an enriched data resource for the case summarization research community. To ensure the quality of the augmented extractive summaries, we perform an extensive comparative evaluation with the original abstractive gold standard summaries covering structural, lexical, and semantic dimensions. We also compare the domain-level information of the two summaries. We commit to releasing the augmented datasets in the public domain for use by the research community and believe that the resource will offer opportunities to advance the field of automatic summarization of legal documents.
Summarisation of German Judgments in conjunction with a Class-based Evaluation
Steffes, Bianca, Wiedemann, Nils Torben, Gratz, Alexander, Hochreither, Pamela, Meyer, Jana Elina, Schilke, Katharina Luise
The automated summarisation of long legal documents can be a great aid for legal experts in their daily work. We automatically create summaries (guiding principles) of German judgments by fine-tuning a decoder-based large language model. We enrich the judgments with information about legal entities before the training. For the evaluation of the created summaries, we define a set of evaluation classes which allows us to measure their language, pertinence, completeness and correctness. Our results show that employing legal entities helps the generative model to find the relevant content, but the quality of the created summaries is not yet sufficient for use in practice.
LegalViz: Legal Text Visualization by Text To Diagram Generation
Onami, Eri, Miyanishi, Taiki, Maeda, Koki, Kurita, Shuhei
Legal documents including judgments and court orders require highly sophisticated legal knowledge for understanding. To disclose expert knowledge for non-experts, we explore the problem of visualizing legal texts with easy-to-understand diagrams and propose a novel dataset of LegalViz with 23 languages and 7,010 cases of legal document and visualization pairs, using the DOT graph description language of Graphviz. LegalViz provides a simple diagram from a complicated legal corpus identifying legal entities, transactions, legal sources, and statements at a glance, that are essential in each judgment. In addition, we provide new evaluation metrics for the legal diagram visualization by considering graph structures, textual similarities, and legal contents. We conducted empirical studies on few-shot and finetuning large language models for generating legal diagrams and evaluated them with these metrics, including legal content-based evaluation within 23 languages. Models trained with LegalViz outperform existing models including GPTs, confirming the effectiveness of our dataset.
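To make the target format concrete, the following is a minimal sketch of how entities and labelled relations could be written out in the DOT graph description language. The function name `to_dot`, the input schema, and the example case are hypothetical and are not taken from the LegalViz dataset or its models.

```python
def to_dot(entities, relations, name="case"):
    """Render entities and labelled relations as a Graphviz DOT digraph string.

    The input schema (a list of names plus (source, target, label) triples)
    is an assumption made for this illustration.
    """
    def quote(text):
        # Inside a DOT double-quoted string, a double quote is written as \".
        return '"' + text.replace('"', '\\"') + '"'

    lines = [f"digraph {name} {{"]
    for ent in entities:
        lines.append(f"  {quote(ent)};")
    for src, dst, label in relations:
        lines.append(f"  {quote(src)} -> {quote(dst)} [label={quote(label)}];")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical example case, invented for this sketch.
dot = to_dot(
    entities=["Seller", "Buyer"],
    relations=[("Seller", "Buyer", "sells goods")],
)
print(dot)
```

The resulting string can be saved to a `.gv` file and rendered with the Graphviz `dot` command-line tool.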
Understand Legal Documents with Contextualized Large Language Models
The growth of pending legal cases in populous countries, such as India, has become a major issue. Developing effective techniques to process and understand legal documents is extremely useful in resolving this problem. In this paper, we present our systems for SemEval-2023 Task 6: understanding legal texts (Modi et al., 2023). Specifically, we first develop the Legal-BERT-HSLN model, which considers comprehensive context information at both the intra- and inter-sentence levels to predict rhetorical roles (subtask A), and then train a Legal-LUKE model, which is legal-contextualized and entity-aware, to recognize legal entities (subtask B). Our evaluations demonstrate that our models are more accurate than the baselines, e.g., with an F1 score up to 15.0% better in subtask B. We achieved notable performance on the task leaderboard, e.g., a micro F1 score of 0.834, and ranked No. 5 out of 27 teams in subtask A.
Better Transcription of UK Supreme Court Hearings
Saadany, Hadeel, Breslin, Catherine, Orăsan, Constantin, Walker, Sophie
Transcription of legal proceedings is very important to enable access to justice. However, speech transcription is an expensive and slow process. In this paper we describe part of a combined research and industrial project for building an automated transcription tool designed specifically for the Justice sector in the UK. We explain the challenges involved in transcribing court room hearings and the Natural Language Processing (NLP) techniques we employ to tackle these challenges. We will show that fine-tuning a generic off-the-shelf pre-trained Automatic Speech Recognition (ASR) system with an in-domain language model, as well as infusing common phrases extracted with a collocation detection model, can not only improve the Word Error Rate (WER) of the transcribed hearings but also avoid critical errors that are specific to the legal jargon and terminology commonly used in British courts.
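For reference, WER is conventionally computed as the word-level edit distance between the reference transcript and the system output, divided by the number of reference words. The sketch below implements that standard definition; the function name and the example sentences are illustrative and do not come from the project described above.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
wer = word_error_rate("the appeal is dismissed", "the appeal was dismissed")
print(wer)
```

Because every error counts equally, WER alone does not distinguish a harmless slip from a critical error in legal terminology, which is why the abstract treats the two separately.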
CoLES: Contrastive Learning for Event Sequences with Self-Supervision
Babaev, Dmitrii, Kireev, Ivan, Ovsov, Nikita, Ivanova, Mariya, Gusev, Gleb, Nazarov, Ivan, Tuzhilin, Alexander
We address the problem of self-supervised learning on discrete event sequences generated by real-world users. Self-supervised learning incorporates complex information from the raw data in low-dimensional fixed-length vector representations that can be easily applied in various downstream machine learning tasks. In this paper, we propose a new method, "CoLES", which adapts contrastive learning, previously used for audio and computer vision domains, to the discrete event sequences domain in a self-supervised setting. We deployed CoLES embeddings based on sequences of transactions at a large European financial services company. Usage of CoLES embeddings significantly improves the performance of the pre-existing models on downstream tasks and produces significant financial gains, measured in hundreds of millions of dollars yearly. We also evaluated CoLES on several public event sequence datasets and showed that CoLES representations consistently outperform other methods on different downstream tasks.
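The abstract does not spell out how training pairs or the loss are constructed. As a rough sketch of contrastive learning on event sequences, the code below assumes that two random contiguous slices of the same user's sequence form a positive pair and that a standard margin-based pairwise contrastive loss is applied to their embeddings. The helper names, slice lengths, and margin are hypothetical choices for illustration, not the published CoLES configuration.

```python
import math
import random


def random_slice(events, min_len, max_len, rng):
    """Return a random contiguous sub-sequence of `events`.

    Assumes min_len <= len(events).
    """
    length = rng.randint(min_len, min(max_len, len(events)))
    start = rng.randint(0, len(events) - length)
    return events[start:start + length]


def contrastive_loss(emb_a, emb_b, same_source, margin=1.0):
    """Pairwise contrastive loss on two embedding vectors.

    Pairs from the same source are pulled together; other pairs are pushed
    apart until their Euclidean distance reaches `margin`.
    """
    dist = math.dist(emb_a, emb_b)
    if same_source:
        return dist ** 2
    return max(0.0, margin - dist) ** 2


rng = random.Random(0)
user_events = list(range(20))  # stand-in for one user's event sequence
view_a = random_slice(user_events, 5, 10, rng)
view_b = random_slice(user_events, 5, 10, rng)
```

In a full system an encoder network would map each slice to an embedding before the loss is applied; that encoder is omitted here.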
The Recommendations Regarding Data Protection in the Field of Artificial Intelligence
The Recommendations on Data Protection in the Field of Artificial Intelligence (the "Recommendations") were published by the Turkish Personal Data Protection Authority (the "DPA") on its website on 15 September 2021. The Recommendations address developers, manufacturers, service providers and decision makers in accordance with the Law on the Protection of Personal Data numbered 6698 and its secondary legislation (the "Law"). This is the first time that the DPA has published a document on data protection in AI-based applications. The Recommendations consist of three parts, namely: (i) general recommendations; (ii) recommendations for developers, manufacturers and service providers; and (iii) recommendations for decision makers. Under the Recommendations, the term Artificial Intelligence ("AI") is defined as the human-specific abilities to be analysed and passed to machines.
Rights for robots: why we need better AI regulation
We live in a world where humans aren't the only ones that have rights. In the eyes of the law, artificial entities have a legal persona too. Corporations, partnerships and nation states have the same rights and responsibilities as human beings. With rapidly evolving technologies, is it time our legal system considered a similar status for artificial intelligence (AI) and robots? "AI is already impacting most aspects of our lives. Given its pervasiveness, how this technology is developed is raising profound legal and ethical questions that need to be addressed," says Julian David, chief executive of industry body techUK.
Clustering-based Automatic Construction of Legal Entity Knowledge Base from Contracts
Song, Fuqi, de la Clergerie, Éric
In contract analysis and contract automation, a knowledge base (KB) of legal entities is fundamental for performing tasks such as contract verification, contract generation and contract analytics. However, such a KB does not always exist, nor can it be produced in a short time. In this paper, we propose a clustering-based approach to automatically generate a reliable knowledge base of legal entities from given contracts without any supplemental references. The proposed method is robust to different types of errors introduced by pre-processing, such as Optical Character Recognition (OCR) and Named Entity Recognition (NER), as well as editing errors such as typos. We evaluate our method on a dataset that consists of 800 real contracts of varying quality from 15 clients. Compared to the collected ground-truth data, our method is able to recall 84% of the knowledge.
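The abstract does not describe the clustering algorithm itself. To illustrate the general idea of grouping noisy entity mentions, the sketch below uses a simple greedy scheme based on string similarity from Python's standard `difflib` module. The threshold, function names, and example mentions are assumptions made for illustration and are not the authors' method.

```python
from collections import Counter
from difflib import SequenceMatcher


def cluster_names(names, threshold=0.8):
    """Greedily group strings by similarity to each cluster's first member."""
    clusters = []
    for name in names:
        for cluster in clusters:
            ratio = SequenceMatcher(None, name.lower(), cluster[0].lower()).ratio()
            if ratio >= threshold:
                cluster.append(name)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([name])
    return clusters


def canonical(cluster):
    """Pick the most frequent spelling as the cluster's representative."""
    return Counter(cluster).most_common(1)[0][0]


# Invented mentions; "Acrne" imitates an OCR confusion of "m" with "rn".
mentions = ["Acme Corp", "Acme Corp", "Acrne Corp", "Globex Ltd", "Globex Ltd."]
clusters = cluster_names(mentions)
print([canonical(c) for c in clusters])
```

Choosing the most frequent spelling as the representative rests on the assumption that errors are rarer than correct mentions; a greedy pass like this is also order-dependent, which a proper clustering method would avoid.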
COPYRIGHT AND ARTIFICIAL INTELLIGENCE - Law Insider
Such work may be protected once the creation becomes an expression of the author and not merely an idea. It refers to the right to enjoy the subject matter and use the same for economic purposes. On the other hand, artworks based on Artificial Intelligence rely heavily on the programmer who gives the input for creation of the work. However, with technological advancement, AI has developed the capability of understanding and creating outputs without any human interference.[7] The main issue raised concerns the protection of work created by AI.