Case-Based Reasoning
U-CREAT: Unsupervised Case Retrieval using Events extrAcTion
Joshi, Abhinav, Sharma, Akshat, Tanikella, Sai Kiran, Modi, Ashutosh
The task of Prior Case Retrieval (PCR) in the legal domain is about automatically citing relevant (based on facts and precedent) prior legal cases in a given query case. To further promote research in PCR, in this paper, we propose a new large benchmark (in English) for the PCR task: the IL-PCR (Indian Legal Prior Case Retrieval) corpus. Given the complex nature of case relevance and the length of legal documents, BM25 remains a strong baseline for ranking the cited prior documents. In this work, we explore the role of events in legal case retrieval and propose an unsupervised retrieval pipeline, U-CREAT (Unsupervised Case Retrieval using Events Extraction). We find that the proposed unsupervised retrieval method significantly increases performance compared to BM25 and makes retrieval faster by a considerable margin, making it applicable to real-time case retrieval systems. Our proposed system is generic: we show that it generalizes across two different legal systems (Indian and Canadian), and it shows state-of-the-art performance on the benchmarks for both legal systems (the IL-PCR and COLIEE corpora).
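For readers unfamiliar with the BM25 baseline mentioned above, the following is a minimal sketch of standard Okapi BM25 scoring over tokenized documents. It is not the U-CREAT pipeline: the function name `bm25_scores`, the parameter defaults `k1=1.5` and `b=0.75`, and the toy sentences are illustrative assumptions. The tokens here are plain words; whether and how event representations would replace them is not specified in the abstract.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Return one BM25 score per candidate document (higher = more relevant).

    k1 and b are the usual BM25 free parameters; the defaults are common
    choices, not values taken from the paper.
    """
    n_docs = len(docs_tokens)
    avg_len = sum(len(doc) for doc in docs_tokens) / n_docs

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for doc in docs_tokens:
        doc_freq.update(set(doc))

    scores = []
    for doc in docs_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in set(query_tokens):
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1.0) / denom
        scores.append(score)
    return scores

# Toy example (hypothetical sentences, not drawn from IL-PCR or COLIEE).
candidates = [
    "the accused was convicted of theft by the trial court".split(),
    "the appeal concerns a property dispute between two brothers".split(),
    "the high court upheld the conviction for theft".split(),
]
query = "appeal against conviction for theft".split()
scores = bm25_scores(query, candidates)
ranking = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
```

In this toy run the third candidate shares the most query terms and is ranked first. A real system would add tokenization, stop-word handling, and an inverted index.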
Mao-Zedong At SemEval-2023 Task 4: Label Represention Multi-Head Attention Model With Contrastive Learning-Enhanced Nearest Neighbor Mechanism For Multi-Label Text Classification
Zhang, Che, Liu, Ping'an, Xiao, Zhenyang, Fei, Haojun
The study of human values is essential in both practical and theoretical domains. With the development of computational linguistics, the creation of large-scale datasets has made it possible to automatically recognize human values accurately. SemEval 2023 Task 4 [Kiesel et al., 2023] provides a set of arguments and 20 types of human values that are implicitly expressed in each argument. In this paper, we present our team's solution. We use the RoBERTa [Liu et al., 2019] model to obtain the word vector encoding of the document and propose a multi-head attention mechanism to establish connections between specific labels and semantic components. Furthermore, we use a contrastive learning-enhanced K-nearest neighbor mechanism [Su et al., 2022] to leverage existing instance information for prediction. Our approach achieved an F1 score of 0.533 on the test set and ranked fourth on the leaderboard.
Online nearest neighbor classification
We study an instance of online non-parametric classification in the realizable setting. In particular, we consider the classical 1-nearest neighbor algorithm, and show that it achieves sublinear regret - that is, a vanishing mistake rate - against dominated or smoothed adversaries in the realizable setting.
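The online protocol described above can be sketched as follows: at each round the learner predicts with the label of the nearest previously seen point, then observes the true label and stores it. This is a minimal illustration under stated assumptions: the function name `online_one_nn`, the one-dimensional integer stream, and the threshold labeling rule are hypothetical, and the fixed stream does not model the dominated or smoothed adversaries analyzed in the paper.

```python
def online_one_nn(stream, distance):
    """Run 1-nearest-neighbor online and return the number of mistakes.

    stream   : iterable of (point, true_label) pairs, revealed one at a time
    distance : function taking two points and returning a non-negative number
    """
    memory = []    # (point, label) pairs observed so far
    mistakes = 0
    for point, true_label in stream:
        if memory:
            # Predict with the label of the closest stored point.
            _, predicted = min(memory, key=lambda item: distance(point, item[0]))
        else:
            predicted = None  # nothing seen yet; counted as a mistake below
        if predicted != true_label:
            mistakes += 1
        # The true label is revealed only after the prediction is made.
        memory.append((point, true_label))
    return mistakes

# Realizable toy stream: labels come from a fixed rule, label = 1 iff x >= 50.
points = [10, 90, 20, 80, 40, 65, 45, 58]
stream = [(x, int(x >= 50)) for x in points]
mistakes = online_one_nn(stream, distance=lambda a, b: abs(a - b))
```

On this stream the learner errs only on the first two points, once with an empty memory and once before it has seen any example of label 1. Later points fall near correctly labeled neighbors, which matches the intuition behind a vanishing mistake rate.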
A Self-Encoder for Learning Nearest Neighbors
Boschin, Armand, Bonald, Thomas, Jeanmougin, Marc
We present the self-encoder, a neural network trained to guess the identity of each data sample. Despite its simplicity, it learns a very useful representation of data, in a self-supervised way. Specifically, the self-encoder learns to distribute the data samples in the embedding space so that they are linearly separable from one another. This induces a geometry where two samples are close in the embedding space when they are not easy to differentiate. The self-encoder can then be combined with a nearest-neighbor classifier or regressor for any subsequent supervised task. Unlike regular nearest neighbors, the predictions resulting from this encoding of data are invariant to any scaling of features, making any preprocessing like min-max scaling not necessary. The experiments show the efficiency of the approach, especially on heterogeneous data mixing numerical features and categorical features.
Understanding the Role of Human Intuition on Reliance in Human-AI Decision-Making with Explanations
Chen, Valerie, Liao, Q. Vera, Vaughan, Jennifer Wortman, Bansal, Gagan
AI explanations are often mentioned as a way to improve human-AI decision-making, but empirical studies have not found consistent evidence of explanations' effectiveness and, on the contrary, suggest that they can increase overreliance when the AI system is wrong. While many factors may affect reliance on AI support, one important factor is how decision-makers reconcile their own intuition -- beliefs or heuristics, based on prior knowledge, experience, or pattern recognition, used to make judgments -- with the information provided by the AI system to determine when to override AI predictions. We conduct a think-aloud, mixed-methods study with two explanation types (feature- and example-based) for two prediction tasks to explore how decision-makers' intuition affects their use of AI predictions and explanations, and ultimately their choice of when to rely on AI. Our results identify three types of intuition involved in reasoning about AI predictions and explanations: intuition about the task outcome, features, and AI limitations. Building on these, we summarize three observed pathways for decision-makers to apply their own intuition and override AI predictions. We use these pathways to explain why (1) the feature-based explanations we used did not improve participants' decision outcomes and increased their overreliance on AI, and (2) the example-based explanations we used improved decision-makers' performance over feature-based explanations and helped achieve complementary human-AI performance. Overall, our work identifies directions for further development of AI decision-support systems and explanation methods that help decision-makers effectively apply their intuition to achieve appropriate reliance on AI.
Adaptation Approaches for Nearest Neighbor Language Models
Bhardwaj, Rishabh, Polovets, George, Sunkara, Monica
Semi-parametric Nearest Neighbor Language Models ($k$NN-LMs) have produced impressive gains over purely parametric LMs, by leveraging large-scale neighborhood retrieval over external memory datastores. However, there has been little investigation into adapting such models for new domains. This work attempts to fill that gap and suggests the following approaches for adapting $k$NN-LMs -- 1) adapting the underlying LM (using Adapters), 2) expanding neighborhood retrieval over an additional adaptation datastore, and 3) adapting the weights (scores) of retrieved neighbors using a learned Rescorer module. We study each adaptation strategy separately, as well as the combined performance improvement through ablation experiments and an extensive set of evaluations run over seven adaptation domains. Our combined adaptation approach consistently outperforms purely parametric adaptation and zero-shot ($k$NN-LM) baselines that construct datastores from the adaptation data. On average, we see perplexity improvements of 17.1% and 16% for these respective baselines, across domains.
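As background for the adaptation strategies listed above, here is a minimal sketch of the usual $k$NN-LM prediction step: retrieved neighbors are turned into a distribution over next tokens, which is then interpolated with the parametric LM distribution. The function names, the temperature, the interpolation weight `lam`, and the toy numbers are assumptions for illustration. The abstract does not give the paper's exact scoring or its Rescorer, so none of that is reproduced here.

```python
import math

def knn_distribution(neighbors, temperature=1.0):
    """Convert retrieved (distance, next_token) pairs into token probabilities.

    Each neighbor votes for its stored next token with weight exp(-distance / T).
    """
    weights = {}
    for dist, token in neighbors:
        weights[token] = weights.get(token, 0.0) + math.exp(-dist / temperature)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}

def interpolate(p_lm, p_knn, lam=0.25):
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_lm."""
    vocab = set(p_lm) | set(p_knn)
    return {
        token: lam * p_knn.get(token, 0.0) + (1.0 - lam) * p_lm.get(token, 0.0)
        for token in vocab
    }

# Hypothetical LM output and retrieved neighbors for a single context.
p_lm = {"court": 0.5, "judge": 0.3, "plaintiff": 0.2}
neighbors = [(0.5, "plaintiff"), (1.0, "plaintiff"), (2.0, "court")]

p_knn = knn_distribution(neighbors)
p_mixed = interpolate(p_lm, p_knn, lam=0.25)
```

In this picture, retrieving from an additional adaptation datastore would amount to adding more `(distance, next_token)` pairs to `neighbors`, and a learned rescoring step would replace the fixed `exp(-distance / T)` weighting. Both readings are interpretations of the abstract, not details confirmed by it.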
"Private Prediction Strikes Back!" Private Kernelized Nearest Neighbors with Individual Renyi Filter
Zhu, Yuqing, Zhao, Xuandong, Guo, Chuan, Wang, Yu-Xiang
Despite its many advantages, private training lacks the flexibility in adapting to incremental changes to the training dataset such as deletion requests from exercising GDPR's right to be forgotten. We revisit a long-forgotten alternative, known as private prediction [Dwork and Feldman, 2018], and propose a new algorithm named Individual Kernelized Nearest Neighbor (Ind-KNN). Ind-KNN is easily updatable over dataset changes and it allows precise control of the Rényi DP at an individual user level -- a user's privacy loss is measured by the exact amount of her contribution to predictions; and a user is removed if her prescribed privacy budget runs out. Our results show that Ind-KNN consistently improves the accuracy over existing private prediction methods for a wide range of ε on four vision and language tasks. We also illustrate several cases under which Ind-KNN is preferable over private training with NoisySGD.
NOWJ at COLIEE 2023 -- Multi-Task and Ensemble Approaches in Legal Information Processing
Vuong, Thi-Hai-Yen, Nguyen, Hai-Long, Nguyen, Tan-Minh, Nguyen, Hoang-Trung, Nguyen, Thai-Binh, Nguyen, Ha-Thanh
This paper presents the NOWJ team's approach to the COLIEE 2023 Competition, which focuses on advancing legal information processing techniques and applying them to real-world legal scenarios. Our team tackles the four tasks in the competition, which involve legal case retrieval, legal case entailment, statute law retrieval, and legal textual entailment. We employ state-of-the-art machine learning models and innovative approaches, such as BERT, Longformer, BM25-ranking algorithm, and multi-task learning models. Although our team did not achieve state-of-the-art results, our findings provide valuable insights and pave the way for future improvements in legal information processing.
Voice Conversion With Just Nearest Neighbors
Baas, Matthew, van Niekerk, Benjamin, Kamper, Herman
Any-to-any voice conversion aims to transform source speech into a target voice with just a few examples of the target speaker as a reference. Recent methods produce convincing conversions, but at the cost of increased complexity -- making results difficult to reproduce and build on. Instead, we keep it simple. We propose k-nearest neighbors voice conversion (kNN-VC): a straightforward yet effective method for any-to-any conversion. First, we extract self-supervised representations of the source and reference speech. To convert to the target speaker, we replace each frame of the source representation with its nearest neighbor in the reference. Finally, a pretrained vocoder synthesizes audio from the converted representation. Objective and subjective evaluations show that kNN-VC improves speaker similarity with similar intelligibility scores to existing methods. Code, samples, trained models: https://bshall.github.io/knn-vc
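The frame-replacement step described above can be sketched as a nearest-neighbor lookup over feature vectors. This is a minimal sketch rather than the released implementation: the function names, the use of cosine distance, the option to average the `k` closest frames, and the two-dimensional toy features are assumptions. Real frames would come from a self-supervised speech encoder, and the output would be passed to a vocoder.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def knn_convert(source_frames, reference_frames, k=1):
    """Replace each source frame with the mean of its k nearest reference frames.

    With k=1 this is exactly "replace each frame with its nearest neighbor".
    """
    converted = []
    for frame in source_frames:
        nearest = sorted(
            reference_frames, key=lambda ref: cosine_distance(frame, ref)
        )[:k]
        # Average the selected reference frames dimension by dimension.
        converted.append([sum(values) / len(nearest) for values in zip(*nearest)])
    return converted

# Hypothetical 2-dimensional "frames" standing in for speech features.
source = [[1.0, 0.1], [0.1, 1.0]]
reference = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
converted = knn_convert(source, reference, k=1)
```

Every output frame is built only from reference frames, so the converted sequence keeps the source's frame order (the content) while taking its feature values from the target speaker's data.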
'Transformers' star on becoming a doomsday prepper: Planning in case 's*** hits the fan' in Los Angeles
"Transformers" and "Las Vegas" star Josh Duhamel has spoken out about becoming a doomsday prepper, stating that he's planning on protecting his family if the "s*** hits the fan" in Los Angeles. The actor gave an interview in which he explained, "I've become a bit of a doomsday prepper, I guess." Duhamel told the website Inverse, "I'm learning how to hunt." He added, "Suddenly I had 54 acres out there."