AITopics

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Texas (0.04)
North America > United States > Hawaii (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing SystemsMar-13-2024, 10:37:35 GMT

Near-optimal sample compression for nearest neighbors

We present the first sample compression algorithm for nearest neighbors with nontrivial performance guarantees. We complement these guarantees by demonstrating almost matching hardness lower bounds, which show that our bound is nearly optimal. Our result yields new insight into margin-based nearest neighbor classification in metric spaces and allows us to significantly sharpen and simplify existing bounds. Some encouraging empirical results are also presented.

algorithm, nearest neighbor, neighbor, (14 more...)

Country: Asia > Middle East > Israel > Southern District > Beer-Sheva (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.47)

Neural Information Processing SystemsMar-13-2024, 07:45:32 GMT

The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification

We present the Bayesian Case Model (BCM), a general framework for Bayesian case-based reasoning (CBR) and prototype classification and clustering. BCM brings the intuitive power of CBR to a Bayesian generative framework. The BCM learns prototypes, the "quintessential" observations that best represent clusters in a dataset, by performing joint inference on cluster labels, prototypes and important features. Simultaneously, BCM pursues sparsity by learning subspaces, the sets of features that play important roles in the characterization of the prototypes. The prototype and subspace representation provides quantitative benefits in interpretability while preserving classification accuracy. Human subject experiments verify statistically significant improvements to participants' understanding when using explanations produced by BCM, compared to those given by prior art.

bcm, dataset, prototype, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.47)

Industry:

Media > Film (0.69)
Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Neural Information Processing SystemsMar-12-2024, 10:46:22 GMT

An algorithm for l nearest neighbor search via monotonic embedding Xinan Wang Fast algorithms for nearest neighbor (NN) search have in large part focused on l

We rigorously establish the correctness of the methodology and show by experimentation using LSH that it is competitive in practice with available alternatives.

machine learning, natural language, projection, (17 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.05)
Asia > Afghanistan > Parwan Province > Charikar (0.05)
North America > United States > New York (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.81)

arXiv.org Artificial IntelligenceMar-3-2024

Logic Rules as Explanations for Legal Case Retrieval

Sun, Zhongxiang, Zhang, Kepu, Yu, Weijie, Wang, Haoyu, Xu, Jun

In this paper, we address the issue of using logic rules to explain the results from legal case retrieval. The task is critical to legal case retrieval because the users (e.g., lawyers or judges) are highly specialized and require the system to provide logical, faithful, and interpretable explanations before making legal decisions. Recently, research efforts have been made to learn explainable legal case retrieval models. However, these methods usually select rationales (key sentences) from the legal cases as explanations, failing to provide faithful and logically correct explanations. In this paper, we propose Neural-Symbolic enhanced Legal Case Retrieval (NS-LCR), a framework that explicitly conducts reasoning on the matching of legal cases through learning case-level and law-level logic rules. The learned rules are then integrated into the retrieval process in a neuro-symbolic manner. Benefiting from the logic and interpretable nature of the logic rules, NS-LCR is equipped with built-in faithful explainability. We also show that NS-LCR is a model-agnostic framework that can be plugged in for multiple legal retrieval models. To showcase NS-LCR's superiority, we enhance existing benchmarks by adding manually annotated logic rules and introducing a novel explainability metric using Large Language Models (LLMs). Our comprehensive experiments reveal NS-LCR's effectiveness for ranking, alongside its proficiency in delivering reliable explanations for legal case retrieval.

explanation, logic rule, retrieval, (15 more...)

2403.01457

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > India > Karnataka > Bengaluru (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Neural Information Processing SystemsFeb-16-2024, 09:52:26 GMT

Estimation of Rényi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs

We present simple and computationally efficient nonparametric estimators of R\'enyi entropy and mutual information based on an i.i.d. The estimators are calculated as the sum of p -th powers of the Euclidean lengths of the edges of the generalized nearest-neighbor' graph of the sample and the empirical copula of the sample respectively. For the first time, we prove the almost sure consistency of these estimators and upper bounds on their rates of convergence, the latter of which under the assumption that the density underlying the sample is Lipschitz continuous. Experiments demonstrate their usefulness in independent subspace analysis.

estimation, generalized nearest-neighbor graph, nyi entropy and mutual information, (1 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.68)

Kilic, Kordag Mehmet, Sima, Jin, Bruck, Jehoshua

Nearest Neighbor Representations of Neurons

arXiv.org Artificial IntelligenceFeb-13-2024

The Nearest Neighbor (NN) Representation is an emerging computational model that is inspired by the brain. We study the complexity of representing a neuron (threshold function) using the NN representations. It is known that two anchors (the points to which NN is computed) are sufficient for a NN representation of a threshold function, however, the resolution (the maximum number of bits required for the entries of an anchor) is $O(n\log{n})$. In this work, the trade-off between the number of anchors and the resolution of a NN representation of threshold functions is investigated. We prove that the well-known threshold functions EQUALITY, COMPARISON, and ODD-MAX-BIT, which require 2 or 3 anchors and resolution of $O(n)$, can be represented by polynomially large number of anchors in $n$ and $O(\log{n})$ resolution. We conjecture that for all threshold functions, there are NN representations with polynomially large size and logarithmic resolution in $n$.

nearest neighbor representation, neuron

2402.08748

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.60)

Oguri, Yutaro, Matsui, Yusuke

Theoretical and Empirical Analysis of Adaptive Entry Point Selection for Graph-based Approximate Nearest Neighbor Search

arXiv.org Artificial IntelligenceFeb-7-2024

We present a theoretical and empirical analysis of the adaptive entry point selection for graph-based approximate nearest neighbor search (ANNS). We introduce novel concepts: $b\textit{-monotonic path}$ and $B\textit{-MSNET}$, which better capture an actual graph in practical algorithms than existing concepts like MSNET. We prove that adaptive entry point selection offers better performance upper bound than the fixed central entry point under more general conditions than previous work. Empirically, we validate the method's effectiveness in accuracy, speed, and memory usage across various datasets, especially in challenging scenarios with out-of-distribution data and hard instances. Our comprehensive study provides deeper insights into optimizing entry points for graph-based ANNS for real-world high-dimensional data applications.

adaptive entry point selection, diskann, theoretical and empirical analysis, (7 more...)

2402.04713

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Oceania > Australia (0.04)

Genre: Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.64)

arXiv.org Artificial IntelligenceFeb-2-2024

Shapelet-based Model-agnostic Counterfactual Local Explanations for Time Series Classification

Huang, Qi, Chen, Wei, Bäck, Thomas, van Stein, Niki

In this work, we propose a model-agnostic instance-based post-hoc explainability method for time series classification. The proposed algorithm, namely Time-CF, leverages shapelets and TimeGAN to provide counterfactual explanations for arbitrary time series classifiers. We validate the proposed method on several real-world univariate time series classification tasks from the UCR Time Series Archive. The results indicate that the counterfactual instances generated by Time-CF when compared to state-of-the-art methods, demonstrate better performance in terms of four explainability metrics: closeness, sensibility, plausibility, and sparsity.

classifier, counterfactual explanation, explanation, (8 more...)

2402.01343

Country:

Europe > Netherlands > South Holland > Leiden (0.05)
Europe > Switzerland (0.04)
Europe > Spain (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.38)

Zhang, Mike, van der Goot, Rob, Kan, Min-Yen, Plank, Barbara

NNOSE: Nearest Neighbor Occupational Skill Extraction

arXiv.org Artificial IntelligenceJan-30-2024

The labor market is changing rapidly, prompting increased interest in the automatic extraction of occupational skills from text. With the advent of English benchmark job description datasets, there is a need for systems that handle their diversity well. We tackle the complexity in occupational skill datasets tasks -- combining and leveraging multiple datasets for skill extraction, to identify rarely observed skills within a dataset, and overcoming the scarcity of skills across datasets. In particular, we investigate the retrieval-augmentation of language models, employing an external datastore for retrieving similar skills in a dataset-unifying manner. Our proposed method, \textbf{N}earest \textbf{N}eighbor \textbf{O}ccupational \textbf{S}kill \textbf{E}xtraction (NNOSE) effectively leverages multiple datasets by retrieving neighboring skills from other datasets in the datastore. This improves skill extraction \emph{without} additional fine-tuning. Crucially, we observe a performance gain in predicting infrequent patterns, with substantial gains of up to 30\% span-F1 in cross-dataset settings.

computational linguistic, dataset, datastore, (13 more...)

2401.17092

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.14)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(15 more...)

Genre: Research Report (1.00)

Industry: Banking & Finance > Economy (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.46)
(2 more...)