Goto

Collaborating Authors

 Case-Based Reasoning


The 25th International Conference on Case-Based Reasoning

AI Magazine

Usually, a CBR process is composed of four steps, namely: retrieve (selection of one or several case(s) from the base); reuse (adaptation of the retrieved case(s) to solve the new problem); revise (presentation of the newly formed case to application domain experts and, as appropriate, corrections to it); and retain (addition of the revised case to the case base, if this addition is judged useful). CBR is an active field of ICCBR is not only an important venue for presenting research that is application-and theory-driven, and it CBRrelated research. It is also an important relates to both machine learning and knowledge representation. Generous funding from NTNU, the Norwegian Each day of the conference began with an invited Research Council, and our other sponsors allowed talk. On the first day, Henri Prade presented an introduction the conference to cover all the meals for the attendees to analogical proportions and analogical reasoning during the conference.


The Continuous Hint Factory - Providing Hints in Vast and Sparsely Populated Edit Distance Spaces

arXiv.org Artificial Intelligence

Intelligent tutoring systems can support students in solving multi-step tasks by providing hints regarding what to do next. However, engineering such next-step hints manually or via an expert model becomes infeasible if the space of possible states is too large. Therefore, several approaches have emerged to infer next-step hints automatically, relying on past students' data. In particular, the Hint Factory (Barnes & Stamper, 2008) recommends edits that are most likely to guide students from their current state towards a correct solution, based on what successful students in the past have done in the same situation. Still, the Hint Factory relies on student data being available for any state a student might visit while solving the task, which is not the case for some learning tasks, such as open-ended programming tasks. In this contribution we provide a mathematical framework for edit-based hint policies and, based on this theory, propose a novel hint policy to provide edit hints in vast and sparsely populated state spaces. In particular, we extend the Hint Factory by considering data of past students in all states which are similar to the student's current state and creating hints approximating the weighted average of all these reference states. Because the space of possible weighted averages is continuous, we call this approach the Continuous Hint Factory. In our experimental evaluation, we demonstrate that the Continuous Hint Factory can predict more accurately what capable students would do compared to existing prediction schemes on two learning tasks, especially in an open-ended programming task, and that the Continuous Hint Factory is comparable to existing hint policies at reproducing tutor hints on a simple UML diagram task.


Judge Won't Toss Manafort Case Based on Leak Allegations

U.S. News

In a statement, AP spokeswoman Lauren Easton said that AP journalists "met with representatives from the Department of Justice in an effort to get information on stories they were reporting, as reporters do. During the course of the meeting, they asked DOJ representatives about a storage locker belonging to Paul Manafort, without sharing its name or location."


Approximate Nearest Neighbor Search in High Dimensions

arXiv.org Machine Learning

The nearest neighbor problem is defined as follows: Given a set $P$ of $n$ points in some metric space $(X,D)$, build a data structure that, given any point $q$, returns a point in $P$ that is closest to $q$ (its "nearest neighbor" in $P$). The data structure stores additional information about the set $P$, which is then used to find the nearest neighbor without computing all distances between $q$ and $P$. The problem has a wide range of applications in machine learning, computer vision, databases and other fields. To reduce the time needed to find nearest neighbors and the amount of memory used by the data structure, one can formulate the {\em approximate} nearest neighbor problem, where the the goal is to return any point $p' \in P$ such that the distance from $q$ to $p'$ is at most $c \cdot \min_{p \in P} D(q,p)$, for some $c \geq 1$. Over the last two decades, many efficient solutions to this problem were developed. In this article we survey these developments, as well as their connections to questions in geometric functional analysis and combinatorial geometry.


New approximate nearest neighbor benchmarks

#artificialintelligence

Anyway, at some point I got a bit tired of reading papers of various algorithms claiming to be the fastest and most accurate, so I built a benchmark suite called ann-benchmarks. It pits a number of algorithms in a brutal showdown. I recently Dockerized it and wrote about it previously on this blog. So why am I blogging about it just three months later? Wellโ€ฆthere's a lot of water under the bridge in the world of approximate nearest neighbors, so I decided to re-run the benchmarks and publish new results. I will probably do this a few times every year, at my own questionable discretion.


Binary Classification in Unstructured Space With Hypergraph Case-Based Reasoning

arXiv.org Artificial Intelligence

Binary classification is one of the most common problem in machine learning. It consists in predicting whether a given element is of a particular class. In this paper, a new algorithm for binary classification is proposed using a hypergraph representation. Each element to be classified is partitioned according to its interactions with the training set. For each class, the total support is calculated as a convex combination of the {\it evidence} strength of the element of the partition. The evidence measure is pre-computed using the hypergraph induced by the training set and iteratively adjusted through a training phase. It does not require structured information, each case being represented by a set of {\it agnostic information} atoms. Empirical validation demonstrates its high potential on a wide range of well-known datasets and the results are compared to the state-of-art. The time complexity is given and empirically validated. Its capacity to provide good performances without hyperparameter tuning compared to standard classification methods is studied. Finally, the limitation of the model space is discussed and some potential solutions proposed.



Evaluating CBR Similarity Functions for BAM Switching in Networks with Dynamic Traffic Profile

arXiv.org Artificial Intelligence

In an increasingly complex scenario for network management, a solution that allows configuration in more autonomous way with less intervention of the network manager is expected. This paper presents an evaluation of similarity functions that are necessary in the context of using a learning strategy for finding solutions. The learning approach considered is based on Case-Based Reasoning (CBR) and is applied to a network scenario where different Bandwidth Allocation Models (BAMs) behaviors are used and must be eventually switched looking for the best possible network operation. In this context, it is required to identify and configure an adequate similarity function that will be used in the learning process to recover similar solutions previously considered. This paper introduces the similarity functions, explains the relevant aspects of the learning process in which the similarity function plays a role and, finally, presents a proof of concept for a specific similarity function adopted. Results show that the similarity function was capable to get similar results from the existing use case database. As such, the use of similarity functions with CBR technique has proved to be potentially satisfactory for supporting BAM switching decisions mostly driven by the dynamics of input traffic profile.


Nearest neighbor density functional estimation based on inverse Laplace transform

arXiv.org Machine Learning

A general approach to $L_2$-consistent estimation of various density functionals using $k$-nearest neighbor distances is proposed, along with the analysis of convergence rates in mean squared error. The construction of the estimator is based on inverse Laplace transforms related to the target density functional, which arises naturally from the convergence of a normalized volume of $k$-nearest neighbor ball to a Gamma distribution in the sample limit. Some instantiations of the proposed estimator rediscover existing $k$-nearest neighbor based estimators of Shannon and Renyi entropies and Kullback--Leibler and Renyi divergences, and discover new consistent estimators for many other functionals, such as Jensen--Shannon divergence and generalized entropies and divergences. A unified finite-sample analysis of the proposed estimator is presented that builds on a recent result by Gao, Oh, and Viswanath (2017) on the finite sample behavior of the Kozachenko--Leoneko estimator of entropy.


An Optimal Footprint Method for Case-Base Maintenance

AAAI Conferences

In Case-Based Reasoning (CBR), new problems are solved by retrieving similar previously solved cases and adapting their solutions. The new case is then stored appropriately in the case-base for future use. It is a fundamental problem to control the growth of case-base and the case-base maintenance step retains cases in the case-base based on an estimate of their usefulness in solving new problems. We propose an optimization formulation to identify an optimal set of representative cases called the optimal footprint of the case-base. The optimization formulation ensures that the optimal footprint set strikes a right trade-off between minimizing the number of cases and maximizing their ability to solve the remaining cases in the case-base. This trade-off is studied empirically in this paper. We also illustrate the trade-off between the size and performance of optimal footprint in the context of regression.