Case-Based Reasoning
SEC selects Casepoint for its cloud-based e-discovery pilot
Casepoint was selected over a number of other e-discovery technology solutions after undergoing a rigorous multi-step evaluation process. The SEC's evaluation factors included software features and functionality, cybersecurity, management and key personnel, past performance, and a competitive proof-of-concept process. The SEC receives approximately 3,500 new productions, totaling nearly 10TB after processing, each month. The SEC's existing e-discovery systems contain almost 1PB of data across 7,300 cases and must support 600 users in its Washington, D.C. headquarters and 1,200 additional users across its 11 regional offices nationwide. It was imperative that the SEC choose an enterprise-scale solution that is fast, powerful, and easy to use.
Fast and Bayes-consistent nearest neighbors
Efremenko, Klim, Kontorovich, Aryeh, Noivirt, Moshe
Research on nearest-neighbor methods tends to focus somewhat dichotomously either on the statistical or the computational aspects -- either on, say, Bayes consistency and rates of convergence or on techniques for speeding up the proximity search. This paper aims at bridging these realms: to reap the advantages of fast evaluation time while maintaining Bayes consistency, and further without sacrificing too much in the risk decay rate. We combine the locality-sensitive hashing (LSH) technique with a novel missing-mass argument to obtain a fast and Bayes-consistent classifier. Our algorithm's prediction runtime compares favorably against state-of-the-art approximate NN methods, while maintaining Bayes consistency and attaining rates comparable to minimax. On samples of size $n$ in $\mathbb{R}^d$, our pre-processing phase has runtime $O(d n \log n)$, while the evaluation phase has runtime $O(d \log n)$ per query point.
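To make the idea of hashing-based classification concrete, here is a minimal sketch of a classifier that hashes points into cells and predicts by majority vote within the query's cell. It is not the paper's algorithm: the random-hyperplane hash family, the class name `LSHClassifier`, the `n_bits` parameter, and the `default` fallback for empty cells are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


class LSHClassifier:
    """Toy LSH classifier: majority label among training points in the query's hash cell."""

    def __init__(self, dim, n_bits, seed=0):
        rng = random.Random(seed)
        # Assumed hash family: one random hyperplane through the origin per bit.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = defaultdict(Counter)

    def _key(self, x):
        # One bit per hyperplane: which side of the plane the point lies on.
        return tuple(sum(w * v for w, v in zip(plane, x)) >= 0 for plane in self.planes)

    def fit(self, points, labels):
        # Pre-processing: count labels per hash cell.
        for x, y in zip(points, labels):
            self.buckets[self._key(x)][y] += 1
        return self

    def predict(self, x, default=None):
        # Evaluation: hash the query and vote inside its cell; empty cells get `default`.
        votes = self.buckets.get(self._key(x))
        return votes.most_common(1)[0][0] if votes else default
```

For example, `LSHClassifier(dim=2, n_bits=4).fit([(1, 1), (-1, -1)], ["a", "b"]).predict((2, 2))` hashes `(2, 2)` to the same cell as `(1, 1)`, since positive scaling does not change which side of a hyperplane through the origin a point lies on.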
Method for the semantic indexing of concept hierarchies, uniform representation, use of relational database systems and generic and case-based reasoning
Petersohn, Uwe, Zimmer, Sandra, Lehmann, Jens
This paper presents a method for semantic indexing and describes its application in the field of knowledge representation. Starting point of the semantic indexing is the knowledge represented by concept hierarchies. The goal is to assign keys to nodes (concepts) that are hierarchically ordered and syntactically and semantically correct. With the indexing algorithm, keys are computed such that concepts are partially unifiable with all more specific concepts and only semantically correct concepts are allowed to be added. The keys represent terminological relationships. Correctness and completeness of the underlying indexing algorithm are proven. The use of classical relational databases for the storage of instances is described. Because of the uniform representation, inference can be done using case-based reasoning and generic problem solving methods.
Artificial Intelligence: The Growth Factor for Budding Entrepreneurs in Home Automation Industry
The entrance of the world into the digital age has overhauled almost every aspect of life, and one of the most noticeable evolutions is 'the Smart Homes of New Age'. Automation, defined as a method, technique, or arrangement for operating or controlling a procedure with electronic gadgets while reducing human interference to a minimum, has risen as a new industry vertical in the last two decades. Gone are the days when someone had to check the house twice before leaving so that no lights, fans, or appliances were left switched on. It's the age of automation, where relay modules, sensors, and automated systems take care of the optimal usage of electricity and all devices. Moreover, the past years have witnessed AI evolving as a technology for developing automatic systems and making decisions using case-based reasoning.
Legal AI Platform for the Future: Singularity is Near - Fintech Circle
The use of Artificial Intelligence (AI) to predict legal and judicial decisions based on criteria such as penal codes, state laws, and legal precedent is rapidly evolving. Predictions that legal case management software will use AI techniques for case-based reasoning are increasingly prominent. AI deep learning platforms used in legal practices will be able to carry out client management and updates as well as legal service alerts management. Platforms will allow lawyers to monitor the progress of matters, resource commitments, and budget status in real time on a case-by-case basis. For lawyers, they may also provide a gateway to the firm's prior workflows.
Rates of Convergence for Large-scale Nearest Neighbor Classification
Qiao, Xingye, Duan, Jiexin, Cheng, Guang
Nearest neighbor is a popular class of classification methods with many desirable properties. For a large data set which cannot be loaded into the memory of a single machine due to computation, communication, privacy, or ownership limitations, we consider the divide and conquer scheme: the entire data set is divided into small subsamples, on which nearest neighbor predictions are made, and then a final decision is reached by aggregating the predictions on subsamples by majority voting. We name this method the big Nearest Neighbor (bigNN) classifier, and provide its rates of convergence under minimal assumptions, in terms of both the excess risk and the classification instability, which are proven to be the same rates as the oracle nearest neighbor classifier and cannot be improved. To significantly reduce the prediction time that is required for achieving the optimal rate, we also consider the pre-training acceleration technique applied to the bigNN method, with proven convergence rate. We find that in the distributed setting, the optimal choice of the number of neighbors k should scale with both the total sample size and the number of partitions, and there is a theoretical upper limit for the latter. Numerical studies have verified the theoretical findings.
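The divide and conquer scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the random equal-size partitioning, the squared Euclidean distance, and the `seed` parameter are assumptions, and the pre-training acceleration step is not shown.

```python
import random
from collections import Counter


def knn_predict(points, labels, query, k):
    """Plain k-NN majority vote on one subsample (squared Euclidean distance)."""
    order = sorted(
        range(len(points)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], query)),
    )
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def bignn_predict(points, labels, query, n_parts, k, seed=0):
    """Split the data into n_parts subsamples, run k-NN on each, then majority-vote."""
    idx = list(range(len(points)))
    random.Random(seed).shuffle(idx)                   # assumed: random partitioning
    parts = [idx[p::n_parts] for p in range(n_parts)]  # near-equal subsample sizes
    local = [
        knn_predict([points[i] for i in part], [labels[i] for i in part], query, k)
        for part in parts
    ]
    # Final decision: majority vote over the per-subsample predictions.
    return Counter(local).most_common(1)[0][0]
```

In a real distributed setting each subsample would live on a different machine and only the local predictions would be communicated; here everything runs in one process for clarity.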
Nearest Neighbor Search-Based Bitwise Source Separation Using Discriminant Winner-Take-All Hashing
We propose an iteration-free source separation algorithm based on Winner-Take-All (WTA) hash codes, which is a faster, yet accurate alternative to a complex machine learning model for single-channel source separation in a resource-constrained environment. We first generate random permutations with WTA hashing to encode the shape of the multidimensional audio spectrum to a reduced bitstring representation. A nearest neighbor search on the hash codes of an incoming noisy spectrum as the query string results in the closest matches among the hashed mixture spectra. Using the indices of the matching frames, we obtain the corresponding ideal binary mask vectors for denoising. Since both the training data and the search operation are bitwise, the procedure can be done efficiently in hardware implementations. Experimental results show that the WTA hash codes are discriminant and provide an affordable dictionary search mechanism that leads to a competent performance compared to a comprehensive model and oracle masking.
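The lookup pipeline described above (WTA encoding, nearest neighbor search on the codes, mask retrieval) can be sketched roughly as below. The sketch uses purely random permutations and a Hamming-style mismatch count; it does not reproduce the discriminant permutation selection the abstract refers to, and all function names and the `window` parameter are hypothetical.

```python
import random


def make_permutations(dim, n_perms, window, seed=0):
    """Each entry holds the first `window` indices of a random permutation of range(dim)."""
    rng = random.Random(seed)
    return [rng.sample(range(dim), window) for _ in range(n_perms)]


def wta_hash(vec, perms):
    """For each permutation window, record the position of the largest element."""
    return [max(range(len(p)), key=lambda j: vec[p[j]]) for p in perms]


def hamming(a, b):
    """Number of code positions that differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_mask(query, train_codes, train_masks, perms):
    """Return the binary mask paired with the closest training code."""
    q = wta_hash(query, perms)
    best = min(range(len(train_codes)), key=lambda i: hamming(q, train_codes[i]))
    return train_masks[best]
```

Because a WTA code depends only on the relative ordering of the values in each window, it is unchanged by any increasing rescaling of the spectrum, which is one reason such codes can capture the shape of a spectrum rather than its level.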
Comparison of Artificial Intelligence Techniques for Project Conceptual Cost Prediction
Developing a reliable parametric cost model at the conceptual stage of a project is crucial for project managers and decision-makers. Existing methods, such as probabilistic and statistical algorithms, have been developed for project cost prediction. However, these methods are unable to produce accurate results for conceptual cost prediction due to small and unstable data samples. Artificial intelligence (AI) and machine learning (ML) include numerous models and algorithms for supervised regression applications. Therefore, a comparative analysis of AI models is required to guide practitioners to the appropriate model. The study investigates twenty AI techniques applied to cost modeling, such as the fuzzy logic (FL) model, artificial neural networks (ANNs), multiple regression analysis (MRA), case-based reasoning (CBR), hybrid models, and ensemble methods such as scalable boosting trees (XGBoost). Field canals improvement projects (FCIPs) are used as an actual case study to analyze the performance of the applied ML models. Out of the 20 AI techniques, the results showed that the most accurate and suitable method is XGBoost, with a Mean Absolute Percentage Error (MAPE) of 9.091% and an adjusted $R^2$ of 0.929. Nonlinear adaptability, handling of missing values and outliers, model interpretation, and uncertainty are discussed for the twenty developed AI models. Keywords: Artificial intelligence, Machine learning, ensemble methods, XGBoost, evolutionary fuzzy rules generation, Conceptual cost, and parametric cost model.
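For readers unfamiliar with the two reported metrics, the following sketch shows how MAPE and adjusted R2 are conventionally computed. These are the standard textbook definitions; the study's exact evaluation code is not given in the abstract, and the function names and the `n_predictors` argument are assumptions.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def adjusted_r2(actual, predicted, n_predictors):
    """R^2 penalized for the number of predictors used by the model."""
    n = len(actual)
    mean = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean) ** 2 for a in actual)                  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
```

Lower MAPE and higher adjusted R2 both indicate a better fit; the adjustment matters most when the sample is small relative to the number of predictors, which is the situation the abstract describes for conceptual cost data.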
A Electric Network Reconfiguration Strategy with Case-Based Reasoning for the Smart Grid
Calhau, Flavio G., Martins, Joberto S. B.
The complexity, heterogeneity and scale of electrical networks in the Smart Grid (SG) have grown far beyond the limits of exclusively human-based management. Researchers are therefore considering the use of artificial intelligence and heuristic techniques to create cognitive and autonomic management tools that aim to better assist and enhance SG management processes such as grid reconfiguration. The development of self-healing management approaches towards a cognitive and autonomic distribution power network reconfiguration is a scenario in which scalability and on-the-fly computation are issues. This paper proposes the use of Case-Based Reasoning (CBR) coupled with the HATSGA algorithm for the fast reconfiguration of large distribution power networks. The suitability and the scalability of the CBR-based reconfiguration strategy using the HATSGA algorithm are evaluated. The evaluation indicates that the adopted HATSGA algorithm computes new reconfiguration topologies in a feasible computational time for large networks. The CBR strategy looks for managerially acceptable reconfiguration solutions in the CBR database and, as such, helps reduce the number of reconfiguration computations required from HATSGA. This suggests CBR can be applied with a fast reconfiguration algorithm, resulting in a more efficient, dynamic and cognitive grid recovery strategy.
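The retrieve-then-compute pattern described above can be sketched as a small reuse loop. This is only an illustration of generic case-based reasoning: the numeric problem descriptor, the Manhattan distance, the `max_distance` threshold, and the `solve` callback (standing in for a reconfiguration algorithm such as HATSGA, whose interface is not given in the abstract) are all assumptions.

```python
def cbr_reconfigure(problem, case_base, solve, max_distance):
    """Reuse a stored solution for a similar problem, else compute and retain a new one.

    problem      -- tuple of numbers describing the network state (hypothetical encoding)
    case_base    -- list of (problem, solution) pairs, updated in place
    solve        -- fallback solver, called only when no stored case is close enough
    Returns (solution, reused_flag).
    """
    def distance(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    if case_base:
        # Retrieve: find the most similar stored case.
        best_problem, best_solution = min(case_base, key=lambda c: distance(c[0], problem))
        if distance(best_problem, problem) <= max_distance:
            return best_solution, True  # Reuse: no new computation needed.

    solution = solve(problem)              # Fall back to the expensive solver.
    case_base.append((problem, solution))  # Retain the new case for future queries.
    return solution, False
```

Each successful retrieval avoids one solver call, which is the mechanism by which a case base can reduce the number of reconfiguration computations.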
A Comparison of Super-Resolution and Nearest Neighbors Interpolation Applied to Object Detection on Satellite Data
Koester, Evan, Sahin, Cem Safak
As Super-Resolution (SR) has matured as a research topic, it has been applied to additional topics beyond image reconstruction. In particular, combining classification or object detection tasks with a super-resolution preprocessing stage has yielded improvements in accuracy especially with objects that are small relative to the scene. While SR has shown promise, a study comparing SR and naive upscaling methods such as Nearest Neighbors (NN) interpolation when applied as a preprocessing step for object detection has not been performed. We apply the topic to satellite data and compare the Multi-scale Deep Super-Resolution (MDSR) system to NN on the xView challenge dataset. To do so, we propose a pipeline for processing satellite data that combines multi-stage image tiling and upscaling, the YOLOv2 object detection architecture, and label stitching. We compare the effects of training models using an upscaling factor of 4, upscaling images from 30cm Ground Sample Distance (GSD) to an effective GSD of 7.5cm. Upscaling by this factor significantly improves detection results, increasing Average Precision (AP) of a generalized vehicle class by 23 percent. We demonstrate that while SR produces upscaled images that are more visually pleasing than their NN counterparts, object detection networks see little difference in accuracy with images upsampled using NN obtaining nearly identical results to the MDSRx4 enhanced images with a difference of 0.0002 AP between the two methods.
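As a point of reference for the naive baseline, nearest-neighbor upscaling by an integer factor simply repeats each pixel along both axes. The sketch below shows this on a list-of-lists image, together with the GSD arithmetic from the abstract; the function names are hypothetical, and the tiling, YOLOv2 detection, and label stitching stages of the pipeline are not shown.

```python
def nn_upscale(image, factor):
    """Nearest-neighbor upscaling: repeat each pixel `factor` times along both axes."""
    out = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]   # repeat horizontally
        out.extend(list(wide) for _ in range(factor))      # repeat vertically
    return out


def effective_gsd(gsd_cm, factor):
    """Ground Sample Distance per pixel after upscaling by `factor`."""
    return gsd_cm / factor
```

With a factor of 4, `effective_gsd(30, 4)` gives 7.5, matching the 30cm to 7.5cm figure quoted above. Note that this upscaling adds no new information to the image; it only changes the pixel grid the detector sees.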