User Friendly Automatic Construction of Background Knowledge: Mode Construction from ER Diagrams

arXiv.org Artificial Intelligence

One of the key advantages of Inductive Logic Programming (ILP) systems is the ability of domain experts to provide background knowledge as modes that allow for efficient search through the space of hypotheses. However, there is an inherent assumption that this expert should also be an ILP expert in order to provide effective modes. We relax this assumption by designing a graphical user interface that allows the domain expert to interact with the system using Entity Relationship (ER) diagrams. These interactions are used to construct modes for the learning system. We evaluate our algorithm on a probabilistic logic learning system, where we demonstrate on five data sets that the user is able to construct effective background knowledge on par with expert-encoded knowledge.


Classifying Inconsistency Measures Using Graphs

Journal of Artificial Intelligence Research

The aim of measuring inconsistency is to obtain an evaluation of the imperfections in a set of formulas, and this evaluation may then be used to help decide on some course of action (such as rejecting some of the formulas, resolving the inconsistency, seeking better sources of information, etc). A number of proposals have been made to define measures of inconsistency. Each has its rationale. But to date, it is not clear how to delineate the space of options for measures, nor is it clear how we can classify measures systematically. To address these problems, we introduce a general framework for comparing syntactic measures of inconsistency. It is based on the notion of an inconsistency graph for each knowledgebase (a bipartite graph with a set of vertices representing formulas in the knowledgebase, a set of vertices representing minimal inconsistent subsets of the knowledgebase, and edges representing that a formula belongs to a minimal inconsistent subset). We then show that various measures can be computed using the inconsistency graph. Then we introduce abstractions of the inconsistency graph and use them to construct a hierarchy of syntactic inconsistency measures. Furthermore, we extend the inconsistency graph concept with a labeling that extends the hierarchy to include some other types of inconsistency measures.
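
The bipartite structure described above can be illustrated with a minimal sketch. It assumes the minimal inconsistent subsets have already been computed elsewhere (here they are written by hand for a toy knowledgebase), and the function names are hypothetical rather than taken from the paper. The two quantities computed, the number of minimal inconsistent subsets and the number of formulas that belong to at least one of them, are only examples of what can be read off such a graph.

```python
from typing import FrozenSet, List, Set, Tuple

Edge = Tuple[str, int]  # (formula vertex, index of a minimal inconsistent subset)


def build_inconsistency_graph(formulas: List[str],
                              mises: List[FrozenSet[str]]) -> Set[Edge]:
    """Return the edge set of the bipartite graph: formula -> subset it belongs to."""
    known = set(formulas)
    edges: Set[Edge] = set()
    for idx, mis in enumerate(mises):
        for formula in mis:
            if formula not in known:
                raise ValueError(f"{formula!r} is not in the knowledgebase")
            edges.add((formula, idx))
    return edges


def mis_count(edges: Set[Edge]) -> int:
    """Number of subset vertices that have at least one edge."""
    return len({idx for _, idx in edges})


def problematic_formulas(edges: Set[Edge]) -> Set[str]:
    """Formulas that take part in at least one minimal inconsistent subset."""
    return {formula for formula, _ in edges}


def degree(edges: Set[Edge], formula: str) -> int:
    """How many minimal inconsistent subsets a formula belongs to."""
    return sum(1 for f, _ in edges if f == formula)


# Toy knowledgebase (an assumption for illustration); "c" is not involved in any conflict.
formulas = ["a", "not a", "a -> b", "not b", "c"]
mises = [frozenset({"a", "not a"}), frozenset({"a", "a -> b", "not b"})]
edges = build_inconsistency_graph(formulas, mises)

print(mis_count(edges), sorted(problematic_formulas(edges)), degree(edges, "a"))
```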


Robust Automated Thalamic Nuclei Segmentation using a Multi-planar Cascaded Convolutional Neural Network

arXiv.org Machine Learning

Purpose: To develop a fast, accurate, and robust convolutional neural network (CNN) based method for segmentation of thalamic nuclei. Methods: A cascaded multi-planar scheme with a modified residual U-Net architecture was used to segment thalamic nuclei on clinical datasets acquired using the white-matter-nulled Magnetization Prepared Rapid Gradient Echo (WMn-MPRAGE) sequence. A single network was optimized for healthy controls and disease types (multiple sclerosis (MS), essential tremor (ET)) and magnetic field strengths (3T and 7T). Another network was developed to use conventional MPRAGE data. Clinical utility was assessed by comparing a cohort of MS patients to healthy subjects. Results: Segmentation of each thalamus into 12 nuclei was achieved in under 4 minutes. For 7T WMn-MPRAGE, the proposed method outperformed the current state of the art, with statistically significant improvements in Dice ranging from 1.2% to 5.3% for MS patients and from 2.6% to 38.8% for ET patients. Comparable accuracy (Dice/VSI) was achieved between 7T and 3T data, attesting to the robustness of the method. For conventional MPRAGE, Dice of > 0.7 was achieved for larger nuclei and > 0.6 for the smaller nuclei. Atrophy of five thalamic nuclei and the whole thalamus was observed for MS patients compared to healthy control subjects, after controlling for intracranial volume and age (p<0.004). Conclusion: The proposed segmentation method is fast, accurate, and generalizes across disease types and field strengths, and shows great potential for improving our understanding of thalamic nuclei involvement in neurological diseases and healthy aging. Keywords: deep learning, convolutional neural network, transfer learning, thalamic nuclei segmentation


AppStreamer: Reducing Storage Requirements of Mobile Games through Predictive Streaming

arXiv.org Machine Learning

Storage has become a constrained resource on smartphones. Gaming is a popular activity on mobile devices, and the explosive growth in the number of games, coupled with their growing size, contributes to the storage crunch. Even where storage is plentiful, it takes a long time to download and install a heavy app before it can be launched. This paper presents AppStreamer, a novel technique for reducing the storage requirements or startup delay of mobile games, and heavy mobile apps in general. AppStreamer is based on the intuition that most apps do not need the entirety of their files (images, audio and video clips, etc.) at any one time. AppStreamer can, therefore, keep only a small part of the files on the device, akin to a "cache", and download the remainder from a cloud storage server or a nearby edge server when it predicts that the app will need them in the near future. AppStreamer continuously predicts file blocks for the near future as the user uses the app, and fetches them from the storage server before the user sees a stall due to missing resources. We implement AppStreamer at the Android file system layer. This ensures that apps require neither source code access nor modification, and that the approach generalizes across apps. We evaluate AppStreamer using two popular games: Dead Effect 2, a 3D first-person shooter, and Fire Emblem Heroes, a 2D turn-based strategy role-playing game. In a user study, 75% and 87% of the users, respectively, find that AppStreamer provides the same quality of user experience as the baseline where all files are stored on the device. AppStreamer cuts down the storage requirement by 87% for Dead Effect 2 and 86% for Fire Emblem Heroes.


Human Comprehension of Fairness in Machine Learning

arXiv.org Artificial Intelligence

Bias in machine learning has manifested injustice in several areas, such as medicine, hiring, and criminal justice. In response, computer scientists have developed myriad definitions of fairness to correct this bias in fielded algorithms. While some definitions are based on established legal and ethical norms, others are largely mathematical. It is unclear whether the general public agrees with these fairness definitions, and perhaps more importantly, whether they understand these definitions. We take initial steps toward bridging this gap between ML researchers and the public, by addressing the question: does a non-technical audience understand a basic definition of ML fairness? We develop a metric to measure comprehension of one such definition--demographic parity. We validate this metric using online surveys, and study the relationship between comprehension and sentiment, demographics, and the application at hand.
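
Demographic parity, the definition named above, is commonly stated as the requirement that the rate of positive decisions be the same across groups. The following is a minimal sketch of that check on toy data; the function names, the group labels and the decisions are assumptions made for illustration and do not come from the paper or its surveys.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def positive_rates(predictions: Sequence[int],
                   groups: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Fraction of positive (1) decisions within each group."""
    if len(predictions) != len(groups):
        raise ValueError("predictions and groups must have the same length")
    totals: Dict[Hashable, int] = defaultdict(int)
    positives: Dict[Hashable, int] = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(predictions: Sequence[int],
                           groups: Sequence[Hashable]) -> float:
    """Largest difference in positive rate between any two groups (0 means parity)."""
    rates = positive_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy decisions: 1 = positive outcome (e.g. hired), 0 = negative outcome.
predictions = [1, 0, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(positive_rates(predictions, groups))
print(demographic_parity_gap(predictions, groups))
```

In this toy case group A receives positive decisions half of the time and group B a quarter of the time, so the decisions do not satisfy demographic parity.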


Evaluating Usage of Images for App Classification

arXiv.org Machine Learning

App classification is useful in a number of applications, such as adding apps to an app store or building a user model based on the installed apps. A number of existing methods classify apps according to a given taxonomy on the basis of their text metadata. However, text-based methods for app classification may not work in all cases, such as when the text descriptions are in a different language, missing, or inadequate to classify the app. One solution in such cases is to use the app images to supplement the text description. In this paper, we evaluate a number of approaches in which app images can be used to classify the apps. In one approach, we use optical character recognition (OCR) to extract text from images, which is then used to supplement the text description of the app. In another, we use pic2vec to convert the app images into vectors, then train an SVM to classify the vectors to the correct app label. In another, we use the captionbot.ai tool to generate natural language descriptions from the app images. Finally, we use a method to detect and label objects in the app images and use a voting technique to determine the category of the app based on all the images. We compare the performance of our image-based techniques in classifying a number of apps in our dataset. We use a text-based SVM app classifier as our baseline and obtain an improved classification accuracy of 96% for some classes when app images are added.


Predicting the Outcome of Judicial Decisions made by the European Court of Human Rights

arXiv.org Machine Learning

In this study, machine learning models were constructed to predict whether judgments made by the European Court of Human Rights (ECHR) would lead to a violation of an Article in the Convention on Human Rights. The problem is framed as a binary classification task where a judgment can lead to a "violation" or "non-violation" of a particular Article. Using auto-sklearn, an automated algorithm selection package, models were constructed for 12 Articles in the Convention. To train these models, textual features were obtained from the ECHR Judgment documents using N-grams, word embeddings and paragraph embeddings. Additional documents, from the ECHR, were incorporated into the models through the creation of a word embedding (echr2vec) and a doc2vec model. The features obtained using the echr2vec embedding provided the highest cross-validation accuracy for 5 of the Articles. The overall test accuracy, across the 12 Articles, was 68.83%. As far as we could tell, this is the first estimate of the accuracy of such machine learning models using a realistic test set. This provides an important benchmark for future work. As a baseline, a simple heuristic of always predicting the most common outcome in the past was used. The heuristic achieved an overall test accuracy of 86.68% which is 29.7% higher than the models. Again, this was seemingly the first study that included such a heuristic with which to compare model results. The higher accuracy achieved by the heuristic highlights the importance of including such a baseline.
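
The baseline described above, always predicting the most common past outcome, can be sketched in a few lines. The labels below are toy values chosen for illustration and are not the ECHR data; the function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def fit_majority(past_labels: Sequence[str]) -> str:
    """Return the most frequent label among past outcomes."""
    if not past_labels:
        raise ValueError("need at least one past label")
    return Counter(past_labels).most_common(1)[0][0]


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of positions where the prediction matches the actual label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy history for a single Article: outcomes are imbalanced toward "violation".
past = ["violation"] * 8 + ["non-violation"] * 2
future = ["violation", "violation", "violation", "non-violation"]

majority = fit_majority(past)
predictions: List[str] = [majority] * len(future)
print(majority, accuracy(predictions, future))
```

When the classes are imbalanced, a heuristic of this kind can score well without learning anything from the text, which is why it is a useful reference point for trained models.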


Virtual Reality to Study the Gap Between Offline and Real-Time EMG-based Gesture Recognition

arXiv.org Machine Learning

Within sEMG-based gesture recognition, a chasm exists in the literature between offline accuracy and real-time usability of a classifier. This gap mainly stems from the four main dynamic factors in sEMG-based gesture recognition: gesture intensity, limb position, electrode shift and transient changes in the signal. These factors are hard to include within an offline dataset, as each of them exponentially increases the number of segments to be recorded. On the other hand, online datasets are biased towards the sEMG-based algorithms providing feedback to the participants, limiting the usability of such datasets as benchmarks. This paper proposes a virtual reality (VR) environment and a real-time experimental protocol with which the four main dynamic factors can more easily be studied. During the online experiment, the gesture recognition feedback is provided through the Leap Motion camera, enabling the proposed dataset to be re-used to compare future sEMG-based algorithms. Twenty able-bodied participants took part in this study, completing three to four sessions over a period spanning between 14 and 21 days. Finally, TADANN, a new transfer learning-based algorithm, is proposed for long-term gesture classification and significantly (p<0.05) outperforms fine-tuning a network.


Automating Vitiligo Skin Lesion Segmentation Using Convolutional Neural Networks

arXiv.org Machine Learning

For several skin conditions such as vitiligo, accurate segmentation of lesions from skin images is the primary measure of disease progression and severity. Existing methods for vitiligo lesion segmentation require manual intervention. Unfortunately, manual segmentation is time- and labor-intensive, and its results are not reproducible between physicians. We introduce a convolutional neural network (CNN) that quickly and robustly performs vitiligo skin lesion segmentation. Our CNN has a U-Net architecture with a modified contracting path. We use the CNN to generate an initial segmentation of the lesion, then refine it by running the watershed algorithm on high-confidence pixels. We train the network on 247 images with a variety of lesion sizes, complexity, and anatomical sites. The network with our modifications noticeably outperforms the state-of-the-art U-Net, with a Jaccard Index (JI) score of 73.6% (compared to 36.7%). Moreover, our method requires only a few seconds for segmentation, in contrast with the previously proposed semi-autonomous watershed approach, which requires 2-29 minutes per image.
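
The Jaccard Index quoted above is the ratio of the overlap between a predicted mask and a reference mask to their union. A minimal sketch on binary masks follows; the masks are toy values, the function name is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the abstract.

```python
from typing import Sequence


def jaccard_index(predicted: Sequence[Sequence[int]],
                  reference: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary masks of identical shape."""
    intersection = 0
    union = 0
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of rows")
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("mask rows must have the same length")
        for p, r in zip(pred_row, ref_row):
            intersection += int(bool(p) and bool(r))
            union += int(bool(p) or bool(r))
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if union == 0 else intersection / union


# Toy 2x3 masks: 1 marks a lesion pixel, 0 marks background.
predicted_mask = [[1, 1, 0],
                  [0, 1, 0]]
reference_mask = [[1, 0, 0],
                  [0, 1, 1]]

print(jaccard_index(predicted_mask, reference_mask))
```

Here two pixels are marked in both masks and four pixels are marked in at least one, so the index is 2/4.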


A Unified Framework for Speech Separation

arXiv.org Machine Learning

Speech separation refers to extracting each individual speech source from a given mixed signal. Recent advances and ongoing research in this area have made these approaches promising techniques for pre-processing naturalistic audio streams. Since deep learning techniques were incorporated into speech separation, the performance of these systems has been improving faster. The initial deep-learning-based solutions transformed the speech signals into the time-frequency domain with the short-time Fourier transform (STFT), and the encoded mixed signals were then fed into a deep-neural-network-based separator. More recently, new methods have been introduced that separate the waveform of the mixed signal directly, without STFT analysis. Here, we introduce a unified framework that includes both spectrogram and waveform separation in a single structure; the two differ only in the kernel function used to encode and decode the data, and both can achieve competitive performance. This new framework provides flexibility: depending on the characteristics of the data, or on memory and latency limitations, the hyper-parameters can be set to select the pipeline within the framework that fits the task. We extend single-channel speech separation to a multi-channel framework with end-to-end training of the network while optimizing the speech separation criterion (i.e., Si-SNR) directly. We emphasize how tied kernel functions for calculating spatial features, the encoder, and the decoder in the multi-channel framework can be effective. We simulate spatialized reverberant data for both the WSJ0 and LibriSpeech corpora, and, as these two data sets differ in size and duration, the effect of capturing shorter and longer dependencies on previous and future samples is studied in detail. We report SDR, Si-SNR and PESQ to evaluate the performance of the developed solutions.
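
The Si-SNR criterion mentioned above is usually computed by removing the mean from both signals, projecting the estimate onto the target, and comparing the energy of that projection with the energy of the residual. The sketch below follows that commonly used definition in plain Python; the function name, the epsilon term and the toy signals are assumptions for illustration, and the paper's own implementation may differ.

```python
import math
from typing import List, Sequence


def si_snr(estimate: Sequence[float], target: Sequence[float],
           eps: float = 1e-12) -> float:
    """Scale-invariant signal-to-noise ratio in dB (commonly used definition)."""
    n = len(target)
    if n == 0 or len(estimate) != n:
        raise ValueError("signals must be non-empty and of equal length")
    # Remove the mean so that a constant offset does not affect the score.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e: List[float] = [x - mean_e for x in estimate]
    t: List[float] = [x - mean_t for x in target]
    # Project the estimate onto the target to get the "clean" component.
    dot = sum(a * b for a, b in zip(e, t))
    target_energy = sum(b * b for b in t)
    scale = dot / (target_energy + eps)
    s_target = [scale * b for b in t]
    # Whatever is left after the projection is treated as noise.
    e_noise = [a - s for a, s in zip(e, s_target)]
    signal_power = sum(s * s for s in s_target)
    noise_power = sum(x * x for x in e_noise)
    return 10.0 * math.log10((signal_power + eps) / (noise_power + eps))


# Toy signals: the estimate is the target plus a small error orthogonal to it.
target = [1.0, -1.0, 1.0, -1.0]
estimate = [1.1, -0.9, 0.9, -1.1]

print(si_snr(estimate, target))
# Rescaling the estimate leaves the score unchanged up to rounding.
print(si_snr([3.0 * x for x in estimate], target))
```

Because the estimate is projected onto the target before the comparison, multiplying the estimate by a constant gain does not change the result, which is the property the "scale-invariant" name refers to.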