 Accuracy


Intimate Partner Violence and Injury Prediction From Radiology Reports

arXiv.org Artificial Intelligence

Intimate partner violence (IPV) is an urgent, prevalent, and under-detected public health issue. We present machine learning models to assess patients for IPV and injury. We train the predictive algorithms on radiology reports with 1) IPV labels based on entry to a violence prevention program and 2) injury labels provided by emergency radiology fellowship-trained physicians. Our dataset includes 34,642 radiology reports from 1,479 patients, comprising IPV victims and controls. Our best model predicts IPV a median of 3.08 years before violence prevention program entry with a sensitivity of 64% and a specificity of 95%. We conduct error analysis to determine for which patients our model has especially high or low performance and discuss next steps for a deployed clinical risk model.
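For readers less familiar with the two metrics quoted in this abstract, the following is a minimal sketch of how sensitivity and specificity are computed from binary labels and predictions. The function name and the toy labels are hypothetical illustrations, not data or code from the paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels (hypothetical, not from the paper): 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity is the fraction of true positives that are flagged, and specificity is the fraction of true negatives that are correctly left unflagged, so a high-specificity operating point limits false alarms at the cost of missing some positive cases.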


Testing for Normality with Neural Networks

arXiv.org Machine Learning

In this paper, we treat the problem of testing for normality as a binary classification problem and construct a feedforward neural network that can successfully detect normal distributions by inspecting small samples from them. The numerical experiments conducted on small samples with no more than 100 elements indicated that the neural network we trained was more accurate and far more powerful than the most frequently used and most powerful standard tests of normality: Shapiro-Wilk, Anderson-Darling, Lilliefors and Jarque-Bera, as well as the kernel tests of goodness-of-fit. The neural network had an AUROC score of almost 1, which corresponds to the perfect binary classifier. Additionally, the network's accuracy was higher than 96% on a set of larger samples with 250-1000 elements. Since the normality of data is an assumption of numerous techniques for analysis and inference, the neural network constructed in this study has a very high potential for use in everyday practice of statistics, data analysis and machine learning in both science and industry.
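As a point of reference for the classical baselines named above, here is a minimal sketch of one of them, the Jarque-Bera test, written from its textbook definition. It is not the paper's neural network, and the sample values are hypothetical. The p-value uses the asymptotic chi-square approximation with 2 degrees of freedom, which is known to be rough for small samples.

```python
import math


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a sequence of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    # Biased (divide-by-n) central moments, as in the textbook definition.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Survival function of a chi-square variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2)
    return jb, p_value


# Hypothetical toy sample; real use would pass observed data.
jb, p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
print(f"JB={jb:.4f}, p={p:.4f}")
```

A small p-value is evidence against normality. Framing the task as binary classification, as the abstract does, replaces this fixed statistic with a learned decision function.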


Privacy-Aware Recommender Systems Challenge on Twitter's Home Timeline

arXiv.org Machine Learning

Recommender systems constitute the core engine of most social network platforms nowadays, aiming to maximize user satisfaction along with other key business objectives. Twitter is no exception. Although Twitter data has been used extensively to understand socioeconomic and political phenomena and user behaviour, the implicit feedback provided by users on Tweets through their engagements on the Home Timeline has only been explored to a limited extent. At the same time, there is a lack of large-scale public social network datasets that would enable the scientific community to both benchmark and build more powerful and comprehensive models that tailor content to user interests. By releasing an original dataset of 160 million Tweets along with engagement information, Twitter aims to address exactly that. During this release, special attention was paid to maintaining compliance with existing privacy laws. Apart from user privacy, this paper touches on the key challenges faced by researchers and professionals striving to predict user engagements. It further describes the key aspects of the RecSys 2020 Challenge that was organized by ACM RecSys in partnership with Twitter using this dataset.


Tractography filtering using autoencoders

arXiv.org Artificial Intelligence

Current brain white matter fiber tracking techniques show a number of problems, including generating large proportions of streamlines that do not accurately describe the underlying anatomy, extracting streamlines that are not supported by the underlying diffusion signal, and under-representing some fiber populations. In this paper, we describe a novel unsupervised learning method to filter streamlines from diffusion MRI tractography, and hence, to obtain more reliable tractograms. We show that a convolutional neural network autoencoder provides a straightforward and elegant way to learn a robust representation of brain streamlines, which can be used to filter undesired samples with a nearest neighbor algorithm. Our method, dubbed FINTA (Filtering in Tractography using Autoencoders), comes with several key advantages: training does not need labeled data, as it uses raw tractograms; it is fast and easily reproducible; and it does not rely on the input diffusion MRI data, and thus does not suffer from domain adaptation issues. We demonstrate the ability of FINTA to discriminate between "plausible" and "implausible" streamlines as well as to recover individual streamline group instances from a raw tractogram, from both synthetic and real human brain diffusion MRI tractography data, including partial tractograms. Results reveal that FINTA has a superior filtering performance compared to state-of-the-art methods. Together, this work brings forward a new deep learning framework in tractography based on autoencoders, and shows how it can be applied for filtering purposes. It sets the foundations for opening up new prospects towards more accurate and robust tractometry and connectivity diffusion MRI analyses, which may ultimately lead to improved imaging of the white matter anatomy.


Abductive Knowledge Induction From Raw Data

arXiv.org Artificial Intelligence

For many reasoning-heavy tasks, it is challenging to find an appropriate end-to-end differentiable approximation to domain-specific inference mechanisms. Neural-Symbolic (NeSy) AI divides the end-to-end pipeline into neural perception and symbolic reasoning, which can directly exploit general domain knowledge such as algorithms and logic rules. However, it suffers from the exponential computational complexity caused by the interface between the two components, where the neural model lacks direct supervision, and the symbolic model lacks accurate input facts. As a result, NeSy systems usually focus on learning the neural model with a sound and complete symbolic knowledge base while avoiding a crucial problem: where does the knowledge come from? In this paper, we present Abductive Meta-Interpretive Learning ($Meta_{Abd}$), which unites abduction and induction to learn a perceptual neural network and first-order logic theories simultaneously from raw data. Given the same amount of domain knowledge, we demonstrate that $Meta_{Abd}$ not only outperforms the compared end-to-end models in predictive accuracy and data efficiency but also induces logic programs that can be re-used as background knowledge in subsequent learning tasks. To the best of our knowledge, $Meta_{Abd}$ is the first system that can jointly learn neural networks and recursive first-order logic theories with predicate invention.


Learning to Explain: Datasets and Models for Identifying Valid Reasoning Chains in Multihop Question-Answering

arXiv.org Artificial Intelligence

Despite the rapid progress in multihop question-answering (QA), models still have trouble explaining why an answer is correct, with limited explanation training data available to learn from. To address this, we introduce three explanation datasets in which explanations formed from corpus facts are annotated. Our first dataset, eQASC, contains over 98K explanation annotations for the multihop question answering dataset QASC, and is the first that annotates multiple candidate explanations for each answer. The second dataset, eQASC-perturbed, is constructed by crowd-sourcing perturbations (while preserving their validity) of a subset of explanations in QASC, to test consistency and generalization of explanation prediction models. The third dataset, eOBQA, is constructed by adding explanation annotations to the OBQA dataset to test generalization of models trained on eQASC. We show that this data can be used to significantly improve explanation quality (+14% absolute F1 over a strong retrieval baseline) using a BERT-based classifier, although performance remains below the upper bound, offering a new challenge for future research. We also explore a delexicalized chain representation in which repeated noun phrases are replaced by variables, thus turning them into generalized reasoning chains (for example: "X is a Y" AND "Y has Z" IMPLIES "X has Z"). We find that generalized chains maintain performance while also being more robust to certain perturbations.
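The delexicalized chain representation described above can be illustrated with a small sketch. It assumes the noun phrases have already been extracted (the paper's extraction procedure is not described in this abstract), and the function name, variable alphabet, and example sentences are hypothetical.

```python
import re


def delexicalize(sentences, noun_phrases):
    """Replace noun phrases that occur in two or more sentences with variables."""
    def pattern(phrase):
        return rf"\b{re.escape(phrase)}\b"

    # Keep only phrases repeated across sentences of the chain.
    repeated = [
        np for np in noun_phrases
        if sum(re.search(pattern(np), s) is not None for s in sentences) >= 2
    ]
    # Name variables in order of first appearance in the chain.
    joined = " | ".join(sentences)
    repeated.sort(key=lambda np: re.search(pattern(np), joined).start())
    names = "XYZUVW"  # assumption: at most six variables per chain
    if len(repeated) > len(names):
        raise ValueError("too many repeated noun phrases for this sketch")
    mapping = {np: names[i] for i, np in enumerate(repeated)}

    generalized = []
    for sentence in sentences:
        # Substitute longer phrases first so shorter ones cannot clip them.
        for np in sorted(mapping, key=len, reverse=True):
            sentence = re.sub(pattern(np), mapping[np], sentence)
        generalized.append(sentence)
    return generalized, mapping


facts = ["sparrows are birds", "birds have feathers", "sparrows have feathers"]
chain, mapping = delexicalize(facts, ["sparrows", "birds", "feathers"])
print(chain)  # ['X are Y', 'Y have Z', 'X have Z']
```

The first two generalized sentences play the role of the premises and the third the conclusion, mirroring the "X is a Y" AND "Y has Z" IMPLIES "X has Z" pattern quoted in the abstract.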


A Retrospective on Mutual Bootstrapping

AI Magazine

When we were invited to write a retrospective article about our AAAI-99 paper on mutual bootstrapping (Riloff and Jones 1999), our first reaction was hesitation because, well, that algorithm seems old and clunky now. But upon reflection, it shaped a great deal of subsequent work on bootstrapped learning for natural language processing, both by ourselves and others. So our second reaction was enthusiasm, for the opportunity to think about the path from 1999 to 2017 and to share the lessons that we learned about bootstrapped learning along the way. This article begins with a brief history of related research that preceded and inspired the mutual bootstrapping work, to position it with respect to that period of time. We then describe the general ideas and approach behind the mutual bootstrapping algorithm.


Squeezing the Most Utility from Your Models

#artificialintelligence

In a previous article we discussed why it's a good idea to prefer probability models to "hard" classification models, and why you should delay setting "hard" classification rules as long as possible. But decisions have to be made, and eventually you will have to set that threshold. A good threshold balances classifier precision/recall or sensitivity/specificity in a way that best meets the project or business needs. One way to quantify and think about this balance is the notion of model utility, which maps the performance of a model to some notion of the value achieved by that performance. In this article, we demonstrate the use of sigr::model_utility() to estimate model utility and pick model thresholds for classification problems.
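The idea of picking a threshold by utility can be sketched in a few lines. The article itself uses the R function sigr::model_utility(); the Python below does not reproduce that function's interface and is only an illustration of the underlying idea. The per-outcome values, scores, and labels are hypothetical.

```python
def total_utility(scores, labels, threshold,
                  tp_value, fp_value, tn_value, fn_value):
    """Sum the value of every decision made at a given score threshold."""
    total = 0.0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label:
            total += tp_value
        elif predicted_positive and not label:
            total += fp_value
        elif not predicted_positive and label:
            total += fn_value
        else:
            total += tn_value
    return total


def best_threshold(scores, labels, **values):
    """Return the candidate threshold with the highest total utility."""
    # Candidates: every observed score, plus "predict nothing positive".
    candidates = sorted(set(scores)) + [float("inf")]
    return max(candidates,
               key=lambda t: total_utility(scores, labels, t, **values))


# Hypothetical example: a true positive is worth 10, a false positive costs 2.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
values = dict(tp_value=10, fp_value=-2, tn_value=0, fn_value=0)
threshold = best_threshold(scores, labels, **values)
utility = total_utility(scores, labels, threshold, **values)
print(threshold, utility)  # 0.4 28.0
```

Because the values assigned to each outcome encode the business trade-off, changing them (for example, making false positives much more costly) moves the chosen threshold without retraining the probability model.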


Characterising Bias in Compressed Models

arXiv.org Artificial Intelligence

Pruning and quantization are widely applied techniques for compressing deep neural networks, often driven by the resource constraints of deploying models to mobile phones or embedded devices (Esteva et al., 2017; Lane & Warden, 2018). To date, discussion around the relative merits of different compression methods has centered on the tradeoff between level of compression and top-line metrics such as top-1 and top-5 accuracy (Blalock et al., 2020). Along this dimension, compression techniques are remarkably successful. It is possible to prune the majority of weights (Gale et al., 2019; Evci et al., 2019) or heavily quantize the bit representation (Jacob et al., 2017) with negligible decreases to test-set accuracy. However, recent work by Hooker et al. (2019a) has found that the minimal changes to top-line metrics obscure critical differences in generalization between pruned and non-pruned networks. The authors establish that pruning disproportionately impacts predictive performance on a small subset of the dataset. We build upon this work and focus on the implications of these findings for a dataset with sensitive protected attributes such as gender and age. Our work addresses the question: Does compression amplify existing algorithmic bias?


Kernel regression in high dimension: Refined analysis beyond double descent

arXiv.org Machine Learning

In this paper, we provide a precise characterization of the generalization properties of high dimensional kernel ridge regression across the under- and over-parameterized regimes, depending on whether the number of training data $n$ exceeds the feature dimension $d$. By establishing a novel bias-variance decomposition of the expected excess risk, we show that, while the bias is independent of $d$ and monotonically decreases with $n$, the variance depends on $n,d$ and can be unimodal or monotonically decreasing under different regularization schemes. Our refined analysis goes beyond the double descent theory by showing that, depending on the data eigen-profile and the level of regularization, the kernel regression risk curve can be a double-descent-like, bell-shaped, or monotonic function of $n$. Experiments on synthetic and real data are conducted to support our theoretical findings.
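For orientation, the objects named in this abstract can be written in their standard textbook form. The notation below is an assumption for illustration and may differ from the paper's: $k$ is the kernel, $K$ the $n \times n$ kernel matrix on the training inputs $X$, $y$ the training labels, $\lambda$ the regularization level (the $n\lambda$ scaling is one common convention), $f^\star$ the target function, and $\varepsilon$ the label noise.

```latex
% Kernel ridge regression estimator (standard form; scaling convention assumed)
\hat{f}_\lambda(x) = k(x, X)\,\bigl(K + n\lambda I\bigr)^{-1} y

% Generic bias-variance decomposition of the expected excess risk over label noise
\mathbb{E}_{\varepsilon}\,\bigl\| \hat{f}_\lambda - f^\star \bigr\|^2
  = \underbrace{\bigl\| \mathbb{E}_{\varepsilon}\hat{f}_\lambda - f^\star \bigr\|^2}_{\text{bias}}
  + \underbrace{\mathbb{E}_{\varepsilon}\,\bigl\| \hat{f}_\lambda - \mathbb{E}_{\varepsilon}\hat{f}_\lambda \bigr\|^2}_{\text{variance}}
```

The abstract's claims concern how each of these two terms behaves as $n$ and $d$ vary; the paper's own decomposition is described as novel and may be finer than this generic split.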