AI in MedTech: Risks and Opportunities of Innovative Technologies in Medical Applications

#artificialintelligence

An increasing number of medical devices incorporate artificial intelligence (AI) capabilities to support therapeutic and diagnostic applications. Despite the risks associated with this innovative technology, the applicable regulatory framework does not specify any requirements for this class of medical devices. To make matters even more complicated for manufacturers, there are no standards, guidance documents or common specifications for medical devices on how to demonstrate conformity with the essential requirements.

The term artificial intelligence (AI) describes the capability of algorithms to take over tasks and decisions by mimicking human intelligence.1 Many experts believe that machine learning, a subset of artificial intelligence, will play a significant role in the medtech sector.2,3 "Machine learning" is the term used to describe algorithms capable of learning directly from a large volume of "training data". The algorithm builds a model based on the training data and applies the experience it has gained from training to make predictions and decisions on new, unknown data.

Artificial neural networks are a subset of machine learning methods that evolved from the idea of simulating the human brain.4 Neural networks are information-processing systems used for machine learning and comprise multiple layers of neurons. Between the input layer, which receives information, and the output layer, there are numerous hidden layers of neurons. In simple terms, neural networks comprise neurons – also known as nodes – which receive external information or information from other connected nodes, modify this information, and pass it on, either to the next neuron layer or to the output layer as the final result.5

Deep learning is a variant of artificial neural networks that uses multiple hidden layers between the input and output layers; the inner layers are designed to extract higher-level features from the raw external data.
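The layered structure described above is easy to make concrete. Below is a minimal sketch of the feedforward pass through a toy network with one hidden layer; the layer sizes, random weights and sigmoid activation are illustrative assumptions, not taken from any particular medical device.

```python
# A minimal sketch of the feedforward pass described above, assuming a
# toy network with one hidden layer; weights are random for illustration.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes: 4 input features, 3 hidden neurons, 1 output neuron.
W1 = rng.normal(size=(4, 3))   # input -> hidden weights
b1 = np.zeros(3)
W2 = rng.normal(size=(3, 1))   # hidden -> output weights
b2 = np.zeros(1)

def forward(x):
    """Each neuron receives information, modifies it, and passes it on."""
    h = sigmoid(x @ W1 + b1)    # hidden layer extracts intermediate features
    y = sigmoid(h @ W2 + b2)    # output layer produces the final result
    return y

x = rng.normal(size=4)          # one external input vector
print(forward(x))
```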


Column - The Power of Artificial Intelligence in the Medical Field - MedTech Intelligence

#artificialintelligence

Artificial intelligence, or AI, is transforming the medical device industry today. As medical devices continue to incorporate artificial intelligence to perform or support medical applications, new regulations require AI-driven medical devices to comply with state-of-the-art requirements and provide objective evidence for repeatability and reliability. AI has the potential to improve patient outcomes as well as the productivity and efficiency of healthcare delivery. It can also improve the day-to-day lives of healthcare providers by allowing them to spend more time caring for patients, hence improving staff morale and retention. It may even accelerate the development of life-saving therapies.


Column

#artificialintelligence

The use of artificial intelligence (AI) in life sciences, or "Life Tech", has increased at a rapid pace. According to the World Intellectual Property Organization (WIPO), there has been "a shift from theoretical research to the use of AI technologies in commercial products and services," as reflected in the change in the ratio of scientific papers to patent applications over the past decade.1 Indeed, while research into AI began in earnest in the 1950s, more than 1.6 million scientific papers have since been published on AI, and more than half of all identified AI inventions have appeared in the last six years alone.2,3 A review article in Nature Medicine reported last year that, despite few peer-reviewed publications on the use of machine learning technologies in medical devices, FDA approvals of AI as medical devices have been accelerating.4 Many of these FDA approvals relate to image analysis for diagnostic purposes, such as QuantX, the first AI platform to evaluate breast abnormalities; Aidoc, which detects acute intracranial hemorrhages in head CT scans, assisting radiologists to prioritize patient injuries; and IDx-DR, which analyzes retinal images to detect diabetic retinopathy.


Methods and Models for Interpretable Linear Classification

arXiv.org Machine Learning

We present an integer programming framework to build accurate and interpretable discrete linear classification models. Unlike existing approaches, our framework is designed to provide practitioners with the control and flexibility they need to tailor accurate and interpretable models for a domain of choice. To this end, our framework can produce models that are fully optimized for accuracy, by minimizing the 0-1 classification loss, and that address multiple aspects of interpretability, by incorporating a range of discrete constraints and penalty functions. We use our framework to produce models that are difficult to create with existing methods, such as scoring systems and M-of-N rule tables. In addition, we propose specially designed optimization methods to improve the scalability of our framework through decomposition and data reduction. We show that discrete linear classifiers can attain the training accuracy of any other linear classifier, and provide an Occam's Razor type argument as to why the use of small discrete coefficients can provide better generalization. We demonstrate the performance and flexibility of our framework through numerical experiments and a case study in which we construct a highly tailored clinical tool for sleep apnea diagnosis.
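To make the core idea concrete, here is a minimal sketch of a discrete linear classifier that minimizes the 0-1 loss directly over small integer coefficients. Exhaustive search stands in for the paper's integer programming formulation, the data is synthetic, and the dimensions are kept tiny so the search stays tractable.

```python
# A minimal sketch of a discrete linear classifier in the spirit described
# above: coefficients restricted to small integers and the 0-1 loss
# minimized directly. Exhaustive search stands in for the paper's integer
# programming solver; the data here is synthetic.
import itertools
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = np.sign(X @ np.array([2.0, -1.0, 0.0]) + 0.1 * rng.normal(size=100))

best_loss, best_w = np.inf, None
for w in itertools.product(range(-3, 4), repeat=3):   # small integer weights
    w = np.array(w)
    loss = np.sum(np.sign(X @ w) != y)                # 0-1 classification loss
    if loss < best_loss:
        best_loss, best_w = loss, w

print(best_w, best_loss)   # interpretable "scoring system"-style weights
```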


Learning to Transfer Privileged Information

arXiv.org Machine Learning

We introduce the learning using privileged information (LUPI) framework to the computer vision field. We focus on the prototypical computer vision problem of teaching computers to recognize objects in images. We want the computers to be able to learn faster at the expense of providing extra information during training time. As additional information about the image data, we look at several scenarios that have been studied in computer vision before: attributes, bounding boxes and image tags. The information is privileged as it is available at training time but not at test time. We explore two maximum-margin techniques that are able to make use of this additional source of information, for binary and multiclass object classification. We interpret these methods as learning the easiness and hardness of the objects in the privileged space and then transferring this knowledge to train a better classifier in the original space. We provide a thorough analysis and comparison of information transfer from the privileged to the original data space for both LUPI methods. Our experiments show that incorporating privileged information can improve classification accuracy. Finally, we conduct user studies to understand which samples are easy and which are hard for human learning, and explore how this information relates to easy and hard samples when learning a classifier.
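The transfer step can be sketched roughly as follows: a model trained in the privileged space estimates how easy each training example is, and that easiness reweights training in the original space. This is a deliberate simplification of the paper's maximum-margin (SVM+-style) formulation; the synthetic data and the confidence-based easiness score are assumptions for illustration.

```python
# A hedged sketch of the LUPI idea: privileged features, available only at
# training time, estimate how "easy" each example is, and that easiness
# reweights training in the original feature space. This simplifies the
# paper's maximum-margin formulation into a weighted SVM.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n = 200
X = rng.normal(size=(n, 5))                    # original features
y = (X[:, 0] + X[:, 1] > 0).astype(int)
# Privileged features: a cleaner view of the target, training time only.
X_priv = (X[:, 0] + X[:, 1] + 0.3 * rng.normal(size=n)).reshape(-1, 1)

# Confidence of a model trained in the privileged space ~ sample easiness.
priv_model = LogisticRegression().fit(X_priv, y)
easiness = priv_model.predict_proba(X_priv)[np.arange(n), y]

# Transfer: emphasize easy examples when training in the original space.
clf = SVC(kernel="linear").fit(X, y, sample_weight=easiness)
print(clf.score(X, y))
```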


Domain adaptation of weighted majority votes via perturbed variation-based self-labeling

arXiv.org Machine Learning

In machine learning, the domain adaptation problem arises when the test (target) and the train (source) data are generated from different distributions. A key applied issue is thus the design of algorithms able to generalize on a new distribution for which we have no label information. We focus on learning classification models defined as a weighted majority vote over a set of real-valued functions. In this context, Germain et al. (2013) have shown that a measure of disagreement between these functions is crucial to control. The core of this measure is a theoretical bound, the C-bound (Lacasse et al., 2007), which involves the disagreement and leads to a well-performing majority vote learning algorithm in the usual non-adaptive supervised setting: MinCq. In this work, we propose a framework to extend MinCq to a domain adaptation scenario. This procedure takes advantage of the recent perturbed variation divergence between distributions proposed by Harel and Mannor (2012). Justified by a theoretical bound on the target risk of the vote, we provide MinCq with a target sample labeled by a perturbed variation-based self-labeling focused on the regions where the source and target marginals appear similar. We also study the influence of our self-labeling, from which we deduce an original process for tuning the hyperparameters. Finally, our framework, called PV-MinCq, shows very promising results on a rotation and translation synthetic problem.
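The self-labeling step lends itself to a rough sketch: target points lying in regions well covered by the source sample receive labels transferred from the source. The nearest-neighbor radius, the rotation-based shift and the transfer rule below are illustrative assumptions, not the paper's perturbed variation computation.

```python
# A hedged sketch of the self-labeling step described above: target points
# close to the source marginal (here, within a nearest-neighbor radius eps)
# are given labels transferred from the source. The full PV-MinCq procedure
# then trains a majority vote on this self-labeled sample.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(3)
X_src = rng.normal(size=(150, 2))
y_src = (X_src[:, 0] > 0).astype(int)
theta = 0.3                                   # small rotation = domain shift
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X_tgt = rng.normal(size=(150, 2)) @ R.T

nn = NearestNeighbors(n_neighbors=1).fit(X_src)
dist, idx = nn.kneighbors(X_tgt)
eps = 0.25                                    # "similar marginals" radius
mask = dist[:, 0] < eps                       # keep only well-covered regions
X_lab, y_lab = X_tgt[mask], y_src[idx[mask, 0]]
print(f"self-labeled {mask.sum()} of {len(X_tgt)} target points")
```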


Deep Tempering

arXiv.org Machine Learning

Restricted Boltzmann Machines (RBMs) are one of the fundamental building blocks of deep learning. Approximate maximum likelihood training of RBMs typically necessitates sampling from these models. In many training scenarios, computationally efficient Gibbs sampling procedures are crippled by poor mixing. In this work we propose a novel method of sampling from Boltzmann machines that demonstrates a computationally efficient way to promote mixing. Our approach leverages an under-appreciated property of deep generative models such as the Deep Belief Network (DBN), where Gibbs sampling from deeper levels of the latent variable hierarchy results in dramatically increased ergodicity. Our approach is thus to train an auxiliary latent hierarchical model, based on the DBN. When used in conjunction with parallel-tempering, the method is asymptotically guaranteed to simulate samples from the target RBM. Experimental results confirm the effectiveness of this sampling strategy in the context of RBM training.
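For readers unfamiliar with the sampling step in question, the following is a minimal sketch of block Gibbs sampling in an RBM, the procedure whose poor mixing motivates the paper; the random weights stand in for a trained model.

```python
# A minimal sketch of the block Gibbs sampling step whose poor mixing
# motivates the paper; weights here are random stand-ins for a trained RBM.
import numpy as np

rng = np.random.default_rng(4)
n_vis, n_hid = 6, 4
W = 0.1 * rng.normal(size=(n_vis, n_hid))
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gibbs_step(v):
    """One round of alternating (block) Gibbs sampling in an RBM."""
    h = (rng.random(n_hid) < sigmoid(v @ W + b_h)).astype(float)
    v = (rng.random(n_vis) < sigmoid(h @ W.T + b_v)).astype(float)
    return v

v = (rng.random(n_vis) < 0.5).astype(float)
for _ in range(100):      # a long chain may still mix poorly in practice
    v = gibbs_step(v)
print(v)
```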


Predicting missing links via correlation between nodes

arXiv.org Machine Learning

As a fundamental problem in many different fields, link prediction aims to estimate the likelihood that a link exists between two nodes, based on the observed information. Since this problem is related to many applications, ranging from uncovering missing data to predicting the evolution of networks, link prediction has been intensively investigated recently and many methods have been proposed. The essential challenge of link prediction is to estimate the similarity between nodes. Most of the existing methods are based on the common neighbor index and its variants. In this paper, we propose to calculate the similarity between nodes by the correlation coefficient. This method is found to be very effective when applied to calculate similarity based on high-order paths. We finally fuse the correlation-based method with the resource allocation method, and find that the combined method can substantially outperform the existing methods, especially in sparse networks.
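The proposed similarity is straightforward to sketch: score each unconnected pair of nodes by the Pearson correlation between their adjacency rows. The tiny example graph below is made up for illustration, and the paper's higher-order-path and resource-allocation extensions are omitted.

```python
# A minimal sketch of the similarity described above: the Pearson
# correlation between the adjacency rows of two nodes scores candidate
# links; the paper extends this to higher-order paths (e.g. powers of A).
import numpy as np

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 0, 1],
              [0, 1, 0, 0, 1],
              [0, 0, 1, 1, 0]], dtype=float)

def correlation_score(A, i, j):
    return np.corrcoef(A[i], A[j])[0, 1]

# Score every non-adjacent pair; high scores suggest missing links.
n = len(A)
for i in range(n):
    for j in range(i + 1, n):
        if A[i, j] == 0:
            print(i, j, round(correlation_score(A, i, j), 3))
```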


Distributed Detection: Finite-time Analysis and Impact of Network Topology

arXiv.org Machine Learning

This paper addresses the problem of distributed detection in multi-agent networks. Agents receive private signals about an unknown state of the world. The underlying state is globally identifiable, yet informative signals may be dispersed throughout the network. Using an optimization-based framework, we develop an iterative local strategy for updating individual beliefs. In contrast to the existing literature which focuses on asymptotic learning, we provide a finite-time analysis. Furthermore, we introduce a Kullback-Leibler cost to compare the efficiency of the algorithm to its centralized counterpart. Our bounds on the cost are expressed in terms of network size, spectral gap, centrality of each agent and relative entropy of agents' signal structures. A key observation is that distributing more informative signals to central agents results in a faster learning rate. Furthermore, optimizing the weights, we can speed up learning by improving the spectral gap. We also quantify the effect of link failures on learning speed in symmetric networks. We finally provide numerical simulations which verify our theoretical results.
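One common local update rule of this kind can be sketched as follows: each agent Bayes-updates its log-beliefs on its private signal, then averages them with its neighbors' through a doubly stochastic weight matrix. The ring network, Bernoulli signal structures and weights below are illustrative assumptions rather than the paper's exact algorithm; note that two agents receive uninformative signals, so learning relies on the network.

```python
# A hedged sketch of one common local update rule for distributed
# detection: each agent Bayes-updates on its private signal, then
# averages log-beliefs with its neighbors. The weight matrix's spectral
# gap governs learning speed, as the paper analyzes.
import numpy as np

rng = np.random.default_rng(5)
n_agents, T = 4, 200
# Doubly stochastic weights on a ring network.
Wmat = np.array([[0.5, 0.25, 0.0, 0.25],
                 [0.25, 0.5, 0.25, 0.0],
                 [0.0, 0.25, 0.5, 0.25],
                 [0.25, 0.0, 0.25, 0.5]])
# Two hypotheses; agent i sees Bernoulli signals with parameter p[truth, i].
# Agents 0 and 2 are uninformative; the state is only globally identifiable.
p = np.array([[0.5, 0.6, 0.5, 0.7],    # signal probabilities under H0
              [0.5, 0.4, 0.5, 0.3]])   # signal probabilities under H1
truth = 0
log_beliefs = np.zeros((n_agents, 2))  # uniform priors (log scale)

for _ in range(T):
    s = (rng.random(n_agents) < p[truth]).astype(int)
    loglik = np.log(np.where(s[:, None] == 1, p.T, 1 - p.T))
    log_beliefs = Wmat @ (log_beliefs + loglik)   # consensus + innovation

beliefs = np.exp(log_beliefs - log_beliefs.max(axis=1, keepdims=True))
print(beliefs / beliefs.sum(axis=1, keepdims=True))  # concentrates on H0
```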


Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback

arXiv.org Machine Learning

We present and study a partial-information model of online learning, where a decision maker repeatedly chooses from a finite set of actions, and observes some subset of the associated losses. This naturally models several situations where the losses of different actions are related, and knowing the loss of one action provides information on the loss of other actions. Moreover, it generalizes and interpolates between the well studied full-information setting (where all losses are revealed) and the bandit setting (where only the loss of the action chosen by the player is revealed). We provide several algorithms addressing different variants of our setting, and provide tight regret bounds depending on combinatorial properties of the information feedback structure.
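An exponential-weights algorithm adapted to this feedback model can be sketched as follows: playing an action reveals the losses of all actions adjacent to it in a feedback graph, and each observed loss is importance-weighted by the probability that it is observed. The graph, loss distributions and step size below are made-up assumptions for illustration.

```python
# A hedged sketch of an exponential-weights algorithm for the
# graph-structured feedback setting described above: each observed loss
# is importance-weighted by its probability of being observed.
import numpy as np

rng = np.random.default_rng(6)
K, T, eta = 5, 5000, 0.05
# Feedback graph: G[i, j] = 1 means playing i reveals the loss of j.
G = np.eye(K, dtype=int)                  # self-loops: own loss observed
G[0, 1] = G[1, 0] = G[2, 3] = G[3, 2] = 1
mean_loss = np.array([0.5, 0.4, 0.6, 0.3, 0.5])

log_w = np.zeros(K)
total = 0.0
for _ in range(T):
    prob = np.exp(log_w - log_w.max())
    prob /= prob.sum()
    i = rng.choice(K, p=prob)
    losses = (rng.random(K) < mean_loss).astype(float)  # losses in [0, 1]
    total += losses[i]
    q = G.T @ prob                    # q[j] = prob. loss of j is observed
    observed = G[i] == 1
    est = np.where(observed, losses / q, 0.0)
    log_w -= eta * est                # exponential-weights update
print("average loss:", total / T, "(best arm mean:", mean_loss.min(), ")")
```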