 Performance Analysis


Automatic Configuration of Deep Neural Networks with EGO

arXiv.org Machine Learning

Designing the architecture for an artificial neural network is a cumbersome task because of the numerous parameters to configure, including activation functions, layer types, and hyper-parameters. With the large number of parameters in most networks nowadays, it is intractable to find a good configuration for a given task by hand. In this paper an Efficient Global Optimization (EGO) algorithm is adapted to automatically optimize and configure convolutional neural network architectures. A configurable neural network architecture based solely on convolutional layers is proposed for the optimization. Without using any knowledge of the target problem or any data augmentation techniques, it is shown that on several image classification tasks this approach is able to find network architectures that are competitive, in terms of prediction accuracy, with the best hand-crafted ones in the literature. In addition, only a very small training budget (200 evaluations and 10 epochs in training) is spent on each optimized architecture, in contrast to the usual long training time of hand-crafted networks. Moreover, instead of the standard sequential evaluation in EGO, several candidate architectures are proposed and evaluated in parallel, which significantly reduces execution overhead and leads to efficient automation of deep neural network design.
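
In its classical form, EGO chooses the next configuration by maximizing expected improvement (EI) under a surrogate model's prediction. The abstract does not say which criterion the adapted, parallel version uses, so the following is only a minimal sketch of plain EI for minimization. The candidate names and the predicted means and standard deviations are hypothetical stand-ins for a surrogate model's output.

```python
import math


def expected_improvement(mu, sigma, best_so_far):
    """Expected improvement (minimization) at a point whose surrogate
    prediction is Gaussian with mean `mu` and standard deviation `sigma`."""
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(best_so_far - mu, 0.0)
    z = (best_so_far - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best_so_far - mu) * cdf + sigma * pdf


# Hypothetical surrogate predictions of validation error: (mean, std).
candidates = {
    "A": (0.10, 0.01),  # slightly better than the incumbent, low uncertainty
    "B": (0.12, 0.05),  # slightly worse on average, but much more uncertain
}
best_error = 0.11  # best validation error observed so far (assumed)

scores = {name: expected_improvement(mu, sigma, best_error)
          for name, (mu, sigma) in candidates.items()}
next_candidate = max(scores, key=scores.get)
```

In this toy setting the more uncertain candidate B has the larger EI, which shows how the criterion trades off exploiting good predictions against exploring uncertain ones.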


Probabilistic Blocking with An Application to the Syrian Conflict

arXiv.org Machine Learning

Entity resolution seeks to merge databases so as to remove duplicate entries where unique identifiers are typically unknown. We review modern blocking approaches for entity resolution, focusing on those based upon locality sensitive hashing (LSH). First, we introduce $k$-means locality sensitive hashing (KLSH), which is based upon the information retrieval literature and clusters similar records into blocks using a vector-space representation and projections. Second, we introduce a subquadratic variant of LSH to the literature, known as Densified One Permutation Hashing (DOPH). Third, we propose a weighted variant of DOPH. We illustrate each method on an application to a subset of the ongoing Syrian conflict and discuss each method.
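
As a rough illustration of LSH-based blocking in general, the sketch below uses plain MinHash signatures with banding. It is not KLSH, DOPH, or the weighted variant proposed in the paper. The record strings, shingle size, and number of hashes and bands are all assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=2):
    """Set of character k-grams; falls back to the whole string if too short."""
    text = text.lower()
    return {text[i:i + k] for i in range(len(text) - k + 1)} or {text}


def minhash_signature(items, num_hashes=8):
    """One minimum per seeded hash function (MD5 is used only as a mixer)."""
    return tuple(
        min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in items)
        for seed in range(num_hashes)
    )


def lsh_blocks(records, num_hashes=8, bands=4):
    """Group record ids that agree on every row of at least one band."""
    rows = num_hashes // bands
    blocks = defaultdict(set)
    for rec_id, text in records.items():
        sig = minhash_signature(shingles(text), num_hashes)
        for b in range(bands):
            blocks[(b, sig[b * rows:(b + 1) * rows])].add(rec_id)
    return blocks


def candidate_pairs(blocks):
    """Only records sharing a block are compared, not all pairs."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical toy records: "a" and "b" are exact duplicates.
records = {"a": "john smith", "b": "john smith", "c": "zzzz qqqq"}
pairs = candidate_pairs(lsh_blocks(records))
```

Blocking of this kind reduces the number of record comparisons from all pairs to only those pairs that collide in some band.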


Applications of PageRank to Function Comparison and Malware Classification

arXiv.org Artificial Intelligence

We classify .NET files as either benign or malicious by examining certain directed graphs extracted from the files via decompilation. Each graph is viewed probabilistically as a Markov chain where each node heuristically represents the possible state of the running file, and by computing the PageRank vector (Perron vector with transport) we can assign a probability measure over the nodes of the given graph. We train a random forest with features derived from computing Lebesgue antiderivatives of functions defined over the vertex sets of the graphs listed above against the PageRank measure. The model was trained on 2.5 million samples of .NET and has an accuracy of 98.3% on test data. The median time needed for decompilation and scoring was 24 ms.
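
The PageRank vector mentioned above can be computed by power iteration on the graph's random walk with teleportation. The following is a minimal sketch on a hypothetical three-node graph. The damping factor of 0.85 is a conventional default and an assumption here, not a value taken from the paper, and the feature extraction described in the abstract is not reproduced.

```python
def pagerank(adj, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank.

    `adj` maps each node to a list of its successors; every successor
    must also appear as a key. Nodes without successors spread their
    rank uniformly over all nodes.
    """
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        dangling = sum(rank[v] for v in nodes if not adj[v])
        # Teleportation term plus the share redistributed from dangling nodes.
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in adj[v]:
                new[w] += damping * rank[v] / len(adj[v])
        if sum(abs(new[v] - rank[v]) for v in nodes) < tol:
            return new
        rank = new
    return rank


# Hypothetical call-graph-like example: a directed 3-cycle.
cycle_ranks = pagerank({"a": ["b"], "b": ["c"], "c": ["a"]})
```

The result is a probability measure over the nodes (the values sum to one), which is the property the abstract relies on when integrating functions against it.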


All about Naive Bayes – Towards Data Science

#artificialintelligence

Naive Bayes is one of the simplest algorithms that you can apply to your data. As the name suggests, the algorithm makes the "naive" assumption that all the variables in the dataset are independent of one another. Naive Bayes is a very popular classification algorithm that is mostly used to get a baseline accuracy on a dataset. Let's assume that you are walking on the playground. Now you see some red object in front of you.
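
To make the independence assumption concrete, here is a minimal sketch of a categorical Naive Bayes classifier with add-one (Laplace) smoothing. The toy features (colour, shape) and labels are hypothetical and are not taken from the article.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count classes and per-class feature values."""
    class_counts = Counter(labels)
    # value_counts[label][feature_index][value] -> count
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[label][j][value] += 1
            vocab[j].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Pick the class maximizing log P(class) + sum_j log P(x_j | class)."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for j, value in enumerate(row):
            # The "naive" step: each feature contributes independently.
            numerator = value_counts[label][j][value] + 1
            denominator = count + len(vocab[j])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: (colour, shape) -> fruit.
rows = [("red", "round"), ("red", "round"),
        ("yellow", "long"), ("yellow", "long")]
labels = ["apple", "apple", "banana", "banana"]
model = train_naive_bayes(rows, labels)
guess = predict(model, ("red", "round"))
```

Because the per-feature likelihoods are simply multiplied (added in log space), training reduces to counting, which is why the method is a common baseline.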


Deep learning cardiac motion analysis for human survival prediction

arXiv.org Machine Learning

Motion analysis is used in computer vision to understand the behaviour of moving objects in sequences of images. Optimising the interpretation of dynamic biological systems requires accurate and precise motion tracking as well as efficient representations of high-dimensional motion trajectories so that these can be used for prediction tasks. Here we use image sequences of the heart, acquired using cardiac magnetic resonance imaging, to create time-resolved three-dimensional segmentations using a fully convolutional network trained on anatomical shape priors. This dense motion model formed the input to a supervised denoising autoencoder (4Dsurvival), which is a hybrid network consisting of an autoencoder that learns a task-specific latent code representation trained on observed outcome data, yielding a latent representation optimised for survival prediction. To handle right-censored survival outcomes, our network used a Cox partial likelihood loss function. In a study of 302 patients the predictive accuracy (quantified by Harrell's C-index) was significantly higher (p < .0001) for our model, C=0.73 (95% CI: 0.68 - 0.78), than the human benchmark of C=0.59 (95% CI: 0.53 - 0.65). This work demonstrates how a complex computer vision task using high-dimensional medical image data can efficiently predict human survival.
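
The two survival-analysis quantities named above can be sketched in a few lines. This is a generic illustration, assuming no tied event times and an unnormalized loss. It is not the 4Dsurvival implementation, and the risk scores, times, and event flags in the example are hypothetical.

```python
import math


def cox_partial_nll(risks, times, events):
    """Negative Cox partial log-likelihood.

    risks:  predicted log-risk score per subject
    times:  observed follow-up time per subject
    events: 1 if the event was observed, 0 if right-censored
    """
    loss = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored subjects only appear in risk sets
        at_risk = [risks[j] for j, t_j in enumerate(times) if t_j >= t_i]
        m = max(at_risk)  # log-sum-exp with a shift for numerical stability
        log_sum = m + math.log(sum(math.exp(r - m) for r in at_risk))
        loss -= risks[i] - log_sum
    return loss


def harrell_c_index(risks, times, events):
    """Fraction of comparable pairs ordered correctly by predicted risk."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:  # subject i failed before j's last follow-up
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort of three subjects; the last one is censored.
c = harrell_c_index([3.0, 2.0, 1.0], [1.0, 2.0, 3.0], [1, 1, 0])
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported values of 0.73 and 0.59.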


JOBS: Joint-Sparse Optimization from Bootstrap Samples

arXiv.org Machine Learning

Classical signal recovery based on $\ell_1$ minimization solves the least squares problem with all available measurements via sparsity-promoting regularization. In practice, it is often the case that not all measurements are available or required for recovery. Measurements might be corrupted or missing, or they might arrive sequentially in streaming fashion. In this paper, we propose a global sparse recovery strategy based on subsets of measurements, named JOBS, in which multiple measurement vectors are generated from the original pool of measurements via bootstrapping, and then a joint-sparse constraint is enforced to ensure support consistency among multiple predictors. The final estimate is obtained by averaging over the $K$ predictors. The performance limits associated with different choices of the number of bootstrap samples $L$ and the number of estimates $K$ are analyzed theoretically. Simulation results validate some of the theoretical analysis and show that the proposed method yields state-of-the-art recovery performance, outperforming $\ell_1$ minimization and a few other existing bootstrap-based techniques in the challenging case of low levels of measurements. It is also preferable to other bagging-based methods in the streaming setting, since it performs better with small $K$ and $L$ for data sets of large size.
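
One common way to encourage a shared support across several predictors is a row-wise (group) soft-thresholding step, the proximal operator of the mixed $\ell_{1,2}$ norm. The abstract does not spell out how JOBS enforces its joint-sparse constraint, so the sketch below is an assumption about a typical building block, followed by the averaging step the abstract does describe. The matrix values are hypothetical.

```python
import math


def row_soft_threshold(matrix, t):
    """Shrink each row by its Euclidean norm; rows with norm <= t become zero.

    Rows index signal coordinates, columns index the K predictors, so a
    zeroed row removes that coordinate from every predictor at once.
    """
    out = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        scale = max(1.0 - t / norm, 0.0) if norm > 0.0 else 0.0
        out.append([scale * v for v in row])
    return out


def average_predictors(matrix):
    """Final estimate: average each coordinate over the K predictors."""
    return [sum(row) / len(row) for row in matrix]


# Hypothetical 2 coordinates x 2 predictors (K = 2).
estimates = [[3.0, 4.0],   # strong coordinate: kept in both predictors
             [0.3, 0.4]]   # weak coordinate: removed from both predictors
shrunk = row_soft_threshold(estimates, 1.0)
final = average_predictors(shrunk)
```

Thresholding whole rows rather than individual entries is what keeps the supports of the predictors consistent before they are averaged.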


A Unified Dynamic Approach to Sparse Model Selection

arXiv.org Artificial Intelligence

Sparse model selection is ubiquitous, from linear regression to graphical models, where regularization paths, i.e., families of estimators indexed by a varying regularization parameter, are computed when the regularization parameter is unknown or decided data-adaptively. Traditional computational methods rely on solving a set of optimization problems where the regularization parameters are fixed on a grid, which can be inefficient. In this paper, we introduce a simple iterative regularization path, which follows the dynamics of a sparse Mirror Descent algorithm or a generalization of Linearized Bregman Iterations with nonlinear loss. Its performance is competitive with \texttt{glmnet}, with a further bias reduction. A path consistency theory is presented showing that, under the Restricted Strong Convexity (RSC) and the Irrepresentable Condition (IRR), the path will first evolve in a subspace with no false positives and reach an estimator that is sign-consistent or of minimax optimal $\ell_2$ error rate. Early stopping regularization is required to prevent overfitting. Application examples are given in sparse logistic regression and Ising models for NIPS coauthorship.
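
For intuition, the standard Linearized Bregman Iteration for a squared loss can be written as $z_{k+1} = z_k - \alpha \nabla L(\beta_k)$, $\beta_{k+1} = \kappa \cdot \mathrm{shrink}(z_{k+1}, 1)$. The sketch below implements that textbook form only. The paper's generalization to nonlinear losses is not reproduced, and the step size, $\kappa$, and toy data are assumptions.

```python
def shrink(v, threshold=1.0):
    """Soft-thresholding: values within [-threshold, threshold] map to 0."""
    if v > threshold:
        return v - threshold
    if v < -threshold:
        return v + threshold
    return 0.0


def linearized_bregman_path(X, y, alpha=0.1, kappa=5.0, steps=50):
    """Return the list of iterates beta_0, beta_1, ..., beta_steps for
    the loss L(beta) = ||y - X beta||^2 / (2 n)."""
    n, p = len(X), len(X[0])
    z = [0.0] * p
    beta = [0.0] * p
    path = [beta[:]]
    for _ in range(steps):
        residual = [y[i] - sum(X[i][j] * beta[j] for j in range(p))
                    for i in range(n)]
        grad = [-sum(X[i][j] * residual[i] for i in range(n)) / n
                for j in range(p)]
        z = [z[j] - alpha * grad[j] for j in range(p)]
        beta = [kappa * shrink(z[j]) for j in range(p)]
        path.append(beta[:])
    return path


# Hypothetical toy problem: the first feature carries a much stronger signal.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
path = linearized_bregman_path(X, y)
```

The iteration index plays the role of the regularization parameter: the path starts at zero, strong variables enter first, and stopping early yields a sparse estimate.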


How Alexa Is Learning to Converse More Naturally : Alexa Blogs

#artificialintelligence

To handle more-natural spoken interactions, Alexa must track references through several rounds of conversation. If, for instance, a customer says, "How far is it to Redmond?" and after the answer follows up by saying, "Find good Indian restaurants there", Alexa should be able to infer that "there" refers to Redmond. We call the task of reference tracking "context carryover," and it's a capability that is currently being phased in to the Alexa experience. At this year's Interspeech, the largest conference on spoken-language understanding, my colleagues and I will present a paper titled "Contextual Slot Carryover for Disparate Schemas," which describes our solution to the problem of slot carryover, a crucial aspect of context carryover. "Domain" describes the type of application -- or "skill" -- that the utterance should invoke; for instance, mapping skills should answer questions about geographic distance.


Stanford AI detects even the smallest earthquakes from seismic data

#artificialintelligence

Microearthquakes -- low-intensity earthquakes that register a magnitude of 2.0 or less on the moment magnitude scale -- rarely cause property damage. And as a result of background noise, small events, and false positives, they're not always picked up by seismic monitoring systems. A possible solution is described in a new paper from the Department of Geophysics at Stanford University, where scientists have developed an AI system -- dubbed Cnn-Rnn Earthquake Detector, or CRED -- that can isolate and identify a range of seismic signals from historical and continuous data. It builds on the work of Harvard and Google, which in August created an AI model capable of predicting the location of aftershocks up to one year after a major earthquake. The researchers' system consists of neural network layers -- interconnected processing nodes that loosely mimic the function of neurons in the brain -- of two types: convolutional neural networks and recurrent neural networks.


Fighting breast cancer with AI early detection Hack and Craft

#artificialintelligence

Breast cancer awareness month is here and, with it, the latest statistics send a stark reminder of just how important early detection is in combating this brutal disease. One of the leading causes of death for cancer patients is a late diagnosis, too often brought about by inferior testing facilities, by human factors such as fatigue and loss of concentration, or by the patients themselves, who put off seeing a specialist for fear of what they might discover. But now, thanks to nothing short of revolutionary strides forward in Artificial Intelligence (AI), all that looks set to change for the better. AI is capable of advanced learning using large complex datasets and has the potential to perform tasks such as image interpretation.