Deep Learning
The Kernel Mixture Network: A Nonparametric Method for Conditional Density Estimation of Continuous Random Variables
Ambrogioni, Luca, Güçlü, Umut, van Gerven, Marcel A. J., Maris, Eric
This paper introduces the kernel mixture network, a new method for nonparametric estimation of conditional probability densities using neural networks. We model arbitrarily complex conditional densities as linear combinations of a family of kernel functions centered at a subset of training points. The weights are determined by the outer layer of a deep neural network, trained by minimizing the negative log likelihood. This generalizes the popular quantized softmax approach, which can be seen as a kernel mixture network with square and non-overlapping kernels. We test the performance of our method on two important applications, namely Bayesian filtering and generative modeling. In the Bayesian filtering example, we show that the method can be used to filter complex nonlinear and non-Gaussian signals defined on manifolds. The resulting kernel mixture network filter outperforms both the quantized softmax filter and the extended Kalman filter in terms of model likelihood. Finally, our experiments on generative models show that, given the same architecture, the kernel mixture network leads to higher test set likelihood, less overfitting and more diversified and realistic generated samples than the quantized softmax approach.
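The density model described in this abstract can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation. It assumes one-dimensional targets, Gaussian kernels with a single shared bandwidth, and a softmax over the network's output layer to produce the mixture weights. The names `kmn_density`, `logits`, `centers` and `bandwidth` are hypothetical, and `logits` stands in for the output of a trained network given some input x.

```python
import numpy as np

def gaussian_kernel(y, centers, bandwidth):
    """Gaussian density at scalar y for a kernel centered at each entry of `centers`."""
    z = (y - centers) / bandwidth
    return np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))

def softmax(logits):
    """Numerically stable softmax: non-negative weights that sum to one."""
    shifted = logits - np.max(logits)
    e = np.exp(shifted)
    return e / e.sum()

def kmn_density(y, logits, centers, bandwidth):
    """Conditional density p(y | x) as a weighted sum of kernels.

    `logits` is a stand-in for the network output for input x (assumption);
    `centers` are target values taken from a subset of training points.
    """
    weights = softmax(logits)
    return float(np.dot(weights, gaussian_kernel(y, centers, bandwidth)))

def negative_log_likelihood(y, logits, centers, bandwidth):
    """Per-example training loss: the negative log of the modeled density."""
    return -np.log(kmn_density(y, logits, centers, bandwidth))

# Hypothetical values, for illustration only.
centers = np.array([-1.0, 0.0, 2.0])
logits = np.array([0.1, 1.0, -0.5])
print(kmn_density(0.3, logits, centers, bandwidth=0.5))
```

Because the weights sum to one and each kernel integrates to one, the mixture is a valid density for any network output. Replacing the Gaussian kernels with non-overlapping box kernels would give the quantized softmax case mentioned in the abstract.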
Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
Daniely, Amit, Frostig, Roy, Singer, Yoram
We develop a general duality between neural networks and compositional kernels, striving towards a better understanding of deep learning. We show that initial representations generated by common random initializations are sufficiently rich to express all functions in the dual kernel space. Hence, though the training objective is hard to optimize in the worst case, the initial weights form a good starting point for optimization. Our dual view also reveals a pragmatic and aesthetic perspective of neural networks and underscores their expressive power.
Learning Feature Nonlinearities with Non-Convex Regularized Binned Regression
Oymak, Samet, Mahdavi, Mehrdad, Chen, Jiasi
Recently, substantial progress has been made on the problem of high-dimensional sparse linear models [22]. In particular, Lasso has been shown to be remarkably successful: it is statistically well-behaved and generates interpretable solutions. However, in the presence of non-linearity (i.e., when the relation between the covariates and the response is nonlinear), boosted decision trees, deep learning models, and kernel methods are regarded as the most effective models, delivering a substantial performance boost over linear models, but their interpretability is limited. As a result, there is a significant gap between statistical performance and interpretability, and it is often desirable to have computationally efficient algorithms that learn interpretable models without sacrificing statistical guarantees. This raises a natural question that we aim to tackle: is there an algorithm with statistical performance similar to that of complex models that still retains much of the interpretability of Lasso? In this paper, we answer this question affirmatively and propose a novel way of learning feature non-linearities with provable statistical and computational guarantees.
CardiacNET: Segmentation of Left Atrium and Proximal Pulmonary Veins from MRI Using Multi-View CNN
Mortazi, Aliasghar, Karim, Rashed, Rhode, Kawal, Burt, Jeremy, Bagci, Ulas
Anatomical and biophysical modeling of the left atrium (LA) and proximal pulmonary veins (PPVs) is important for the clinical management of several cardiac diseases. Magnetic resonance imaging (MRI) allows qualitative assessment of the LA and PPVs through visualization. However, there is a strong need for an advanced image segmentation method that can be applied to cardiac MRI for quantitative analysis of the LA and PPVs. In this study, we address this unmet clinical need by exploring a new deep learning-based segmentation strategy for quantification of the LA and PPVs with high accuracy and heightened efficiency. Our approach is based on a multi-view convolutional neural network (CNN) with an adaptive fusion strategy and a new loss function that allows faster and more accurate convergence of the backpropagation-based optimization. After training our network from scratch using more than 60K 2D MRI images (slices), we evaluated our segmentation strategy on the STACOM 2013 cardiac segmentation challenge benchmark. Qualitative and quantitative evaluations, obtained from the segmentation challenge, indicate that the proposed method achieved state-of-the-art sensitivity (90%), specificity (99%), precision (94%), and efficiency (10 seconds on GPU and 7.5 minutes on CPU).
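For readers unfamiliar with the three accuracy figures quoted above, the sketch below shows how sensitivity, specificity and precision are conventionally computed from a predicted binary mask and a ground-truth mask. It uses the standard definitions of these metrics and is not the challenge's evaluation code. The function name `segmentation_metrics` and the toy masks are hypothetical.

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Voxel-wise sensitivity, specificity and precision for binary masks.

    Assumes both masks contain at least one foreground and one background
    voxel, so that no denominator is zero.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)      # foreground predicted as foreground
    tn = np.sum(~pred & ~truth)    # background predicted as background
    fp = np.sum(pred & ~truth)     # background predicted as foreground
    fn = np.sum(~pred & truth)     # foreground predicted as background
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

# Toy masks, for illustration only.
pred = [1, 1, 0, 0, 1]
truth = [1, 0, 0, 0, 1]
print(segmentation_metrics(pred, truth))
```

In segmentation tasks the background usually dominates the image, which is one reason specificity tends to be much closer to 100% than sensitivity or precision.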
SGD Learns the Conjugate Kernel Class of the Network
We show that the standard stochastic gradient descent (SGD) algorithm is guaranteed to learn, in polynomial time, a function that is competitive with the best function in the conjugate kernel space of the network, as defined in Daniely, Frostig and Singer. The result holds for log-depth networks from a rich family of architectures. To the best of our knowledge, it is the first polynomial-time guarantee for the standard neural network learning algorithm for networks of depth more than two. As corollaries, it follows that for neural networks of any depth between $2$ and $\log(n)$, SGD is guaranteed to learn, in polynomial time, constant degree polynomials with polynomially bounded coefficients. Likewise, it follows that SGD on large enough networks can learn any continuous function (not in polynomial time), complementing classical expressivity results.
Classification of Alzheimer's Disease Structural MRI Data by Deep Learning Convolutional Neural Networks
Sarraf, Saman, Tofighi, Ghassem
Recently, machine learning techniques, especially predictive modeling and pattern recognition, have become important methods in the biomedical sciences, from drug delivery systems to medical imaging, helping researchers gain a deeper understanding of the underlying problems and solve complex medical problems. Deep learning is a powerful machine learning approach to classification that extracts low- to high-level features. In this paper, we used a convolutional neural network to distinguish Alzheimer's brains from normal, healthy brains. Classifying this kind of medical data matters because it could lead to a predictive model or system that recognizes the type of disease relative to normal subjects or estimates the stage of the disease. Classification of clinical data such as Alzheimer's disease has always been challenging, and the most problematic part has always been selecting the most discriminative features. Using a convolutional neural network (CNN) with the well-known LeNet-5 architecture, we successfully classified structural MRI data of Alzheimer's subjects from normal controls, reaching an accuracy of 98.84% on test data. This experiment suggests that the shift- and scale-invariant features extracted by the CNN, followed by deep learning classification, are the most powerful method for distinguishing clinical data from healthy data in fMRI. This approach also enables us to extend our methodology to predict more complicated systems.
There's a big problem with AI: even its creators can't explain how it works
The car's underlying AI technology, known as deep learning, has proved very powerful at solving problems in recent years, and it has been widely deployed for tasks like image captioning, voice recognition, and language translation. The resulting program, which the researchers named Deep Patient, was trained using data from about 700,000 individuals, and when tested on new records, it proved incredibly good at predicting disease. But it was not until the start of this decade, after several clever tweaks and refinements, that very large--or "deep"--neural networks demonstrated dramatic improvements in automated perception. Deep learning has transformed computer vision and dramatically improved machine translation.
Shivon Zilis - Machine Intelligence
Almost a year ago, we published our now-annual landscape of machine intelligence companies, and goodness have we seen a lot of activity since then. This year's landscape has a third more companies than our first one did two years ago, and it feels even more futile to try to be comprehensive, since this just scratches the surface of all of the activity out there. As has been the case for the last couple of years, our fund still obsesses over "problem first" machine intelligence--we've invested in 35 machine intelligence companies solving 35 meaningful problems in areas from security to recruiting to software development. At the same time, the hype around machine intelligence methods continues to grow: the words "deep learning" now equally represent a series of meaningful breakthroughs (wonderful) but also a hyped phrase like "big data" (not so good!). We care about whether a founder uses the right method to solve a problem, not the fanciest one.
Deep Learning Key Terms, Explained
Enjoying a surge in research and industry, due mainly to its incredible successes in a number of different areas, deep learning is the process of applying deep neural network technologies (that is, neural network architectures with multiple hidden layers) to solve problems. Deep learning is a process, like data mining, that employs deep neural network architectures, which are particular types of machine learning algorithms.
Picasso: A free open-source visualizer for CNNs – merantix – Medium
While it's easier than ever to define and train deep neural networks (DNNs), understanding the learning process remains somewhat opaque. Monitoring the loss or classification error during training won't always prevent your model from learning the wrong thing or learning a proxy for your intended classification task. Regardless of the veracity of this tale, the point is familiar to machine learning researchers: training metrics don't always tell the whole story. And the stakes are higher than ever before: for rising applications of deep learning like autonomous vehicles, these kinds of training errors can be deadly [2]. Fortunately, standard visualizations like partial occlusion [3] and saliency maps [4] provide a sanity check on the learning process.
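The partial occlusion check mentioned above can be sketched without any deep learning framework. This is a generic illustration of the technique, not Picasso's API. It assumes a 2D grayscale image and a hypothetical `score_fn` that stands in for a trained model's confidence in the class of interest. The names `occlusion_map`, `patch` and `fill` are likewise hypothetical.

```python
import numpy as np

def occlusion_map(image, score_fn, patch=4, fill=0.0):
    """Score drop caused by masking each patch of a 2D image.

    Large values mark regions the score depends on most.
    """
    h, w = image.shape
    base = score_fn(image)  # score on the unmodified image
    heat = np.zeros((h, w))
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            # Record how much the score falls when this patch is hidden.
            heat[top:top + patch, left:left + patch] = base - score_fn(occluded)
    return heat

# Toy "model" (assumption): the score is the total brightness of the
# top-left 4x4 corner, so only that corner should light up in the map.
image = np.ones((8, 8))
heat = occlusion_map(image, lambda img: img[:4, :4].sum(), patch=4)
print(heat[0, 0], heat[7, 7])
```

If the highlighted regions do not coincide with the object the classifier is supposed to recognize, that is a hint the model may have learned a proxy for the intended task.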