Deep Learning
Using Deep Neural Networks to Automate Large Scale Statistical Analysis for Big Data Applications
Zhang, Rongrong, Deng, Wei, Zhu, Michael Yu
Statistical analysis (SA) is a complex process to deduce population properties from analysis of data. It usually takes a well-trained analyst to successfully perform SA, and it becomes extremely challenging to apply SA to big data applications. We propose to use deep neural networks to automate the SA process. In particular, we propose to construct convolutional neural networks (CNNs) to perform automatic model selection and parameter estimation, the two most important SA tasks. We refer to the resulting CNNs as the neural model selector and the neural model estimator, respectively; both can be properly trained using labeled data systematically generated from candidate models. A simulation study shows that both the selector and the estimator demonstrate excellent performance. The idea and proposed framework can be further extended to automate the entire SA process and have the potential to revolutionize how SA is performed in big data analytics.
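The idea of training on "labeled data systematically generated from candidate models" can be illustrated with a minimal sketch. The two candidate families (normal and exponential), the parameter ranges, and the function names below are assumptions chosen for illustration, not the authors' setup. Each simulated sample carries the model index (a target for a selector) and the true parameters (a target for an estimator).

```python
import random

# Hypothetical candidate models. Each sampler draws its own parameters,
# then simulates one sample of size n from that model.
def sample_normal(rng, n):
    mu, sigma = rng.uniform(-2.0, 2.0), rng.uniform(0.5, 2.0)
    return [rng.gauss(mu, sigma) for _ in range(n)], (mu, sigma)

def sample_exponential(rng, n):
    rate = rng.uniform(0.5, 2.0)
    return [rng.expovariate(rate) for _ in range(n)], (rate,)

CANDIDATES = [("normal", sample_normal), ("exponential", sample_exponential)]

def make_training_set(num_examples, sample_size, seed=0):
    """Simulate labeled examples: data, model index, and true parameters."""
    rng = random.Random(seed)
    data = []
    for _ in range(num_examples):
        label = rng.randrange(len(CANDIDATES))   # which model generated the data
        _, sampler = CANDIDATES[label]
        sample, params = sampler(rng, sample_size)
        data.append({"x": sample, "model": label, "params": params})
    return data
```

Because the labels come from the simulator, the training set can be made as large as needed; the network that consumes it is left out of this sketch.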
Gaussian Prototypical Networks for Few-Shot Learning on Omniglot
We propose a novel architecture for $k$-shot classification on the Omniglot dataset. Building on prototypical networks, we extend their architecture to what we call Gaussian prototypical networks. Prototypical networks learn a map between images and embedding vectors, and use their clustering for classification. In our model, a part of the encoder output is interpreted as a confidence region estimate about the embedding point, and expressed as a Gaussian covariance matrix. Our network then constructs a direction- and class-dependent distance metric on the embedding space, using uncertainties of individual data points as weights. We show that Gaussian prototypical networks are a preferred architecture over vanilla prototypical networks with an equivalent number of parameters. We report state-of-the-art performance in 1-shot and 5-shot classification in both the 5-way and 20-way regimes (for 5-shot 5-way, we are comparable to previous state-of-the-art) on the Omniglot dataset. We explore artificially down-sampling a fraction of images in the training set, which improves our performance even further. We therefore hypothesize that Gaussian prototypical networks might perform better in less homogeneous, noisier datasets, which are commonplace in real-world applications.
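One way to read the uncertainty-weighted metric is sketched below, assuming diagonal covariances so that each embedding comes with a per-dimension precision (inverse variance). The precision-weighted prototype, the summed class precision, and all function names are assumptions made for this illustration; the abstract does not spell out the exact formulas.

```python
def gaussian_prototype(embeddings, precisions):
    """Combine the support embeddings of one class into a prototype.

    embeddings: list of embedding vectors for the class.
    precisions: matching list of per-dimension inverse variances
                (assumed diagonal covariance).
    """
    dim = len(embeddings[0])
    # Class precision: confident points contribute more.
    class_precision = [sum(s[d] for s in precisions) for d in range(dim)]
    # Precision-weighted mean of the support embeddings.
    prototype = [
        sum(s[d] * x[d] for x, s in zip(embeddings, precisions)) / class_precision[d]
        for d in range(dim)
    ]
    return prototype, class_precision

def squared_distance(query, prototype, class_precision):
    """Direction- and class-dependent squared distance to a prototype."""
    return sum(s * (q - p) ** 2 for q, p, s in zip(query, prototype, class_precision))

def classify(query, classes):
    """classes maps a class name to (prototype, class_precision)."""
    return min(classes, key=lambda c: squared_distance(query, *classes[c]))
```

With this metric, a class whose prototype is tightly determined along some direction penalizes deviations in that direction more heavily than a plain Euclidean distance would.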
Jet Constituents for Deep Neural Network Based Top Quark Tagging
Pearkes, Jannicke, Fedorko, Wojciech, Lister, Alison, Gay, Colin
Recent literature on deep neural networks for tagging of highly energetic jets resulting from top quark decays has focused on image based techniques or multivariate approaches using high-level jet substructure variables. Here, a sequential approach to this task is taken by using an ordered sequence of jet constituents as training inputs. Unlike the majority of previous approaches, this strategy does not result in a loss of information during pixelisation or the calculation of high level features. The jet classification method achieves a background rejection of 45 at a 50% efficiency operating point for reconstruction level jets with transverse momentum range of 600 to 2500 GeV and is insensitive to multiple proton-proton interactions at the levels expected throughout Run 2 of the LHC.
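Feeding "an ordered sequence of jet constituents" to a network requires a fixed-length input. The sketch below shows one plausible preprocessing step; ordering by transverse momentum, the (pt, eta, phi) feature triple, zero-padding, and the function name are assumptions for illustration rather than details given in the abstract.

```python
def build_sequence(constituents, max_len):
    """Turn a jet's constituents into a flat, fixed-length input vector.

    constituents: list of (pt, eta, phi) tuples (hypothetical feature set).
    max_len: number of constituents kept; shorter jets are zero-padded.
    """
    # Highest transverse momentum first, then truncate.
    ordered = sorted(constituents, key=lambda c: c[0], reverse=True)[:max_len]
    # Pad with all-zero constituents so every jet has the same length.
    padding = [(0.0, 0.0, 0.0)] * (max_len - len(ordered))
    return [value for c in ordered + padding for value in c]
```

Each constituent's kinematic values reach the network directly, with no binning into pixels and no reduction to summary features.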
Whiteout: Gaussian Adaptive Noise Regularization in FeedForward Neural Networks
Noise injection (NI) is an approach to mitigate over-fitting in feedforward neural networks (NNs). The Bernoulli NI procedure as implemented in dropout and shakeout has connections with $l_1$ and $l_2$ regularization on the NN model parameters and demonstrates the efficiency and feasibility of NI in regularizing NNs. We propose whiteout, a new NI regularization technique with adaptive Gaussian noise in NNs. Whiteout is more versatile than dropout and shakeout. We show that the optimization objective function associated with whiteout in generalized linear models has a closed-form penalty term that has connections with a wide range of regularization and includes the bridge, lasso, ridge, and elastic net penalization as special cases; it can also be extended to offer regularization similar to the adaptive lasso and group lasso. We prove that whiteout can also be viewed as robust learning of NNs in the presence of small perturbations in input and hidden nodes. We establish that the noise-perturbed empirical loss function with whiteout converges almost surely to the ideal loss function, and the estimates of NN parameters obtained from minimizing the former loss function are consistent with those obtained from minimizing the ideal loss function. Computationally, whiteout can be easily incorporated in the back-propagation algorithm. The superiority of whiteout over dropout and shakeout in learning NNs with relatively small training data is demonstrated using the LSVT voice rehabilitation data and the LIBRAS hand movement data.
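A rough sketch of what "adaptive" Gaussian noise can mean: additive noise whose variance depends on the magnitude of the weight it feeds into. The variance form sigma2 * |w|^(-gamma) + lam, the parameter names, and the small eps guard are assumptions made here for illustration; the abstract does not state the parameterization.

```python
import random

def whiteout_variance(weight, sigma2=1.0, gamma=1.0, lam=0.1, eps=1e-8):
    """Assumed noise variance: larger for small-magnitude weights.

    eps avoids division by zero when a weight is exactly 0.
    With gamma = 0 the variance no longer depends on the weight.
    """
    return sigma2 * (abs(weight) + eps) ** (-gamma) + lam

def noisy_inputs(x, weights, rng, **kwargs):
    """Perturb each input x[j] with Gaussian noise tuned to weights[j]."""
    return [
        xj + rng.gauss(0.0, whiteout_variance(wj, **kwargs) ** 0.5)
        for xj, wj in zip(x, weights)
    ]

# Usage: perturb the inputs to one node during a training step.
rng = random.Random(0)
perturbed = noisy_inputs([1.0, 2.0, 3.0], [0.5, -0.1, 2.0], rng)
```

In training, fresh noise would be drawn at every forward pass and the gradients computed on the perturbed values, which is why such a scheme fits into back-propagation without structural changes.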
Confession of a so-called AI expert
I have a confession to make. I feel like a fraud. Every few days, I receive an email from either a friend, a friend of a friend, or a random company that asks me for my insights in Artificial Intelligence. These include entrepreneurs who have just sold their startups, Stanford MBA graduates who reject half a million dollar offers, venture capitalists, even major bank executives. A couple of years earlier, I wouldn't even have had the courage to approach those people, let alone dream about them wanting to talk to me.
deeplearning.ai: Announcing new Deep Learning courses on Coursera
I have been working on three new AI projects, and am thrilled to announce the first one: deeplearning.ai. These courses will help you master Deep Learning, apply it effectively, and build a career in AI. Just as electricity transformed every major industry starting about 100 years ago, AI is now poised to do the same. Several large tech companies have built AI divisions, and started transforming themselves with AI. But in the next few years, companies of all sizes and across all industries will realize that they too must be part of this AI-powered future.
Andrew Ng's Next Trick: Training a Million AI Experts
Andrew Ng, one of the world's best-known artificial-intelligence experts, is launching an online effort to create millions more AI experts across a range of industries. Ng, an early pioneer in online learning, hopes his new deep-learning course on Coursera will train people to use the most powerful idea to have emerged in AI in recent years. AI experts have become some of the most sought-after and well-paid employees in today's tech economy. Deep learning involves teaching a machine to perform a complex task using large amounts of data along with a large simulated neural network. The technique has typically required deep technical knowledge and expertise to master (see "10 Breakthrough Technologies 2013: Deep Learning").
The evolution of machine learning
Catherine Dong is a summer associate at Bloomberg Beta and will be working at Facebook as a machine learning engineer. Major tech companies have actively reoriented themselves around AI and machine learning: Google is now "AI-first," Uber has ML running through its veins, and internal AI research labs keep popping up. They're pouring resources and attention into convincing the world that the machine intelligence revolution is arriving now. They tout deep learning, in particular, as the breakthrough driving this transformation and powering new self-driving cars, virtual assistants and more. Despite this hype around the state of the art, the state of the practice is less futuristic.
A Review of Popular Deep Learning Models
In the financial services industry, deep learning models are being used for "predictive analytics," which have helped improve forecasting, recommendations, and risk analysis. HR departments are using AI-based tools to help streamline the process of talent acquisition and management. Facebook has just developed a Bot store to help businesses take advantage of chat bots that can act as customer service representatives for sales or with simple troubleshooting. As deep learning algorithms become increasingly prevalent across industries, deep learning models are also becoming more accessible to people outside of mathematics, engineering and robotics. We hope this showcase can inspire you to see what is possible.
The Amazing Ways Google Uses Deep Learning AI
Deep learning is the area of artificial intelligence where the real magic is happening right now. Traditionally computers, while being very fast, have not been very smart: they have no ability to learn from their mistakes and have to be given precise instructions in order to carry out any task. Deep learning involves building artificial neural networks which attempt to mimic the way organic (living) brains sort and process information. The "deep" in deep learning signifies the use of many layers of neural networks all stacked on top of each other. This data processing configuration is known as a deep neural network, and its complexity means it is able to process data to a more thorough and refined degree than other AI technologies which have come before it.
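The notion of layers "stacked on top of each other" can be made concrete with a tiny forward pass, where the output of one layer becomes the input of the next. The ReLU activation, the layer sizes, and the function names are assumptions picked for this sketch; real networks learn their weights from data rather than having them written by hand.

```python
def relu(values):
    """Keep positive values, replace negatives with zero."""
    return [max(0.0, v) for v in values]

def dense(x, weights, biases):
    """One fully connected layer: a weighted sum of inputs per output unit."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, biases)
    ]

def forward(x, layers):
    """Pass x through the stacked layers; ReLU between layers, none at the end."""
    for i, (weights, biases) in enumerate(layers):
        x = dense(x, weights, biases)
        if i < len(layers) - 1:
            x = relu(x)
    return x

# Usage: a two-layer stack mapping 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.0, 2.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.5]),
]
output = forward([1.0, 2.0], layers)
```

Adding depth means appending more (weights, biases) pairs to the list; the loop stays the same.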