Deep Learning
Artificial Intelligence: The Customer Experience Imperative
Data science includes advanced tools and methods for leveraging the plethora of data sources available to CX professionals. These tools and methods include artificial intelligence, machine learning and deep learning, which companies use to automate insights and drive their business forward. Artificial intelligence is a field in computer science that focuses on developing computer systems to perform tasks that usually require human intelligence, including visual perception, speech recognition, decision-making, and translation between languages. Machine learning uses statistics and math to allow computers to find hidden insights (i.e., make predictions) without being explicitly programmed where to look.
How Machine Learning Could Help to Improve Climate Forecasts
As Earth-observing satellites become more plentiful and climate models more powerful, researchers who study global warming are facing a deluge of data. Some are now turning to the latest trend in artificial intelligence (AI) to help trawl through all the information, in the hope of discovering new climate patterns and improving forecasts. "Climate is now a data problem," says Claire Monteleoni, a computer scientist at George Washington University in Washington DC who has helped to pioneer the marriage of machine-learning techniques with climate science. In machine learning, AI systems improve in performance as the amount of data that they analyse grows. This approach is a natural fit for climate science: a single run of a high-resolution climate model can produce a petabyte of data, and the archive of climate data maintained by the UK Met Office, the national weather service, now holds about 45 petabytes of information--and adds 0.085 petabytes a day.
MIT researchers use machine learning to predict ICU interventions
Researchers at the Massachusetts Institute of Technology's Computer Science and Artificial Intelligence Laboratory have developed a machine learning algorithm that leverages large amounts of intensive care unit (ICU) data to predict actionable interventions for patients and improve health outcomes. By tapping into an MIT database of de-identified data for 40,000 critical care patients--including demographics, laboratory tests, medications and vital signs--the research team is able to use deep learning to determine what kinds of treatments are needed for different symptoms. The approach--called ICU Intervene--was presented in a paper this past weekend at the Machine Learning for Healthcare Conference in Boston. According to the authors, their model is the first to use deep neural networks to predict both onset and weaning of interventions using all available modalities of ICU data. "The decisions that are made in the ICU are made in a particularly high-stress and high-demand environment," says Harini Suresh, a PhD student and lead author on the paper, who adds that clinicians in these situations are bombarded with different types of data for many patients and as a result it can be difficult to make real-time treatment decisions.
Using Artificial Intelligence to Improve Quality Control
When we were in the city of Danyang, China, we witnessed a real-life paradox. Danyang is best known for its explosive growth in optical lens manufacturing over the last decade, sprouting hundreds of factories with cleanrooms chock-full of gleaming, automated machinery. There is a lot that goes into manufacturing a lens, and this machinery performs the bulk of it: from lens curing, lens cleaning, to lens coating. As Stanford AI researchers and engineers wildly interested in Chinese manufacturing, we were impressed by this caliber of automation. But the paradox greeted us as soon as we walked into the rooms responsible for the most critical part of the entire process: quality control.
[P] Neptune - Machine Learning Lab (experiment tracking & history, easy GPU computing in the cloud) • r/MachineLearning
OK, so here it is - the newest version of Neptune, a tool for building and deploying machine learning models. Run things in the cloud with a single command, neptune send, track your models with charts, and compare & reproduce your previous models. We give you $100 for cloud computing in Google Cloud (we charge per second, so it's a lot of computing power). Let us know if you find Neptune helpful for your work (whether business projects, Kaggle competitions, side projects or hands-on learning of deep learning). We would be excited to hear your feedback, so we can keep improving Neptune.
What Can Deep Neural Networks Teach Us About Human Thought?
Intelligence is a property of networks, not objects - Each neuron in a neural network is extremely simple, but if you wire them together in the right way then the network can be extremely intelligent. Similarly, what makes humanity so intelligent is the way that relatively unintelligent individuals can be wired together into super-intelligent societies. Understanding is less important than iteration and measurement - Neural networks generate models that are far too complex for us to understand. However, that's okay provided we have metrics that tell us how effective they are at whatever we want them to do, and provided we have ways to iteratively improve their performance against those metrics. Similarly, humans rarely have a good rational understanding of how they actually solve problems, and they learn best when they have the opportunity to repeatedly practice something and know when they are getting better.
Modular Learning Component Attacks: Today's Reality, Tomorrow's Challenge
Zhang, Xinyang, Ji, Yujie, Wang, Ting
Many of today's machine learning (ML) systems are not built from scratch, but are compositions of an array of modular learning components (MLCs). The increasing use of MLCs significantly simplifies the ML system development cycles. However, as most MLCs are contributed and maintained by third parties, their lack of standardization and regulation entails profound security implications. In this paper, for the first time, we demonstrate that potentially harmful MLCs pose immense threats to the security of ML systems. We present a broad class of logic-bomb attacks in which maliciously crafted MLCs trigger host systems to malfunction in a predictable manner. By empirically studying two state-of-the-art ML systems in the healthcare domain, we explore the feasibility of such attacks. For example, we show that, without prior knowledge about the host ML system, by modifying only 3.3‰ of the MLC's parameters, each with distortion below 10^-3, the adversary is able to force the misdiagnosis of target victims' skin cancers with 100% success rate. We provide analytical justification for the success of such attacks, which points to the fundamental characteristics of today's ML models: high dimensionality, non-linearity, and non-convexity. The issue thus seems fundamental to many ML systems. We further discuss potential countermeasures to mitigate MLC-based attacks and their potential technical challenges.
SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
Yu, Lantao, Zhang, Weinan, Wang, Jun, Yu, Yong
As a new way of training generative models, Generative Adversarial Nets (GAN), which use a discriminative model to guide the training of the generative model, have enjoyed considerable success in generating real-valued data. However, GANs have limitations when the goal is to generate sequences of discrete tokens. A major reason is that the discrete outputs from the generative model make it difficult to pass the gradient update from the discriminative model to the generative model. Also, the discriminative model can only assess a complete sequence, while for a partially generated sequence, it is non-trivial to balance its current score and the future one once the entire sequence has been generated. In this paper, we propose a sequence generation framework, called SeqGAN, to solve these problems. Modeling the data generator as a stochastic policy in reinforcement learning (RL), SeqGAN bypasses the generator differentiation problem by directly performing a policy gradient update. The RL reward signal comes from the GAN discriminator's judgment of a complete sequence, and is passed back to the intermediate state-action steps using Monte Carlo search. Extensive experiments on synthetic data and real-world tasks demonstrate significant improvements over strong baselines.
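The abstract above says the discriminator can only score complete sequences, so the reward for a partial sequence is obtained by Monte Carlo search. A minimal sketch of that one idea, assuming a toy setup: the names rollout_reward, sample_token and discriminator are hypothetical and are not taken from the SeqGAN paper or its code. A partial sequence is completed several times by sampling from the generator policy, and the discriminator scores of the completions are averaged.

```python
import random


def rollout_reward(prefix, seq_len, sample_token, discriminator,
                   n_rollouts=16, rng=None):
    """Estimate the reward of a partial sequence by Monte Carlo rollouts.

    prefix        -- tokens generated so far
    seq_len       -- length of a complete sequence
    sample_token  -- hypothetical generator policy: (tokens, rng) -> next token
    discriminator -- hypothetical scorer for a COMPLETE sequence, in [0, 1]
    """
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_rollouts):
        seq = list(prefix)
        # Complete the sequence by sampling from the generator policy.
        while len(seq) < seq_len:
            seq.append(sample_token(seq, rng))
        # Only complete sequences are ever shown to the discriminator.
        total += discriminator(seq)
    return total / n_rollouts


# Toy stand-ins (assumptions for illustration only).
def uniform_policy(tokens, rng):
    """Pick the next token uniformly from the vocabulary {0, 1}."""
    return rng.choice([0, 1])


def fraction_of_ones(seq):
    """Pretend discriminator: score is the share of 1-tokens."""
    return sum(seq) / len(seq)


if __name__ == "__main__":
    good = rollout_reward([1, 1], 4, uniform_policy, fraction_of_ones)
    bad = rollout_reward([0, 0], 4, uniform_policy, fraction_of_ones)
    print(good, bad)
```

In a full policy-gradient loop, an estimate like this would weight the log-probability of the action that produced the prefix; that training step is omitted here.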
Robust Task Clustering for Deep Many-Task Learning
Yu, Mo, Guo, Xiaoxiao, Yi, Jinfeng, Chang, Shiyu, Potdar, Saloni, Tesauro, Gerald, Wang, Haoyu, Zhou, Bowen
We investigate task clustering for deep-learning-based multi-task and few-shot learning in a many-task setting. We propose a new method to measure task similarities with a cross-task transfer performance matrix for the deep learning scenario. Although this matrix provides us with critical information regarding similarity between tasks, its asymmetric property and unreliable performance scores can affect conventional clustering methods adversely. Additionally, the uncertain task-pairs, i.e., the ones with extremely asymmetric transfer scores, may collectively mislead clustering algorithms to output an inaccurate task-partition. To overcome these limitations, we propose a novel task-clustering algorithm that uses the matrix completion technique. The proposed algorithm constructs a partially-observed similarity matrix based on the certainty of cluster membership of the task-pairs. We then use a matrix completion algorithm to complete the similarity matrix. Our theoretical analysis shows that under mild constraints, the proposed algorithm will perfectly recover the underlying "true" similarity matrix with a high probability. Our results show that the new task clustering method can discover task clusters for training flexible and superior neural network models in a multi-task learning setup for sentiment classification and dialog intent classification tasks. Our task clustering approach also extends metric-based few-shot learning methods to adapt multiple metrics, which demonstrates empirical advantages when the tasks are diverse.
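The step of building a partially-observed similarity matrix from an asymmetric transfer matrix can be illustrated with a small sketch. The rule below is an assumption made for illustration, not the paper's actual criterion: a task pair is marked similar (1) when transfer is strong in both directions, dissimilar (0) when it is weak in both directions, and left unobserved (None) when the two directions disagree. The function name and the thresholds hi and lo are hypothetical.

```python
def partial_similarity(transfer, hi=0.7, lo=0.4):
    """Build a symmetric, partially-observed similarity matrix.

    transfer[i][j] -- score of a model trained on task i, applied to task j
    hi, lo         -- hypothetical thresholds for "strong" and "weak" transfer
    Returns a matrix with entries 1, 0, or None (unobserved).
    """
    n = len(transfer)
    sim = [[None] * n for _ in range(n)]
    for i in range(n):
        sim[i][i] = 1  # every task is similar to itself
        for j in range(i + 1, n):
            a, b = transfer[i][j], transfer[j][i]
            if a >= hi and b >= hi:
                sim[i][j] = sim[j][i] = 1   # confident: same cluster
            elif a <= lo and b <= lo:
                sim[i][j] = sim[j][i] = 0   # confident: different clusters
            # Otherwise the directions disagree: leave the pair unobserved
            # so a matrix completion step can fill it in later.
    return sim


if __name__ == "__main__":
    scores = [
        [1.0, 0.9, 0.2],
        [0.8, 1.0, 0.9],
        [0.3, 0.1, 1.0],
    ]
    for row in partial_similarity(scores):
        print(row)
```

The unobserved entries are what a matrix completion algorithm would then estimate; that step is not shown.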
Understanding and Comparing Deep Neural Networks for Age and Gender Classification
Lapuschkin, Sebastian, Binder, Alexander, Müller, Klaus-Robert, Samek, Wojciech
Recently, deep neural networks have demonstrated excellent performance in recognizing age and gender from human face images. However, these models were applied in a black-box manner with no information provided about which facial features are actually used for prediction and how these features depend on image preprocessing, model initialization and architecture choice. We present a study investigating these different effects. In detail, our work compares four popular neural network architectures, studies the effect of pretraining, evaluates the robustness of the considered alignment preprocessings via cross-method test set swapping and intuitively visualizes the model's prediction strategies in given preprocessing conditions using the recent Layer-wise Relevance Propagation (LRP) algorithm. Our evaluations on the challenging Adience benchmark show that suitable parameter initialization leads to a holistic perception of the input, compensating artefactual data representations. With a combination of simple preprocessing steps, we reach state-of-the-art performance in gender recognition.