 Deep Learning



State-of-the-art neural coreference resolution for chatbots

#artificialintelligence

Coreference is a rather old NLP research topic [1]. It has seen a revival of interest in the past two years as several research groups [2] applied cutting-edge deep-learning and reinforcement-learning techniques to it. A paper published earlier this year suggests that coreference resolution may be instrumental in improving the performance of neural NLP architectures such as RNNs and LSTMs (see "Linguistic Knowledge as Memory for Recurrent Neural Networks" by B. Dhingra, Z. Yang, W. W. Cohen, and R. Salakhutdinov). Traditionally, the feature set was hand-engineered from linguistic features, and it could be huge. Some high-quality systems use 120 features [4]! Here comes the nice thing about modern NLP techniques like word vectors and neural nets.


Why the AI Industry Needs to Rethink Storage - Pure Storage Blog

#artificialintelligence

When deploying a deep learning training cluster, a system-level perspective is needed for a well-balanced solution. Let's take an example (shown above) of DGX-1 systems running the Microsoft Cognitive Toolkit (formerly known as CNTK) framework using AlexNet. NVIDIA published results showing that a DGX-1 can train at a throughput of 13K images per second. If images have an average size of 115KB, ten DGX-1 systems have a combined ingest throughput requirement of about 15 GB per second to keep the training job busy. Small-file read performance and IOPS are critical at this point, and can be the limiting factor in time to solution.
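The ingest figure above follows from simple arithmetic, which the short sketch below reproduces. The function name is illustrative only, decimal units are assumed (1 GB = 1,000,000 KB), and the file-read rate assumes one file per image, which the snippet does not state.

```python
def ingest_throughput_gb_per_s(images_per_s, image_kb, num_systems):
    """Aggregate read throughput in GB/s, assuming 1 GB = 1e6 KB."""
    return images_per_s * image_kb * num_systems / 1e6


# Figures quoted in the text: 13K images/s per DGX-1, 115 KB per image.
per_system = ingest_throughput_gb_per_s(13_000, 115, 1)
cluster = ingest_throughput_gb_per_s(13_000, 115, 10)

# Assumption: each image is stored as its own file, so every image
# consumed costs one small-file read.
file_reads_per_s = 13_000 * 10

print(f"{per_system:.3f} GB/s per system, {cluster:.2f} GB/s for ten systems")
print(f"{file_reads_per_s} small-file reads per second")
```

Under these assumptions the ten-system cluster needs just under 15 GB per second, which matches the rounded figure in the text.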


Why chatbots need a big push from deep learning

#artificialintelligence

Most tech giants are investing heavily in both applications and research, hoping to stay ahead of the curve of what many believe to be an inevitable AI-led paradigm shift. At the forefront of this resurgence are the fields of conversational interaction (personal assistants or chatbots), computer vision, and autonomous navigation, which, thanks to advances in hardware, data availability, and revolutionary machine learning techniques, have enjoyed tremendous progress within the span of just a few years. AI advances are turning problems previously thought to lie beyond the reach of machines into commodities that are percolating into our everyday lives. Following the remarkable growth in popularity enjoyed by AI, a new generation of chatbots has recently flooded the market, and with them the promise of a world where many of our online interactions won't happen on a website or in an app, but in a conversation. Helping turn this promise into reality is a combination of better user interfaces, the omnipresence of smartphones, and new, state-of-the-art machine learning techniques. Perhaps one of the main drivers behind this wave of novel AI applications is deep learning, an area of machine learning that, despite existing for roughly 50 years, has recently revolutionized fields such as computer vision and natural language processing (NLP).


A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks

arXiv.org Artificial Intelligence

Transfer and multi-task learning have traditionally focused on either a single source-target pair or very few, similar tasks. Ideally, the linguistic levels of morphology, syntax and semantics would benefit each other by being trained in a single model. We introduce a joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks. Higher layers include shortcut connections to lower-level task predictions to reflect linguistic hierarchies. We use a simple regularization term that allows all model weights to be optimized to improve one task's loss without causing catastrophic interference in the other tasks.


Stein Variational Adaptive Importance Sampling

arXiv.org Machine Learning

We propose a novel adaptive importance sampling algorithm that combines the Stein variational gradient descent (SVGD) algorithm with importance sampling (IS). Our algorithm leverages the nonparametric transforms in SVGD to iteratively decrease the KL divergence between our importance proposal and the target distribution. The advantages of this algorithm are twofold: first, our algorithm turns SVGD into a standard IS algorithm, allowing us to use the standard diagnostic and analytic tools of IS to evaluate and interpret the results; second, we do not restrict the choice of our importance proposal to predefined distribution families, as traditional (adaptive) IS methods do. Empirical experiments demonstrate that our algorithm performs well on evaluating partition functions of restricted Boltzmann machines and testing likelihood of variational auto-encoders.
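For readers unfamiliar with the importance sampling part of this abstract, here is a minimal sketch of plain IS used to estimate a partition function. It is not the paper's algorithm: the proposal is a fixed Gaussian rather than one adapted by SVGD, and the one-dimensional target, the function names, and the parameter values are all assumptions chosen so the true answer, sqrt(2π), is known.

```python
import math
import random


def unnormalized_target(x):
    """Toy unnormalized density exp(-x^2 / 2); its true normalizer is sqrt(2*pi)."""
    return math.exp(-0.5 * x * x)


def proposal_pdf(x, sigma):
    """Density of the zero-mean Gaussian proposal with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def estimate_partition_function(num_samples, sigma=2.0, seed=0):
    """Average the importance weights target(x) / proposal(x) over proposal draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.0, sigma)
        total += unnormalized_target(x) / proposal_pdf(x, sigma)
    return total / num_samples


z_hat = estimate_partition_function(20_000)
print(f"estimate: {z_hat:.4f}, exact: {math.sqrt(2.0 * math.pi):.4f}")
```

The quality of the estimate depends on how closely the proposal matches the target. That dependence is what adaptive schemes such as the one described in the abstract aim to reduce.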


Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression

arXiv.org Artificial Intelligence

We design a new approach that allows a robot to learn new activities from unlabeled human example videos. Given videos of humans executing the same activity from a human's viewpoint (i.e., first-person videos), our objective is to make the robot learn the temporal structure of the activity as its future regression network, and learn to transfer such a model to its own motor execution. We present a new deep learning model: we extend the state-of-the-art convolutional object detection network for the representation/estimation of human hands in training videos, and introduce the concept of using a fully convolutional network to regress (i.e., predict) the intermediate scene representation corresponding to a future frame (e.g., 1-2 seconds later). Combining these allows direct prediction of the future locations of human hands and objects, which enables the robot to infer the motor control plan using our manipulation network. We experimentally confirm that our approach makes learning of robot activities from unlabeled human interaction videos possible, and demonstrate that our robot is able to execute the learned collaborative activities in real time, directly based on its camera input.


Her2 Challenge Contest: A Detailed Assessment of Automated Her2 Scoring Algorithms in Whole Slide Images of Breast Cancer Tissues

arXiv.org Artificial Intelligence

Evaluating expression of the Human epidermal growth factor receptor 2 (Her2) by visual examination of immunohistochemistry (IHC) on invasive breast cancer (BCa) is a key part of the diagnostic assessment of BCa due to its recognised importance as a predictive and prognostic marker in clinical practice. However, visual scoring of Her2 is subjective and consequently prone to inter-observer variability. Given the prognostic and therapeutic implications of Her2 scoring, a more objective method is required. In this paper, we report on a recent automated Her2 scoring contest, held in conjunction with the annual PathSoc meeting in Nottingham in June 2016, aimed at systematically comparing and advancing the state-of-the-art Artificial Intelligence (AI) based automated methods for Her2 scoring. The contest dataset comprised digitised whole slide images (WSI) of sections from 86 cases of invasive breast carcinoma stained with both Haematoxylin & Eosin (H&E) and IHC for Her2. The contesting algorithms automatically predicted scores of the IHC slides for an unseen subset of the dataset, and the predicted scores were compared with the 'ground truth' (a consensus score from at least two experts). We also report on a simple Man vs Machine contest for the scoring of Her2 and show that the automated methods could beat the pathology experts on this contest dataset. This paper presents a benchmark for comparing the performance of automated algorithms for scoring of Her2. It also demonstrates the enormous potential of automated algorithms in assisting the pathologist with objective IHC scoring.


There's a big problem with AI: even its creators can't explain how it works

#artificialintelligence

The car's underlying AI technology, known as deep learning, has proved very powerful at solving problems in recent years, and it has been widely deployed for tasks like image captioning, voice recognition, and language translation. The resulting program, which the researchers named Deep Patient, was trained using data from about 700,000 individuals, and when tested on new records, it proved incredibly good at predicting disease. But it was not until the start of this decade, after several clever tweaks and refinements, that very large--or "deep"--neural networks demonstrated dramatic improvements in automated perception. Deep learning has transformed computer vision and dramatically improved machine translation.


Deep Learning and the Future of Auditing - The CPA Journal

#artificialintelligence

This article introduces deep learning technology--an emerging form of artificial intelligence that can be trained to recognize patterns in volumes of data too vast for humans to process. This still-evolving technology offers a way to use big data to create supplementary audit evidence that improves the effectiveness and efficiency of audit automation and decision making. The authors also discuss the application of these techniques to audit procedures. In the current business environment, the development of data-intensive technologies (e.g., ERP systems, sensors, cloud storage, remote communication tools) facilitates the production and maintenance of large amounts of data, which necessitates a new data environment and serves as a motivator for audit automation. Leading accounting firms have leveraged deep learning, a cutting-edge use of artificial intelligence, to conduct audit tasks.