"Many researchers … speculate that the information-processing abilities of biological neural systems must follow from highly parallel processes operating on representations that are distributed over many neurons. [Artificial neural networks] capture this kind of highly parallel computation based on distributed representations"
– from Machine Learning (Section 4.1.1; page 82) by Tom M. Mitchell, McGraw Hill Companies, Inc. (1997).
"We may be in the eternal spring of AI," says Andrew Ng, a luminary in the field of machine learning. Ng, a co-founder and former director of Google's AI team, sat down for an interview with ZDNet to discuss his just-published "playbook" for how to use the technology, which is available as a free download. He dismissed worries that artificial intelligence technology may be entering another one of its periodic "winters," when interest and funding drop off sharply. Machine learning, in the form of "connectionist" theories that model computing loosely along the lines of neurons in the brain, has gone through boom and bust cycles, flowering initially with Frank Rosenblatt's "perceptron" in the late 1950s, cooling in the late 1960s, emerging again in the late 1980s only to again fall out of favor, and now suddenly back in vogue in the last several years. Those periodic coolings have been termed an "AI winter."
This week, at the International Electron Devices Meeting (IEDM) and the Conference on Neural Information Processing Systems (NeurIPS), IBM researchers will showcase new hardware that will take AI further than it's been before: right to the edge. Our novel approaches for digital and analog AI chips boost speed and slash energy demand for deep learning, without sacrificing accuracy. On the digital side, we're setting the stage for a new industry standard in AI training with an approach that achieves full accuracy with eight-bit precision, accelerating training time by two to four times over today's systems. On the analog side, we report eight-bit precision--the highest yet--for an analog chip, roughly doubling accuracy compared with previous analog chips while consuming 33x less energy than a digital architecture of similar precision. These achievements herald a new era of computing hardware designed to unleash the full potential of AI.
What happens as more of the world's computer tasks get handed over to neural networks? That's an intriguing prospect, of course, for Nvidia, a company selling a whole heck of a lot of chips to train neural networks. The prospect cheers Bryan Catanzaro, who is the head of applied deep learning research at Nvidia. "We would love for model-based to be more of the workload," Catanzaro told ZDNet this week during an interview at Nvidia's booth at the NeurIPS machine learning conference in Montreal. Catanzaro was the first person doing neural network work at Nvidia when he took a job there in 2011 after receiving his PhD from the University of California at Berkeley in electrical engineering and computer science.
IBM is unveiling new hardware that brings power efficiency and improved training times to artificial intelligence (AI) projects this week at the International Electron Devices Meeting (IEDM) and the Conference on Neural Information Processing Systems (NeurIPS), with 8-bit precision for both its analog and digital chips for AI. Over the last decade, computing performance for AI has improved at a rate of 2.5x per year, due in part to the use of GPUs to accelerate deep learning tasks, the company noted in a press release. However, this rate of improvement is not sustainable: the current design model, a general-purpose computing solution tailored to AI, will not be able to keep pace with hardware designed exclusively for AI training and development. Per the press release, "Scaling AI with new hardware solutions is part of a wider effort at IBM Research to move from narrow AI, often used to solve specific, well-defined tasks, to broad AI, which reaches across disciplines to help humans solve our most pressing problems." While traditional computing has been on a decades-long path of increasing address width, with most consumer, professional, and enterprise-grade hardware using 64-bit processors, AI is going in the opposite direction.
There are some detection problems in the world that only experts can solve, and by solving them those experts save lives every day. Radiologists looking for intracerebral hemorrhage (ICH) are one example, but their time is scarce and expensive. But what if we could build an AI to perform this sort of detection? It is no simple task to train a CNN model, such as U-Net, to achieve this. But with the progress of deep learning libraries such as TensorFlow, the revolution of cloud providers such as AWS, Azure, and GCP, and deep learning platforms such as MissingLink, it's becoming increasingly feasible for startups to build an app at almost any scale--including to mimic the work of radiologists and other experts.
We will walk you through all aspects of machine learning, from simple linear regression to the latest neural networks, and you will learn not only how to use these models but also how to build them from scratch. A large part of this path focuses on computer vision (CV), because it is the fastest way to gain general knowledge, and experience from CV can be readily transferred to any ML area. We will use TensorFlow as the ML framework, as it is the most promising and production-ready. You will learn better if you work through the theoretical and practical materials at the same time, so that you get hands-on experience with what you have learned. Also, if you want to compete with other people on real-life problems, I recommend registering on Kaggle, as it could be a good addition to your resume.
Semiconductor Engineering sat down to discuss artificial intelligence (AI), machine learning, and chip and photomask manufacturing technologies with Aki Fujimura, chief executive of D2S; Jerry Chen, business and ecosystem development manager at Nvidia; Noriaki Nakayamada, senior technologist at NuFlare; and Mikael Wahlsten, director and product area manager at Mycronic. What follows are excerpts of that conversation. SE: Artificial neural networks, the precursor of machine learning, were a hot topic in the 1980s. In neural networks, a system crunches data and identifies patterns.
The new MIT Stephen A. Schwarzman College of Computing will incorporate the modern tools of computing into disciplines across the Institute. "The college will equip students to be as fluent in computing and AI [artificial intelligence] as they are in their own disciplines -- and ready to use these digital tools wisely and humanely to help make a better world," says MIT President Rafael Reif. As often happens, it appears MIT students are already there. We recently spoke with six undergraduate students who are participating in the Advanced Undergraduate Research Opportunities Program (SuperUROP), and found them already thinking deeply about how new computational technologies can be put to use in fields outside of computer science. These students are working on a huge range of problems that share a common theme: Solving them will provide tangible benefits to society.
Networks are fundamental building blocks for representing data and computations. Remarkable progress in learning in structurally defined (shallow or deep) networks has recently been achieved. Here we introduce an evolutionary exploratory search and learning method for topologically flexible networks under the constraint of producing elementary computational steady-state input-output operations. Our results include: (1) the identification of networks, over four orders of magnitude, implementing computation of steady-state input-output functions, such as a band-pass filter, a threshold function, and an inverse band-pass function. (2) The learned networks are technically controllable, as only a small number of driver nodes are required to move the system to a new state. Furthermore, we find that the fraction of required driver nodes is constant during evolutionary learning, suggesting a stable system design. (3) Our framework allows multiplexing of different computations using the same network. For example, using a binary representation of the inputs, the network can readily compute three different input-output functions. Finally, (4) the proposed evolutionary learning demonstrates transfer learning. If the system learns one function A, then learning B requires on average fewer steps than learning B from tabula rasa. We conclude that the constrained evolutionary learning produces large, robust, controllable circuits, capable of multiplexing and transfer learning. Our study suggests that network-based computations of steady-state functions, representing either cellular modules of cell-to-cell communication networks or internal molecular circuits communicating within a cell, could be a powerful model for biologically inspired computing. This complements conceptualizations such as attractor-based models or reservoir computing.
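To make the three steady-state input-output functions named in this abstract concrete, the following minimal Python sketch writes them as idealized target functions and scores a candidate against a target. Everything here is an illustrative assumption rather than the authors' setup: the scalar input range, the cutoff values (0.5, 0.3, 0.7), the function names, and the use of mean squared error as the score an evolutionary search might minimize.

```python
def threshold_fn(x, cutoff=0.5):
    """Idealized threshold: output 1 once the input reaches the cutoff, else 0."""
    return 1.0 if x >= cutoff else 0.0

def band_pass(x, low=0.3, high=0.7):
    """Idealized band-pass: output 1 only for inputs inside [low, high]."""
    return 1.0 if low <= x <= high else 0.0

def inverse_band_pass(x, low=0.3, high=0.7):
    """Idealized inverse band-pass: output 1 only for inputs outside [low, high]."""
    return 1.0 - band_pass(x, low, high)

def fitness(candidate, target, inputs):
    """Mean squared error between candidate and target outputs (lower is better)."""
    return sum((candidate(x) - target(x)) ** 2 for x in inputs) / len(inputs)

# Hypothetical grid of input levels at which the steady-state output is compared.
inputs = [i / 10 for i in range(11)]

perfect_score = fitness(band_pass, band_pass, inputs)          # identical behaviour
worst_score = fitness(inverse_band_pass, band_pass, inputs)    # opposite everywhere
```

In a search of the kind the abstract describes, `candidate` would be the measured steady-state response of an evolved network rather than a closed-form function.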
Anomaly detection in supercomputers is a very difficult problem due to the large scale of the systems and the high number of components. The current state of the art for automated anomaly detection employs Machine Learning methods or statistical regression models in a supervised fashion, meaning that the detection tool is trained to distinguish among a fixed set of behaviour classes (healthy and unhealthy states). We propose a novel approach for anomaly detection in High Performance Computing systems based on a Machine (Deep) Learning technique, namely a type of neural network called an autoencoder. The key idea is to train a set of autoencoders to learn the normal (healthy) behaviour of the supercomputer nodes and, after training, use them to identify abnormal conditions. This is different from previous approaches, which were based on learning the abnormal conditions, for which there are much smaller datasets (since it is very hard to identify them to begin with). We test our approach on a real supercomputer equipped with a fine-grained, scalable monitoring infrastructure that can provide large amounts of data to characterize the system behaviour. The results are extremely promising: after the training phase to learn the normal system behaviour, our method is capable of detecting anomalies that have never been seen before with very good accuracy (values ranging between 88% and 96%).
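The key idea, training only on healthy data and flagging inputs that reconstruct poorly, can be sketched with a toy example. The Python code below is a minimal illustration under stated assumptions, not the authors' model: it uses a one-unit, tied-weight linear autoencoder instead of a deep network, synthetic two-feature "healthy" samples in place of real monitoring data, and the largest training reconstruction error as the anomaly threshold. All names, hyperparameters, and data are hypothetical.

```python
import random

def reconstruct(w, x):
    """Encode x to one hidden value h = w.x, then decode it back as h * w."""
    h = sum(wi * xi for wi, xi in zip(w, x))
    return h, [h * wi for wi in w]

def reconstruction_error(w, x):
    """Squared distance between x and its reconstruction."""
    _, x_hat = reconstruct(w, x)
    return sum((xi - ri) ** 2 for xi, ri in zip(x, x_hat))

def train_autoencoder(samples, lr=0.01, epochs=50):
    """Fit the tied-weight linear autoencoder with plain SGD on squared error."""
    w = [0.5, 0.5]  # fixed starting point so the run is repeatable
    for _ in range(epochs):
        for x in samples:
            h, x_hat = reconstruct(w, x)
            e = [xi - ri for xi, ri in zip(x, x_hat)]
            we = sum(wi * ei for wi, ei in zip(w, e))
            # Gradient of ||x - w (w.x)||^2 w.r.t. w is -2 (h e + (w.e) x),
            # so a descent step adds lr * 2 * (h e + (w.e) x).
            w = [wi + lr * 2 * (h * ei + we * xi)
                 for wi, ei, xi in zip(w, e, x)]
    return w

# Synthetic "healthy" samples (assumption): the second feature tracks
# twice the first, plus a little measurement noise.
rng = random.Random(0)
healthy = []
for _ in range(200):
    load = rng.uniform(-1.0, 1.0)
    healthy.append([load, 2.0 * load + rng.gauss(0.0, 0.05)])

weights = train_autoencoder(healthy)

# Assumed decision rule: anything that reconstructs worse than the worst
# healthy training sample is treated as anomalous.
threshold = max(reconstruction_error(weights, x) for x in healthy)

def is_anomaly(x):
    """True when x reconstructs worse than every healthy training sample."""
    return reconstruction_error(weights, x) > threshold
```

For example, `is_anomaly([1.0, -2.0])` breaks the learned relationship between the two features and is flagged, while `is_anomaly([0.2, 0.4])` follows it and is not. No labelled anomalies are needed at training time, which is the property the abstract emphasizes.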