Editor's note: The name of the NVIDIA Transfer Learning Toolkit was changed to NVIDIA TAO Toolkit in August 2021. All references to the name have been updated in this blog. You probably have a career. But hit the books for a graduate degree or take online certificate courses by night, and you could start a new career building on your past experience. Transfer learning is the same idea.
Artificial intelligence (AI) has become a part of everyday conversation and our lives. It is regarded as the new electricity that is revolutionizing the world. Both industry and academia are investing heavily in AI. However, there is also a lot of hype in the current AI debate. AI based on so-called deep learning has achieved impressive results in many problems, but its limits are already visible. AI has been under research since the 1940s, and the industry has seen many ups and downs due to inflated expectations and the disappointments that followed. The purpose of this book is to give a realistic picture of AI, its history, its potential, and its limitations. We believe that AI is a helper, not a ruler of humans. We begin by describing what AI is and how it has evolved over the decades. After the fundamentals, we explain the importance of massive data for the current mainstream of artificial intelligence. The most common representations, methods, and machine learning techniques for AI are covered. In addition, the main application areas are introduced. Computer vision has been central to the development of AI. The book provides a general introduction to computer vision and includes an exposure to the results and applications of our own research. Emotions are central to human intelligence, but little use has been made of them in AI. We present the basics of emotional intelligence and our own research on the topic. We discuss super-intelligence that transcends human understanding, explaining why such an achievement seems impossible on the basis of present knowledge, and how AI could be improved. Finally, we summarize the current state of AI and what to do in the future. In the appendix, we look at the development of AI education, especially from the perspective of the course content at our own university.
In the era of big data, data-driven classification has become an essential method in smart manufacturing to guide production and optimize inspection. The industrial data obtained in practice is usually time-series data collected by soft sensors, which is highly nonlinear, nonstationary, imbalanced, and noisy. Most existing soft-sensing machine learning models focus on capturing either intra-series temporal dependencies or pre-defined inter-series correlations, while ignoring the correlation between labels, even though each instance is associated with multiple labels simultaneously. In this paper, we propose a novel graph-based soft-sensing neural network (GraSSNet) for multivariate time-series classification of noisy and highly imbalanced soft-sensing data. The proposed GraSSNet is able to 1) capture the inter-series and intra-series dependencies jointly in the spectral domain; 2) exploit the label correlations by superimposing a label graph built from statistical co-occurrence information; 3) learn features with an attention mechanism from both the textual and numerical domains; and 4) leverage unlabeled data and mitigate data imbalance through semi-supervised learning. Comparative studies with other commonly used classifiers are carried out on Seagate soft-sensing data, and the experimental results validate the competitive performance of our proposed method.
Mixed precision training offers significant computational speedup by performing operations in half-precision format, while storing minimal information in single precision to retain as much information as possible in critical parts of the network. Since the introduction of Tensor Cores in the Volta and Turing architectures, switching to mixed precision has yielded significant training speedups, up to 3x overall on the most arithmetically intense model architectures. The ability to train deep learning networks with lower precision was introduced in the Pascal architecture and first supported in CUDA 8 in the NVIDIA Deep Learning SDK. Mixed precision is the combined use of different numerical precisions in a computational method. Half-precision (FP16) data, compared to higher-precision FP32 or FP64 data, reduces the memory usage of the neural network, allowing training and deployment of larger networks, and FP16 data transfers take less time than FP32 or FP64 transfers.
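The trade-off described above can be illustrated with a minimal sketch that uses only the Python standard library. It is not NVIDIA's implementation and does not touch a GPU: it simply rounds values to IEEE 754 half and single precision with the `struct` module to show the storage sizes and why a higher-precision copy of the weights is worth keeping. The helper names (`round_to`, `apply_updates`) and the toy weight, update size, and step count are assumptions chosen for illustration.

```python
import struct


def round_to(fmt, value):
    """Round a Python float to an IEEE 754 format via a pack/unpack round trip."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# Storage cost per value: "<e" is half, "<f" is single, "<d" is double precision.
BYTES_PER_VALUE = {
    name: struct.calcsize(fmt)
    for name, fmt in [("FP16", "<e"), ("FP32", "<f"), ("FP64", "<d")]
}


def apply_updates(weight, update, steps, fmt):
    """Add a small update repeatedly, rounding the stored weight to fmt each step.

    The addition itself is done in Python's double precision; only the stored
    result is rounded, which is enough to show the effect of the storage format.
    """
    for _ in range(steps):
        weight = round_to(fmt, weight + update)
    return weight


# Hypothetical toy values: a weight of 1.0 receiving ten updates of 1e-4.
fp16_only = apply_updates(1.0, 1e-4, 10, "<e")    # every update is rounded away
fp32_master = apply_updates(1.0, 1e-4, 10, "<f")  # updates accumulate
fp16_copy = round_to("<e", fp32_master)           # half-precision copy for compute

print(BYTES_PER_VALUE)
print(fp16_only, fp32_master, fp16_copy)
```

Near 1.0 the spacing between adjacent FP16 values is 2^-10 (about 0.00098), so each 1e-4 update disappears when the weight is stored only in half precision, whereas the single-precision copy accumulates all ten updates before being rounded down for computation.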
With the rapid development of AI technology in recent years, there have been many studies applying deep learning models to soft sensing. However, while the models have become more complex, the data sets remain limited: researchers are fitting million-parameter models with hundreds of data samples, which is insufficient to exercise the effectiveness of their models, and the models thus often fail to perform when implemented in industrial applications. To solve this long-standing problem, we are providing large-scale, high-dimensional time-series manufacturing sensor data from Seagate Technology to the public. We demonstrate the challenges and effectiveness of modeling industrial big data with a Soft Sensing Transformer model on these data sets. The transformer is used because it has outperformed state-of-the-art techniques in natural language processing and has since also performed well when applied directly to computer vision without the introduction of image-specific inductive biases. We observe the similarity between sentence structure and sensor readings and process the multivariable sensor readings in a time series in a manner similar to sentences in natural language. The high-dimensional time-series data is formatted into the same shape as embedded sentences and fed into the transformer model. The results show that the transformer model outperforms the benchmark models in the soft sensing field, which are based on auto-encoder and long short-term memory (LSTM) models. To the best of our knowledge, we are the first team in academia or industry to benchmark the performance of the original transformer model with large-scale numerical soft sensing data.
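One way to picture the "sentence-shaped" formatting is the sketch below, which treats each time step's vector of sensor readings as a token and groups consecutive time steps into fixed-length sequences. This is an illustrative assumption, not the authors' preprocessing: the function name `to_token_sequences`, the non-overlapping windows, and the zero padding are all hypothetical choices, and a real pipeline would typically follow this with a learned projection to the model dimension.

```python
def to_token_sequences(readings, seq_len, pad_value=0.0):
    """Split a (time_steps x num_sensors) series into fixed-length windows.

    Each window plays the role of a sentence and each time step the role of a
    token whose "embedding" is the vector of sensor values at that step.
    The last window is padded with pad_value so that every window has seq_len
    steps.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    if not readings:
        return []
    num_sensors = len(readings[0])
    sequences = []
    for start in range(0, len(readings), seq_len):
        window = [list(row) for row in readings[start:start + seq_len]]
        while len(window) < seq_len:
            window.append([pad_value] * num_sensors)
        sequences.append(window)
    return sequences


# Hypothetical toy data: 5 time steps from 3 sensors.
readings = [[float(t), float(t) * 0.5, float(t) * 2.0] for t in range(5)]
batch = to_token_sequences(readings, seq_len=2)

# (num_sequences, seq_len, num_sensors), analogous to (batch, tokens, embedding)
shape = (len(batch), len(batch[0]), len(batch[0][0]))
print(shape)
```

The resulting nested list has the same three-axis layout as a batch of embedded sentences, which is what allows a standard transformer encoder to consume numerical sensor data without architectural changes.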
A new learned legged locomotion study uses massive parallelism on a single GPU to get robots up and walking on flat terrain in under four minutes, and on uneven terrain in twenty minutes. Although deep reinforcement learning (DRL) has achieved impressive results in robotics, the amount of data required to train a policy increases dramatically with task complexity. One way to improve the quality and time-to-deployment of DRL policies is to use massive parallelism. In the paper Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning, a research team from ETH Zurich and NVIDIA proposes a training framework that enables fast policy generation for real-world robotic tasks using massive parallelism on a single workstation GPU. Compared to previous methods, the approach can reduce training time by multiple orders of magnitude.
If you're tackling a degree in science, technology, engineering, or mathematics, there's nothing more frustrating than a machine that can't keep up with the apps you need for your coursework. Here's where a powerful gaming laptop proves its mettle. With GPU acceleration, your machine delivers super-fast image processing, real-time rendering for complex component designs, and it lets you work quickly and efficiently. For engineering students, this means more interactive, real-time rendering for 3D design and modeling, plus faster solutions and visualization for mechanical, structural, and electrical simulations. For computer science, data science, and economics students, NVIDIA's GeForce RTX 30 Series laptops enable faster data analytics for processing large data sets -- all with efficient training for deep learning and traditional machine learning models for computer vision, natural language processing, and tabular data.
Good news for folks looking to learn about the latest AI development techniques: Nvidia is now allowing the general public to access the online workshops it provides through its Deep Learning Institute (DLI). The GPU giant announced today that selected workshops in the DLI catalog will be open to everybody. These workshops previously were available only to companies that wanted specialized training for their in-house developers, or to folks who had attended the company's GPU Technology Conferences. Two of the open courses will take place next month, including "Fundamentals of Accelerated Computing with CUDA Python," which explores developing parallel workloads with CUDA and NumPy and costs $500. There is also "Applications of AI for Predictive Maintenance," which explores technologies like XGBoost, LSTM, Keras, and TensorFlow, and costs $700.
It is a long-term goal to transfer biological processing principles as well as the power of human recognition into machine vision and engineering systems. One such principle is visual attention, a smart human concept which focuses processing on a part of a scene. In this contribution, we utilize attention to improve the automatic detection of defect patterns for wafers within the domain of semiconductor manufacturing. Previous works in the domain have often utilized classical machine learning approaches such as KNNs, SVMs, or MLPs, while a few have already used modern approaches like deep neural networks (DNNs). However, one problem in the domain is that the faults are often very small and have to be detected within the much larger area of a chip or even a wafer. Therefore, structures only a few pixels in size have to be detected in a vast amount of image data. One interesting principle of the human brain for solving this problem is visual attention. Hence, we employ here a biologically plausible model of visual attention for automatic visual inspection. We propose a hybrid system of visual attention and a deep neural network. As demonstrated, our system achieves, among other decisive advantages, an improvement in overall accuracy from 81% to 92%, and an increase in accuracy for detecting faults from 67% to 88%. Hence, the error rates are reduced from 19% to 8%, and notably from 33% to 12% for detecting a fault in a chip. These results show that attention can greatly improve the performance of visual inspection systems. Furthermore, we conduct a broad evaluation, identifying specific advantages of the biological attention model in this application, and benchmark standard deep learning approaches as an alternative with and without attention. This work is an extended arXiv version of the original conference article published in "IECON 2020", which has been extended regarding visual attention.
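The relationship between the reported accuracies and error rates can be checked with a short calculation. The sketch below uses only the percentages quoted above; the relative error reduction is a derived quantity that the abstract does not state, and the function names are chosen here for illustration.

```python
def error_rate(accuracy_pct):
    """Error rate in percent, as the complement of accuracy in percent."""
    return 100 - accuracy_pct


def relative_error_reduction(acc_before_pct, acc_after_pct):
    """Fraction of the original errors that the improved system eliminates."""
    before = error_rate(acc_before_pct)
    after = error_rate(acc_after_pct)
    return (before - after) / before


# Overall accuracy: 81% -> 92%, i.e. error 19% -> 8%.
overall = relative_error_reduction(81, 92)
# Fault-detection accuracy: 67% -> 88%, i.e. error 33% -> 12%.
faults = relative_error_reduction(67, 88)

print(f"overall: {overall:.1%}, faults: {faults:.1%}")
```

On these figures, roughly 58% of the overall errors and roughly 64% of the fault-detection errors are removed, which is a more telling measure than the absolute gain in accuracy points.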
Following MIT, researchers at NVIDIA have recently developed a new augmentation method for training Generative Adversarial Networks (GANs) with a limited amount of data. The approach is an adaptive discriminator augmentation mechanism that significantly stabilises training in limited-data regimes. Machine learning models are data-hungry. As a matter of fact, in the past few years, we have seen that models fed with large amounts of data produce outstanding predictive outcomes. Alongside this growth, Generative Adversarial Networks have been successfully used for various applications, including high-fidelity natural image synthesis, data augmentation tasks, and improving image compression. From emoting realistic expressions to traversing deep space, and from bridging the gap between humans and machines to introducing new and unique art forms, GANs have it all covered. Although deep neural network models, including GANs, have shown impressive results, there remains the challenge of collecting a large number of specific datasets.