Deep Learning Performance


The Impact of Scanner Domain Shift on Deep Learning Performance in Medical Imaging: an Experimental Study

Gregory Szumel, Brian Guo, Darui Lu, Rongze Gui, Tingyu Wang, Nicholas Konz, Maciej A. Mazurowski

arXiv.org Artificial Intelligence

Purpose: Medical images acquired using different scanners and protocols can differ substantially in their appearance. This phenomenon, scanner domain shift, can result in a drop in the performance of deep neural networks which are trained on data acquired by one scanner and tested on another. This significant practical issue is well acknowledged; however, no systematic study of it is available across different modalities and diagnostic tasks. Materials and Methods: In this paper, we present a broad experimental study evaluating the impact of scanner domain shift on convolutional neural network performance for different automated diagnostic tasks. We evaluate this phenomenon in common radiological modalities, including X-ray, CT, and MRI. Results: We find that network performance on data from a different scanner is almost always worse than on same-scanner data, and we quantify the degree of performance drop across different datasets. Notably, we find that this drop is, on average, most severe for MRI, moderate for X-ray, and quite small for CT, which we attribute to the standardized nature of CT acquisition systems, a standardization not present in MRI or X-ray. We also study how injecting varying amounts of target-domain data into the training set, as well as adding noise to the training data, helps with generalization. Conclusion: Our results provide extensive experimental evidence and quantification of the extent of the performance drop caused by scanner domain shift in deep learning across different modalities, with the goal of guiding the future development of robust deep learning models for medical image analysis.
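The two mitigation strategies mentioned at the end of the abstract lend themselves to a short illustration. The sketch below is not the authors' code; it uses synthetic NumPy arrays and assumed parameter names (target_fraction, sigma) to show roughly what injecting target-scanner data into the training set and adding noise to the training images might look like.

```python
# Hypothetical sketch (not the paper's code): mixing a fraction of
# target-scanner images into training and adding Gaussian noise.
import numpy as np

rng = np.random.default_rng(0)

# Placeholder image arrays standing in for source- and target-scanner data.
source_images = rng.normal(size=(1000, 1, 64, 64)).astype("float32")
target_images = rng.normal(size=(1000, 1, 64, 64)).astype("float32")

def mixed_training_set(source, target, target_fraction=0.1):
    """Return a training set where target_fraction of the source-set size
    is drawn from the target scanner (an assumed parameter, not the paper's)."""
    n_target = int(len(source) * target_fraction)
    idx = rng.choice(len(target), size=n_target, replace=False)
    return np.concatenate([source, target[idx]], axis=0)

def add_gaussian_noise(images, sigma=0.05):
    """Noise augmentation: perturb intensities to blur scanner-specific cues."""
    return images + rng.normal(scale=sigma, size=images.shape).astype("float32")

train_set = add_gaussian_noise(mixed_training_set(source_images, target_images))
print(train_set.shape)  # (1100, 1, 64, 64)
```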


Optimizing AI and Deep Learning Performance

#artificialintelligence

As the use of AI and deep learning skyrockets, organizations are finding that they run these systems on the same kinds of resources as high-performance computing (HPC) systems, and wondering whether this is the path to peak efficiency. Ostensibly, AI and HPC architectures have a lot in common, as AI has evolved into the even more data-intensive machine learning (ML) and deep learning (DL) domains (Figure 1). First, workloads often require multiple GPU systems operating as a cluster, shared in a coordinated way among multiple data scientists. Second, both AI and HPC workloads require shared access to data at a high level of performance and communicate over a fast RDMA-enabled network. Especially in scientific research, classic HPC systems nowadays tend to have GPUs added to the compute nodes so that the same cluster is suitable for both classic HPC and new AI/DL workloads.


Deep learning performance on Red Hat OpenShift with Supermicro

#artificialintelligence

Red Hat and Supermicro ran the AI workload, MLPerf Training v0.6, on the Red Hat OpenShift Container Platform with Supermicro hardware and compared it to the MLPerf Training v0.6 results published by Nvidia. In addition to excellent performance, we demonstrated how OpenShift provides easy access to high-performance machine learning model training when running on this Supermicro reference architecture.


Flex Logix Improves Deep Learning Performance By 10X With New EFLX4K AI eFPGA Core

#artificialintelligence

This new core has been specifically designed to enhance the performance of deep learning by 10X and enable more neural network processing per square millimeter. Many companies are using FPGAs to implement AI, and more specifically machine learning, deep learning, and neural networks, as approaches to achieve AI. The key functions needed for AI are matrix multipliers, which consist of arrays of MACs (multiplier-accumulators). In existing FPGAs and eFPGAs, the MACs are optimized for DSP work, with larger multipliers, pre-adders, and other logic that are overkill for AI. For AI applications, smaller multipliers such as 16 bits or 8 bits, with the ability to support both modes with accumulators, allow more neural network processing per square millimeter.
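As a rough illustration of the point about MAC arrays and narrow multipliers, the sketch below (generic code, not anything from Flex Logix) shows a matrix multiply decomposed into 8-bit multiplies accumulated in a 32-bit register, which is essentially the operation each MAC in such an array performs.

```python
# Illustrative sketch only: a matrix multiply decomposed into
# multiply-accumulate (MAC) steps with narrow (int8) multipliers
# feeding a wider (int32) accumulator.
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-128, 127, size=(4, 8), dtype=np.int8)
B = rng.integers(-128, 127, size=(8, 4), dtype=np.int8)

# Each output element is a running sum of 8-bit products in a 32-bit accumulator.
C = np.zeros((A.shape[0], B.shape[1]), dtype=np.int32)
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        acc = np.int32(0)
        for k in range(A.shape[1]):
            acc += np.int32(A[i, k]) * np.int32(B[k, j])  # one MAC step
        C[i, j] = acc

# Matches the reference result computed with wide arithmetic throughout.
assert np.array_equal(C, A.astype(np.int32) @ B.astype(np.int32))
```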


Nvidia Volta GPU has over 120 Teraflops for Deep Learning and 5X the power of the Nvidia Pascal GPU - NextBigFuture.com

#artificialintelligence

The company also announced its first Volta-based processor, the NVIDIA Tesla V100 data center GPU, which brings extraordinary speed and scalability for AI inferencing and training, as well as for accelerating HPC and graphics workloads. "Artificial intelligence is driving the greatest technology advances in human history," said Jensen Huang, founder and chief executive officer of NVIDIA, who unveiled Volta at his GTC keynote. "It will automate intelligence and spur a wave of social progress unmatched since the industrial revolution." "Deep learning, a groundbreaking AI approach that creates computer software that learns, has insatiable demand for processing power. Thousands of NVIDIA engineers spent over three years crafting Volta to help meet this need, enabling the industry to realize AI's life-changing potential," he said.


The Next Battleground for Deep Learning Performance

#artificialintelligence

The frameworks are in place and the hardware infrastructure is robust, but what has been keeping machine learning performance at bay has far less to do with system-level capabilities and more to do with intense model optimization. While it might not be the sexy story that generates the unending wave of headlines around deep learning, hyperparameter tuning is a big barrier when it comes to new leaps in deep learning performance. In more traditional machine learning there are plenty of open source tools for this, but where it is needed most is in deep learning--an area that does appear to be gaining a solid enterprise foothold outside of the initial web companies that spun up services based on image, speech, and video recognition. Optimizing traditional machine learning and newer deep learning frameworks like TensorFlow is not simple--and it can have an incredible impact when it is done (or not done) well, providing orders-of-magnitude improvements in accuracy, performance, or efficiency, depending on what users tune for. Searching over the number and scope of hyperparameters in a TensorFlow-driven workload leaves humans in the dust, and optimizing with brute-force methods is computationally wasteful, at least if there is a more targeted, streamlined way of knob-turning for the desired model modifications (performance, accuracy, etc.).
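To make the contrast between brute-force and more targeted tuning concrete, here is a minimal sketch using scikit-learn's RandomizedSearchCV (a standard open source tool, not the commercial optimization service the article alludes to): a handful of sampled configurations stands in for an exhaustive grid.

```python
# Hedged illustration: randomized search over a small hyperparameter space
# instead of exhaustively evaluating every grid point.
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

param_space = {
    "hidden_layer_sizes": [(32,), (64,), (64, 32)],
    "alpha": [1e-5, 1e-4, 1e-3],
    "learning_rate_init": [1e-3, 1e-2],
}

# 8 sampled configurations instead of the 18 a full grid would require.
search = RandomizedSearchCV(
    MLPClassifier(max_iter=200, random_state=0),
    param_distributions=param_space,
    n_iter=8,
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```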


How To Improve Deep Learning Performance - Machine Learning Mastery

#artificialintelligence

How can you get better performance from your deep learning model? It is one of the most common questions I get asked. What can I do if my neural network performs poorly? I often reply with "I don't know exactly, but I have lots of ideas." Then I proceed to list out all of the ideas I can think of that might give a lift in performance. Rather than write out that list again, I've decided to put all of my ideas into this post. The ideas won't just help you with deep learning, but really with any machine learning algorithm. (Photo by Pedro Ribeiro Simões, some rights reserved.) This list of ideas is not complete, but it is a great start.
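As a flavor of the kind of item such lists typically contain, the short sketch below illustrates one common first step, standardizing input features; it is a generic example, not code taken from the article itself.

```python
# Minimal sketch of one idea that usually appears on such lists:
# standardize inputs before training so no feature dominates by scale alone.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=12.0, size=(200, 5))  # raw, unscaled features

X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)  # zero mean, unit variance
print(X_scaled.mean(axis=0).round(2), X_scaled.std(axis=0).round(2))
```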


"Better Than GPU" Deep Learning Performance with Intel Scalable System Framework

#artificialintelligence

Intel Scalable System Framework (Intel SSF) reduces confusion given the wealth of new technologies now available to HPC customers, and offers guidance for the right mix of balanced and validated hardware and software technologies. Intel SSF incorporates a host of software and hardware technologies, including Intel Omni-Path Architecture (Intel OPA), Intel Optane SSDs built on 3D XPoint technology, and new Intel Silicon Photonics – plus it incorporates Intel's compute and storage products, including Intel Xeon processors, Intel Xeon Phi processors, and Intel Enterprise Edition for Lustre* software. Benchmarks show that a combination of Intel SSF technologies (Intel Xeon Phi and Intel OPA) provides significantly better scaling and performance when training deep learning neural networks than GPU-based products on well-known benchmarks such as AlexNet and GoogLeNet [1]. These and other deep-learning benchmarks can be viewed on the Intel machine learning portal. Intel Xeon Phi processors deliver superior neural network training performance using up to seventy-two (72) processing cores per processor, where each core contains two Intel AVX-512 vector processing units.
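A quick back-of-the-envelope check of what those core and vector-unit counts imply is sketched below; the 1.5 GHz clock frequency is an assumption made for illustration, since the article does not state one.

```python
# Back-of-the-envelope peak throughput implied by the core/vector-unit counts
# quoted above. The 1.5 GHz clock is an assumption, not a figure from the article.
cores = 72                 # processing cores per processor (from the text)
vpus_per_core = 2          # Intel AVX-512 vector units per core (from the text)
fp32_lanes = 512 // 32     # 16 single-precision lanes per AVX-512 unit
flops_per_fma = 2          # a fused multiply-add counts as two FLOPs
clock_ghz = 1.5            # assumed clock frequency

peak_tflops = cores * vpus_per_core * fp32_lanes * flops_per_fma * clock_ghz / 1000
print(f"~{peak_tflops:.1f} TFLOPS peak single precision")  # roughly 6.9
```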


Intel to Acquire AI Startup Nervana Systems

#artificialintelligence

San Diego, California-based Nervana will help develop Intel's artificial intelligence portfolio and enhance the deep learning performance of Intel Xeon and Intel Xeon Phi processors, the company said in a blog post. Investors in Nervana include Global Playground, CME Ventures, Lux Capital, Allen & Co and AME Cloud Ventures. Allen & Co LLC is the exclusive financial adviser to Nervana in the deal. "Success in this space requires continued innovation to deliver an optimized, scalable platform providing the highest performance at lowest total cost of ownership… I'm excited to announce that Intel signed a definitive agreement to acquire Nervana Systems, a recognized leader in deep learning. Founded in 2014, Nervana has a fully-optimized software and hardware stack for deep learning. Their IP and expertise in accelerating deep learning algorithms will expand Intel's capabilities in the field of AI. We will apply Nervana's software expertise to further optimize the Intel Math Kernel Library and its integration into industry standard frameworks. Nervana's Engine and silicon expertise will advance Intel's AI portfolio and enhance the deep learning performance and TCO of our Intel Xeon and Intel Xeon Phi processors. We will share more about artificial intelligence and the amazing experiences it enables at our Intel Developer Forum next week."