Pillai, Padmanabhan
Accelerating Deep Learning by Focusing on the Biggest Losers
Jiang, Angela H., Wong, Daniel L. -K., Zhou, Giulio, Andersen, David G., Dean, Jeffrey, Ganger, Gregory R., Joshi, Gauri, Kaminsky, Michael, Kozuch, Michael, Lipton, Zachary C., Pillai, Padmanabhan
This paper introduces Selective-Backprop, a technique that accelerates the training of deep neural networks (DNNs) by prioritizing examples with high loss at each iteration. Selective-Backprop uses the output of a training example's forward pass to decide whether to use that example to compute gradients and update parameters, or to skip immediately to the next example. By reducing the number of computationally expensive backpropagation steps performed, Selective-Backprop accelerates training. Evaluation on CIFAR10, CIFAR100, and SVHN, across a variety of modern image models, shows that Selective-Backprop converges to target error rates up to 3.5x faster than standard SGD and between 1.02x and 1.8x faster than a state-of-the-art importance sampling approach. A further 26% acceleration can be achieved by using stale forward-pass results for selection, thus also skipping the forward passes of low-priority examples.
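For intuition, the core selection rule can be sketched in a few lines of PyTorch. This is an illustrative simplification, not the paper's released implementation; the function name, the loss_history buffer, and the beta selectivity exponent are stand-ins for the paper's loss-percentile-based selection probability.

import torch
import torch.nn.functional as F

def selective_backprop_step(model, optimizer, inputs, targets,
                            loss_history, beta=1.0):
    """One training step that backpropagates only on high-loss examples.

    Sketch of the Selective-Backprop idea: an example's chance of being
    kept grows with its rank among recently observed losses. The
    loss_history list and beta exponent are illustrative parameters,
    not the paper's exact interface.
    """
    # Cheap forward pass to obtain per-example losses (no gradients yet).
    with torch.no_grad():
        losses = F.cross_entropy(model(inputs), targets, reduction='none')

    # Keep probability: the example's percentile among recent losses,
    # raised to the power beta (higher beta = more selective).
    history = (torch.tensor(loss_history, device=losses.device)
               if loss_history else losses)
    percentiles = (losses.unsqueeze(1) > history).float().mean(dim=1)
    keep = torch.rand_like(percentiles) < percentiles.pow(beta)
    loss_history.extend(losses.tolist())  # a real buffer would be capped

    if keep.any():
        # The expensive backward pass runs only on the selected examples.
        optimizer.zero_grad()
        selected_loss = F.cross_entropy(model(inputs[keep]), targets[keep])
        selected_loss.backward()
        optimizer.step()

The key point is that the backward pass, the dominant cost of training, is spent only on the examples that survive selection.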
Scheduling in Visual Fog Computing: NP-Completeness and Practical Efficient Solutions
Chu, Hong-Min (National Taiwan University) | Yang, Shao-Wen (Intel) | Pillai, Padmanabhan (Intel) | Chen, Yen-Kuang (Intel)
The visual fog paradigm envisions tens of thousands of heterogeneous, camera-enabled edge devices distributed across the Internet, providing live sensing for a myriad of visual processing applications. The scale, computational demands, and bandwidth needs of visual computing pipelines necessitate intelligent offloading to distributed computing infrastructure, including the cloud, Internet gateway devices, and the edge devices themselves. This paper focuses on the visual fog scheduling problem: assigning visual computing tasks to devices so as to optimize network utilization. We first prove this problem is NP-complete, and then formulate a practical, efficient solution. We demonstrate sub-minute computation time to optimally schedule 20,000 tasks across over 7,000 devices, and just 7 minutes to place 60,000 tasks across 20,000 devices, showing our approach is ready to meet the scale challenges introduced by visual fog.
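To make the placement objective concrete, here is a toy greedy heuristic in Python that assigns each task to the feasible device with the lowest network cost. This is emphatically not the paper's algorithm (the paper formulates and solves the problem optimally); the Device and Task fields and the cost model are hypothetical.

from dataclasses import dataclass, field

@dataclass
class Device:
    name: str
    cpu_capacity: float          # available compute units (hypothetical)
    used: float = 0.0

@dataclass
class Task:
    name: str
    cpu_demand: float            # compute units required (hypothetical)
    # Hypothetical per-device network cost (e.g., bytes moved if the
    # task is placed on that device), keyed by device name.
    net_cost: dict = field(default_factory=dict)

def greedy_schedule(tasks, devices):
    """Assign each task to the feasible device that adds the least
    network traffic. A toy heuristic illustrating the task-placement
    objective the paper studies, not the authors' method."""
    placement = {}
    # Place the most demanding tasks first to reduce fragmentation.
    for task in sorted(tasks, key=lambda t: -t.cpu_demand):
        feasible = [d for d in devices
                    if d.cpu_capacity - d.used >= task.cpu_demand]
        if not feasible:
            raise RuntimeError(f"no device can host {task.name}")
        best = min(feasible, key=lambda d: task.net_cost.get(d.name, 0.0))
        best.used += task.cpu_demand
        placement[task.name] = best.name
    return placement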
Automatic Tuning of Interactive Perception Applications
Zhu, Qian, Kveton, Branislav, Mummert, Lily, Pillai, Padmanabhan
Interactive applications incorporating high-data-rate sensing and computer vision are becoming possible due to novel runtime systems and the use of parallel computation resources. To allow interactive use, such applications require careful tuning of multiple application parameters to meet required fidelity and latency bounds. This is a nontrivial task, often requiring expert knowledge, and it becomes intractable as resources and application load characteristics change. This paper describes a method for automatic performance tuning that learns application characteristics and the effects of tunable parameters online, and constructs models that are used to maximize fidelity for a given latency constraint. The paper shows that accurate latency models can be learned online, that knowledge of application structure can be used to reduce the complexity of the learning task, and that operating points can be found that achieve 90% of the optimal fidelity while exploring the parameter space only 3% of the time.
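The shape of such an online learn-and-exploit loop can be caricatured as an epsilon-greedy tuner, sketched below in Python. The callables fidelity and measure_latency, and the simple running-average latency model, are hypothetical stand-ins for the paper's learned models; settings are assumed hashable (e.g., tuples of parameter values).

import random

def tune_online(candidate_settings, fidelity, measure_latency,
                latency_bound, steps=1000, explore_prob=0.03):
    """Epsilon-greedy sketch of online performance tuning: mostly run
    the best known setting that meets the latency bound, and try a
    random setting a small fraction of the time (3% here, echoing the
    paper's reported exploration budget)."""
    latency_estimates = {}  # setting -> running mean of observed latency

    def predicted(s):
        # Unseen settings are treated optimistically (zero latency),
        # so they get tried at least once during exploitation.
        return latency_estimates.get(s, 0.0)

    for _ in range(steps):
        if random.random() < explore_prob or not latency_estimates:
            setting = random.choice(candidate_settings)
        else:
            # Exploit: highest-fidelity setting predicted to meet the bound.
            ok = [s for s in candidate_settings
                  if predicted(s) <= latency_bound]
            setting = max(ok or candidate_settings, key=fidelity)
        observed = measure_latency(setting)  # run one frame and time it
        mean = latency_estimates.get(setting, observed)
        latency_estimates[setting] = 0.9 * mean + 0.1 * observed
    return latency_estimates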
Beyond Audio and Video: Using Claytronics to Enable Pario
Goldstein, Seth Copen (Carnegie Mellon University) | Mowry, Todd C. (Carnegie Mellon University) | Campbell, Jason D. (Intel Research Pittsburgh) | Ashley-Rollman, Michael P (Carnegie Mellon University) | Rosa, Michael De (Carnegie Mellon University) | Funiak, Stanislav (Carnegie Mellon University) | Hoburg, James F. (Carnegie Mellon University) | Karagozler, Mustafa E. (Carnegie Mellon University) | Kirby, Brian (Carnegie Mellon University) | Lee, Peter (Carnegie Mellon University) | Pillai, Padmanabhan (Carnegie Mellon University) | Reid, J. Robert (Hanscom Air Force Base) | Stancil, Daniel D. (Carnegie Mellon University) | Weller, Michael P. (Carnegie Mellon University)
In this article, we describe the hardware and software challenges involved in realizing Claytronics, a form of programmable matter made out of very large numbers (potentially millions) of submillimeter-sized spherical robots. The goal of the Claytronics project is to create ensembles of cooperating submillimeter robots that work together to form dynamic 3D physical objects. For example, Claytronics might be used in telepresence to mimic, with high fidelity and in 3D solid form, the look, feel, and motion of the person at the other end of a telephone call. To achieve this long-range vision, we are investigating hardware mechanisms for constructing submillimeter robots that can be manufactured en masse using photolithography. We also propose the creation of a new media type, which we call pario. The idea behind pario is to render arbitrary moving, physical 3D objects that you can see, touch, and even hold in your hands. In parallel with our hardware effort, we are developing two novel distributed programming languages, LDP and Meld, along with algorithms to control the ensembles. Pario may fundamentally change how we communicate with others and interact with the world around us. Our research results to date suggest that there is a viable path to implementing both the hardware and software necessary for Claytronics, and hence pario. While we have made significant progress, much research remains to turn this vision into reality.