Goto

Three ways to fix DRAM's latency problem

ZDNet

In a brilliant PhD thesis, Understanding and Improving the Latency of DRAM-Based Memory Systems, Kevin K. Chang of CMU tackles DRAM's latency problem and suggests some novel architectural enhancements that substantially improve DRAM latency.


What is Edge Computing?

#artificialintelligence

Edge computing combines two ideas: the "edge" of the network and the computation performed there. Unlike cloud computing, edge computing collects and processes data near the devices that generate it. In other words, it is a networking model in which data sources/servers and data processing are brought closer together to reduce latency and bandwidth problems and to increase an application's capacity. In cloud computing, by contrast, the data source can be located thousands of kilometers from the machine that processes it. Under edge computing, the data server is deployed locally, data collection and processing happen locally, and only the required data is sent to the remote cloud.
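A minimal sketch of the pattern described above: process raw readings locally and forward only a compact summary to the cloud. The endpoint URL, field names, and sample values are hypothetical illustrations, not taken from the article.

```python
# Sketch of edge-side pre-processing: keep raw samples local, ship only a summary.
# The cloud URL and payload fields below are assumed for illustration.
import json
import statistics
import urllib.request


def summarize_readings(readings: list[float]) -> dict:
    """Reduce a batch of raw sensor readings to the few fields the cloud needs."""
    return {
        "count": len(readings),
        "mean": statistics.mean(readings),
        "max": max(readings),
    }


def send_to_cloud(summary: dict, url: str = "https://cloud.example.com/ingest") -> None:
    """POST the local summary instead of shipping every raw sample upstream."""
    request = urllib.request.Request(
        url,
        data=json.dumps(summary).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        response.read()


if __name__ == "__main__":
    raw_batch = [21.4, 21.9, 22.1, 35.0, 21.7]  # e.g., temperature samples gathered at the edge
    summary = summarize_readings(raw_batch)
    print(summary)  # only this small payload would be forwarded via send_to_cloud(summary)
```

The point of the sketch is the asymmetry: the edge node sees every sample, but the wide-area link only carries the aggregate, which is what reduces bandwidth and latency pressure on the cloud.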


The Cloud-Era Of Computing Is Just About Over, So What's Next?

#artificialintelligence

Most technology startups founded in the past decade are either building software for the cloud or running their applications in the cloud. It has become so trendy that many venture investors will simply not invest in any "tech" that is not "cloud-based." With the cloud as popular as ever, it may seem like heresy to say that it is all about to change, but that is exactly what happens with technology and innovation. As soon as you start to get comfortable, everything changes. I believe the next era of computing will shift the focus to what is known as "edge computing," which, in many ways, is the anti-cloud.


Over-configuring - and how to fix it

ZDNet

Enterprises spend untold hours justifying IT spend - capital and operations - on hardware, software, and cloud projects. It's a major expense, but an under-examined one, for a very simple reason. Users hate it when apps slow down or, worse, crash, and their complaints make handy excuses for business units. So, from the earliest days, IT's incentive has been to over-configure the infrastructure with massive headroom to handle demand spikes. That was easy when IT could reasonably say, "since it takes 6-9 months to stand up new servers and storage, we need that much runway to keep the business running."


LC-NAS: Latency Constrained Neural Architecture Search for Point Cloud Networks

arXiv.org Artificial Intelligence

Point cloud architecture design has become a crucial problem for 3D deep learning. Several efforts exist to manually design architectures with high accuracy in point cloud tasks such as classification, segmentation, and detection. Recent progress in automatic Neural Architecture Search (NAS) minimizes the human effort in network design and optimizes high-performing architectures. However, these efforts fail to consider important factors such as latency during inference. Latency is of high importance in time-critical applications like self-driving cars, robot navigation, and mobile applications, which are generally bound by the available hardware. In this paper, we introduce a new NAS framework, dubbed LC-NAS, where we search for point cloud architectures that are constrained to a target latency. We implement a novel latency constraint formulation to trade off accuracy against latency in our architecture search. Contrary to previous works, our latency loss guarantees that the final network achieves latency under a specified target value. This is crucial when the end task is to be deployed in a limited hardware setting. Extensive experiments show that LC-NAS is able to find state-of-the-art architectures for point cloud classification in ModelNet40 with minimal computational cost. We also show how our searched architectures achieve any desired latency with a reasonably low drop in accuracy. Finally, we show how our searched architectures easily transfer to a different task, part segmentation on PartNet, where we achieve state-of-the-art results while lowering latency by a factor of 10.
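The abstract does not spell out the exact latency constraint formulation, but a common way to fold a latency target into a differentiable search objective is a hinge-style penalty on a predicted latency. The sketch below is an assumed illustration of that idea, not the paper's method; the weight lambda_lat and the predictor interface are hypothetical.

```python
# Hedged sketch of a latency-constrained search objective (not the LC-NAS formulation).
# The penalty is zero while the candidate's predicted latency stays under the
# target and grows quadratically once it exceeds the budget.
import torch


def latency_constrained_loss(
    task_loss: torch.Tensor,          # e.g., classification loss of the sampled architecture
    predicted_latency: torch.Tensor,  # differentiable latency estimate for that architecture
    target_latency: float,            # hardware budget, e.g., milliseconds per inference
    lambda_lat: float = 1.0,          # assumed trade-off weight between accuracy and latency
) -> torch.Tensor:
    """Penalize architectures only when they exceed the latency target."""
    overshoot = torch.clamp(predicted_latency - target_latency, min=0.0)
    return task_loss + lambda_lat * overshoot ** 2


# Example: a candidate predicted at 12 ms against a 10 ms budget is penalized,
# while one at 9 ms incurs no extra loss.
loss = latency_constrained_loss(
    task_loss=torch.tensor(0.8),
    predicted_latency=torch.tensor(12.0),
    target_latency=10.0,
)
print(loss)
```

Under this kind of formulation, architectures under the budget compete purely on task accuracy, which matches the abstract's claim that the constraint steers the search toward networks that meet a specified latency target with minimal accuracy loss.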