Turing Award For Pixar, EfficientNet Lite Release And More:Top AI News

#artificialintelligence

Regardless of what is happening around the world, the AI community is a productive bunch, with something interesting to share almost every day. Here's what is new this week: the short history of deep learning has demonstrated the remarkable effectiveness of infinitely wide networks. Insights from these infinitely wide networks serve as a lens for studying deep learning. However, implementing infinite-width models efficiently and at scale requires significant engineering proficiency. To address these challenges and accelerate theoretical progress in deep learning, Google's AI team released Neural Tangents, a new open-source software library written in JAX.
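The kind of analytic computation such a library automates can be illustrated concretely. As a minimal sketch in plain NumPy (this is not the Neural Tangents API, and the function name is ours): the infinite-width covariance of a single-hidden-layer ReLU network has a known closed form, the degree-1 arc-cosine kernel, and Neural Tangents generalizes this sort of exact kernel computation to whole architectures.

```python
import numpy as np

def relu_nngp_kernel(x, y):
    """Closed-form infinite-width covariance E_w[relu(w.x) * relu(w.y)]
    for w ~ N(0, I): the degree-1 arc-cosine kernel. Illustrative sketch
    only -- Neural Tangents computes such kernels for deep architectures."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    cos_t = np.clip(x @ y / (nx * ny), -1.0, 1.0)
    theta = np.arccos(cos_t)
    return (nx * ny / (2.0 * np.pi)) * (np.sin(theta) + (np.pi - theta) * np.cos(theta))

# Sanity check: a very wide finite ReLU layer, sampled by Monte Carlo,
# should approach the closed-form value.
rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)
w = rng.normal(size=(200_000, 3))
mc_estimate = np.mean(np.maximum(w @ x, 0.0) * np.maximum(w @ y, 0.0))
exact = relu_nngp_kernel(x, y)
```

The point of the exact formula is that no finite network ever needs to be instantiated: the infinite-width limit is computed analytically, which is what makes these models usable as a theoretical lens.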


Real-Time Object Detection and Recognition on Low-Compute Humanoid Robots using Deep Learning

arXiv.org Machine Learning

We envision that in the near future, humanoid robots will share our home space and assist us in daily and routine activities through object manipulation. One of the fundamental capabilities robots need is to detect and recognize objects, so that they can manipulate them effectively and make real-time decisions involving them. In this paper, we describe a novel architecture that enables multiple low-compute NAO robots to perform real-time detection, recognition and localization of objects in their camera views and take programmable actions based on the detected objects. The proposed algorithm for object detection and localization is an empirical modification of YOLOv3, based on indoor experiments in multiple scenarios, with a smaller weight size and lower computational requirements. Quantizing the weights and re-adjusting filter sizes and layer arrangements for the convolutions improved inference time on low-resolution images from the robot's camera feed. YOLOv3 was chosen after a comparative study of bounding-box algorithms, with the objective of choosing one that best balances information retention, low inference time and high accuracy for real-time object detection and localization. The architecture also comprises an effective end-to-end pipeline that feeds real-time frames from the camera to the neural net and uses its results to guide the robot with customizable actions corresponding to the detected class labels.
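The weight-quantization step mentioned above can be sketched generically. The snippet below is an illustrative uniform affine 8-bit quantizer in NumPy, not the paper's actual implementation; the function name and details are assumptions of ours.

```python
import numpy as np

def fake_quantize(w, num_bits=8):
    """Uniform affine quantize-dequantize of a weight tensor.
    Illustrative sketch only -- not the scheme used in the paper.
    Maps floats onto 2**num_bits integer levels, then maps back."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (w.max() - w.min()) / (qmax - qmin)          # step size in weight units
    zero_point = int(round(-w.min() / scale))            # integer mapped to 0.0
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return (q.astype(np.float64) - zero_point) * scale   # dequantized weights
```

Storing 8-bit integers instead of float32 shrinks the weights roughly 4x and speeds up inference on low-compute hardware, at the cost of a bounded rounding error of at most about one quantization step per weight.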


DeepBbox: Accelerating Precise Ground Truth Generation for Autonomous Driving Datasets

arXiv.org Artificial Intelligence

Govind Rathore, Wan-Yi Lin and Ji Eun Kim. Abstract: Autonomous driving requires various computer vision algorithms, such as object detection and tracking. Precisely labeled datasets (i.e., objects fully contained in bounding boxes with only a few extra pixels) are preferred for training such algorithms, so that they can detect the exact locations of objects. However, it is very time-consuming, and hence expensive, to generate precise labels for image sequences at scale. In this paper, we propose DeepBbox, an algorithm that "corrects" loose object labels into tight bounding boxes to reduce human annotation effort. We use the Cityscapes [1] dataset to show the annotation efficiency and accuracy improvements achieved with DeepBbox. Experimental results show that, with DeepBbox, we can increase the number of object edges that are labeled automatically (within 1% error) by 50%, reducing manual annotation time.
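A standard way to quantify how loose a label is relative to a tight reference box is intersection-over-union (IoU). The helper below is a generic sketch for comparing two axis-aligned boxes, not code from the DeepBbox paper.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).
    Generic helper for measuring label tightness; not taken from
    the DeepBbox paper."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # overlap height
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

A loose label fully containing the object has IoU below 1.0 against the tight box; a label-correction method like DeepBbox aims to push that value toward 1.0 without human intervention.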


Leveraging Pre-Trained 3D Object Detection Models For Fast Ground Truth Generation

arXiv.org Artificial Intelligence

Training 3D object detectors for autonomous driving has been limited to small datasets due to the effort required to generate annotations. Reducing both task complexity and the amount of task switching done by annotators is key to reducing the effort and time required to generate 3D bounding box annotations. This paper introduces a novel ground truth generation method that combines human supervision with pretrained neural networks to generate per-instance 3D point cloud segmentation, 3D bounding boxes, and class annotations. Annotators provide object anchor clicks, which act as seeds to generate instance segmentation results in 3D. The points belonging to each instance are then used to regress object centroids, bounding box dimensions, and object orientation. Our proposed annotation scheme requires 30x less human annotation time. We use the KITTI 3D object detection dataset to evaluate the efficiency and quality of our annotation scheme. We also test the proposed scheme on previously unseen data from the Autonomoose self-driving vehicle to demonstrate the generalization capabilities of the network.
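The regression targets described above — centroid and box dimensions from an instance's points — can be illustrated with the simplest possible baseline: derive them geometrically from the segmented points. This is a toy sketch of ours, not the paper's network, which regresses these quantities (plus orientation) with learned models.

```python
import numpy as np

def box_from_points(points):
    """Toy baseline: centroid and axis-aligned box dimensions from an
    (N, 3) array of instance points. Illustrative only -- the paper uses
    a network to regress centroid, dimensions, and orientation."""
    mins, maxs = points.min(axis=0), points.max(axis=0)
    centroid = (mins + maxs) / 2.0   # center of the axis-aligned extents
    dims = maxs - mins               # extent along each axis
    return centroid, dims
```

Even this crude estimate shows why per-instance segmentation is a useful intermediate step: once the points belonging to an object are known, box parameters follow from the point geometry, and the learned regressor mainly has to handle noise, occlusion, and orientation.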