
Collaborating Authors: Vedder, Kyle


ZeroFlow: Scalable Scene Flow via Distillation

arXiv.org Artificial Intelligence

Scene flow estimation is the task of describing the 3D motion field between temporally successive point clouds. State-of-the-art methods use strong priors and test-time optimization techniques, but require on the order of tens of seconds to process full-size point clouds, making them unusable as computer vision primitives for real-time applications such as open world object detection. Feedforward methods are considerably faster, running on the order of tens to hundreds of milliseconds for full-size point clouds, but require expensive human supervision. To address both limitations, we propose Scene Flow via Distillation, a simple, scalable distillation framework that uses a label-free optimization method to produce pseudo-labels to supervise a feedforward model. Our instantiation of this framework, ZeroFlow, achieves state-of-the-art performance on the Argoverse 2 Self-Supervised Scene Flow Challenge while using zero human labels by simply training on large-scale, diverse unlabeled data. At test-time, ZeroFlow is over 1000x faster than label-free state-of-the-art optimization-based methods on full-size point clouds (34 FPS vs 0.028 FPS) and over 1000x cheaper to train on unlabeled data compared to the cost of human annotation (\$394 vs ~\$750,000). To facilitate further research, we will release our code, trained model weights, and high quality pseudo-labels for the Argoverse 2 and Waymo Open datasets.
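The distillation framework described above (a slow, label-free optimization method produces pseudo-labels that supervise a fast feedforward model) can be sketched as follows. This is a minimal illustrative example, not ZeroFlow's actual pipeline: the nearest-neighbor "teacher" stands in for an expensive test-time optimizer, and the linear "student" stands in for a feedforward scene flow network.

```python
import numpy as np

def teacher_pseudo_flow(pc_t, pc_t1):
    """Label-free 'teacher' (stand-in for slow test-time optimization):
    for each point at time t, use the offset to its nearest neighbor at
    time t+1 as a pseudo scene-flow label."""
    dists = np.linalg.norm(pc_t[:, None, :] - pc_t1[None, :, :], axis=-1)
    return pc_t1[dists.argmin(axis=1)] - pc_t

def train_student(point_cloud_pairs, lr=0.1, epochs=500):
    """Feedforward 'student' (stand-in for a scene flow network): a linear
    map from point coordinates to flow vectors, trained purely on teacher
    pseudo-labels -- no human labels in the loop."""
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(3, 3))
    b = np.zeros(3)
    for _ in range(epochs):
        for pc_t, pc_t1 in point_cloud_pairs:
            target = teacher_pseudo_flow(pc_t, pc_t1)  # pseudo-labels
            pred = pc_t @ W + b                        # fast forward pass
            err = pred - target
            # Gradient descent on mean squared flow error.
            W -= lr * pc_t.T @ err / len(pc_t)
            b -= lr * err.mean(axis=0)
    return W, b
```

The key property the abstract exploits is that the teacher is only needed offline during training; at test time only the cheap student runs, which is where the ~1000x speedup comes from.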


A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems

arXiv.org Artificial Intelligence

Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research that are assessed independently, without regard to the impact of each separate capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development - both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future.
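To make the idea of a domain-agnostic metric suite concrete, here is a small illustrative computation over a task-performance matrix, using two standard continual-learning quantities (average final performance and backward transfer). These particular formulas are common in the continual-learning literature and are an assumption for illustration; they are not necessarily the exact metrics proposed in the paper.

```python
def lifelong_metrics(R):
    """Illustrative lifelong-learning metrics from a performance matrix R,
    where R[i][j] is performance on task j after training on tasks 0..i.
    Returns (average performance after the final task, backward transfer).
    Negative backward transfer indicates forgetting of earlier tasks."""
    T = len(R)
    avg_final = sum(R[T - 1][j] for j in range(T)) / T
    # Backward transfer: how performance on each earlier task changed
    # between when it was learned and the end of the task sequence.
    bwt = sum(R[T - 1][j] - R[j][j] for j in range(T - 1)) / (T - 1)
    return avg_final, bwt
```

A matrix like this is one way an evaluation framework can stay agnostic to the system's internals: it only needs per-task performance snapshots, regardless of how the system learns.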


Sparse PointPillars: Maintaining and Exploiting Input Sparsity to Improve Runtime on Embedded Systems

arXiv.org Artificial Intelligence

Abstract -- Bird's Eye View (BEV) is a popular representation ... Our main contributions include: 1) A new pipeline that maintains and exploits representational sparsity ... the same power budget, or modest runtime speedups for a significantly smaller power budget, all in exchange for a modest decrease in detection quality. ... 3) A general design approach centered around representational sparsity for efficient embedded system pipelines.

I. In the autonomous vehicle space, high-end desktop GPUs and CPUs are often available on-board, but this hardware still faces power and cost limits and must be ... This challenge is even more pronounced for intelligent mobile ... GPUs and high-end CPUs in order to run its control stack. ... the problem of developing machine learning models that have significantly reduced resource usage compared to existing models while preserving their performance -- models need to be shrunk not just to fit on smaller devices, but to fit while sharing these resources with other components. One general-purpose solution to this problem, model quantization [5]-[9], first trains models in a standard fashion using floating point weights and then, after training, converts some [5] or all [6], [7] weights into integer [8] or binary [9] quantized values that ...
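The quantization approach mentioned above (train in floating point, then convert weights to integer values after training) can be sketched as a minimal post-training step. This is an illustrative assumption, not the cited methods' exact schemes: it uses a simple symmetric per-tensor int8 scale, whereas the referenced works differ in which weights they quantize and how.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric post-training quantization sketch: map float32 weights
    to int8 with a single per-tensor scale factor."""
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for inference-time use."""
    return q.astype(np.float32) * scale
```

The trade-off this illustrates is the same one the passage describes: integer weights cut memory and compute cost (helping the model fit an embedded power budget) at the price of a small approximation error in the weights, and hence a modest drop in detection quality.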