Liang, Ruofan
INRet: A General Framework for Accurate Retrieval of INRs for Shapes
Guan, Yushi, Kwan, Daniel, Liang, Ruofan, Panneer, Selvakumar, Jain, Nilesh, Ahuja, Nilesh, Vijaykumar, Nandita
Implicit neural representations (INRs) have become an important method for encoding various data types, such as 3D objects or scenes, images, and videos. They have proven to be particularly effective at representing 3D content, e.g., 3D scene reconstruction from 2D images, novel 3D content creation, as well as the representation, interpolation, and completion of 3D shapes. With the widespread generation of 3D data in an INR format, there is a need to support effective organization and retrieval of INRs saved in a data store. A key aspect of retrieval and clustering of INRs in a data store is the formulation of similarity between INRs that would, for example, enable retrieval of similar INRs using a query INR. In this work, we propose INRet, a method for determining similarity between INRs that represent shapes, thus enabling accurate retrieval of similar shape INRs from an INR data store. INRet flexibly supports different INR architectures such as INRs with octree grids, triplanes, and hash grids, as well as different implicit functions including signed/unsigned distance functions and occupancy fields. We demonstrate that our method is more general and accurate than the existing INR retrieval method, which only supports simple MLP INRs and requires the same architecture between the query and stored INRs. Furthermore, compared to converting INRs to other representations (e.g., point clouds or multi-view images) for 3D shape retrieval, INRet achieves higher accuracy while avoiding the conversion overhead.
DISORF: A Distributed Online NeRF Training and Rendering Framework for Mobile Robots
Li, Chunlin, Liang, Ruofan, Fan, Hanrui, Zhang, Zhengen, Durvasula, Sankeerth, Vijaykumar, Nandita
We present a framework, DISORF, to enable online 3D reconstruction and visualization of scenes captured by resource-constrained mobile robots and edge devices. To address the limited compute capabilities of edge devices and potentially limited network availability, we design a framework that efficiently distributes computation between the edge device and remote server. We leverage on-device SLAM systems to generate posed keyframes and transmit them to remote servers that can perform high quality 3D reconstruction and visualization at runtime by leveraging NeRF models. We identify a key challenge with online NeRF training where naive image sampling strategies can lead to significant degradation in rendering quality. We propose a novel shifted exponential frame sampling method that addresses this challenge for online NeRF training. We demonstrate the effectiveness of our framework in enabling high-quality real-time reconstruction and visualization of unknown scenes as they are captured and streamed from cameras in mobile robots and edge devices.
SPIDR: SDF-based Neural Point Fields for Illumination and Deformation
Liang, Ruofan, Zhang, Jiahao, Li, Haoda, Yang, Chen, Guan, Yushi, Vijaykumar, Nandita
Neural radiance fields (NeRFs) have recently emerged as a promising approach for 3D reconstruction and novel view synthesis. However, NeRF-based methods encode shape, reflectance, and illumination implicitly, which makes it challenging for users to manipulate these properties explicitly in the rendered images. Existing approaches only enable limited editing of the scene and deformation of the geometry. Furthermore, no existing work enables accurate scene illumination after object deformation. In this work, we introduce SPIDR, a new hybrid neural SDF representation. SPIDR combines point cloud and neural implicit representations to enable the reconstruction of higher quality object surfaces for geometry deformation and lighting estimation. To more accurately capture environment illumination for scene relighting, we propose a novel neural implicit model to learn environment light. To enable more accurate illumination updates after deformation, we use the shadow mapping technique to approximate the light visibility updates caused by geometry editing. We demonstrate the effectiveness of SPIDR in enabling high quality geometry editing with more accurate updates to the illumination of the scene.
ENVIDR: Implicit Differentiable Renderer with Neural Environment Lighting
Liang, Ruofan, Chen, Huiting, Li, Chunlin, Chen, Fan, Panneer, Selvakumar, Vijaykumar, Nandita
Recent advances in neural rendering have shown great potential for reconstructing scenes from multiview images. However, accurately representing objects with glossy surfaces remains a challenge for existing methods. In this work, we introduce ENVIDR, a rendering and modeling framework for high-quality rendering and reconstruction of surfaces with challenging specular reflections. To achieve this, we first propose a novel neural renderer with decomposed rendering components to learn the interaction between surface and environment lighting. This renderer is trained using existing physically based renderers and is decoupled from actual scene representations. We then propose an SDF-based neural surface model that leverages this learned neural renderer to represent general scenes. Our model additionally synthesizes indirect illumination caused by inter-reflections from shiny surfaces by marching surface-reflected rays. We demonstrate that our method outperforms state-of-the-art methods on challenging shiny scenes, providing high-quality rendering of specular reflections while also enabling material editing and scene relighting.
Operator Fusion in XLA: Analysis and Evaluation
Snider, Daniel, Liang, Ruofan
Machine learning (ML) compilers are an active area of research because they offer the potential to automatically speed up tensor programs. Kernel fusion is often cited as an important optimization performed by ML compilers. However, there exists a knowledge gap about how XLA, the most common ML compiler, applies this nuanced optimization, what kind of speedup it can afford, and what low-level effects it has on hardware. Our paper aims to bridge this knowledge gap by studying key compiler passes of XLA's source code. Our evaluation on Cartpole, a reinforcement learning environment, shows how different fusion decisions in XLA are made in practice. Furthermore, we implement several XLA kernel fusion strategies that can achieve up to 10.56x speedup compared to our baseline implementation.
A Simulation Platform for Multi-tenant Machine Learning Services on Thousands of GPUs
Liang, Ruofan, He, Bingsheng, Yan, Shengen, Sun, Peng
Multi-tenant machine learning services have emerged as data-intensive workloads in data centers with heavy usage of GPU resources. Due to the large scale, many tuning parameters, and heavy resource usage, it is usually impractical to evaluate and benchmark those machine learning services on real clusters. In this demonstration, we present AnalySIM, a cluster simulator that allows efficient design explorations for multi-tenant machine learning services. Specifically, by trace-driven cluster workload simulation, AnalySIM can easily test and analyze various scheduling policies against a number of performance metrics such as GPU resource utilization. AnalySIM simulates the cluster computational resource based on both physical topology and logical partition. The tool has been used at SenseTime to understand the impact of different scheduling policies with the trace from a real production cluster of over 1000 GPUs. We find that preemption and migration are able to significantly reduce average job completion time and mitigate the resource fragmentation problem.
Knowledge Isomorphism between Neural Networks
Liang, Ruofan, Li, Tianlin, Li, Longfei, Zhang, Quanshi
This paper aims to analyze knowledge isomorphism between pre-trained deep neural networks. We propose a generic definition for knowledge isomorphism between neural networks at different fuzziness levels, and design a task-agnostic and model-agnostic method to disentangle and quantify isomorphic features from intermediate layers of a neural network. As a generic tool, our method can be broadly used for different applications. In preliminary experiments, we have used knowledge isomorphism as a tool to diagnose feature representations of neural networks. Knowledge isomorphism provides new insights to explain the success of existing deep-learning techniques, such as knowledge distillation and network compression. More crucially, it has been shown that knowledge isomorphism can also be used to refine pre-trained networks and boost performance.