Jiang, Junchen
GRACE: Loss-Resilient Real-Time Video through Neural Codecs
Cheng, Yihua, Zhang, Ziyi, Li, Hanchen, Arapin, Anton, Zhang, Yue, Zhang, Qizheng, Liu, Yuhan, Zhang, Xu, Yan, Francis Y., Mazumdar, Amrita, Feamster, Nick, Jiang, Junchen
In real-time video communication, retransmitting lost packets over high-latency networks is not viable due to strict latency requirements. To counter packet losses without retransmission, two primary strategies are employed -- encoder-based forward error correction (FEC) and decoder-based error concealment. The former encodes data with redundancy before transmission, yet determining the optimal redundancy level in advance proves challenging. The latter reconstructs video from partially received frames, but dividing a frame into independently coded partitions inherently compromises compression efficiency, and the lost information cannot be effectively recovered by the decoder without adapting the encoder. We present a loss-resilient real-time video system called GRACE, which preserves the user's quality of experience (QoE) across a wide range of packet losses through a new neural video codec. Central to GRACE's enhanced loss resilience is its joint training of the neural encoder and decoder under a spectrum of simulated packet losses. In lossless scenarios, GRACE achieves video quality on par with conventional codecs (e.g., H.265). As the loss rate escalates, GRACE exhibits a more graceful, less pronounced decline in quality, consistently outperforming other loss-resilient schemes. Through extensive evaluation on various videos and real network traces, we demonstrate that GRACE reduces undecodable frames by 95% and stall duration by 90% compared with FEC, while markedly boosting video quality over error concealment methods. In a user study with 240 crowdsourced participants and 960 subjective ratings, GRACE registers a 38% higher mean opinion score (MOS) than other baselines.
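The idea of training under simulated packet losses can be illustrated with a minimal sketch. It assumes, purely for illustration, that the encoder output is a flat list of floats split into fixed-size packets, and that a lost packet is represented by zeroing its values. The function name `simulate_packet_loss` and the `packet_size` parameter are hypothetical and are not taken from GRACE.

```python
import random


def simulate_packet_loss(latent, loss_rate, packet_size=4, rng=None):
    """Return a copy of `latent` with randomly chosen packets zeroed out.

    Assumption (not from the paper): each consecutive run of `packet_size`
    values travels in one packet, and a lost packet is seen by the decoder
    as zeros.
    """
    rng = rng or random.Random()
    out = list(latent)
    for start in range(0, len(out), packet_size):
        # Drop each packet independently with probability `loss_rate`.
        if rng.random() < loss_rate:
            for i in range(start, min(start + packet_size, len(out))):
                out[i] = 0.0
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    latent = [float(i + 1) for i in range(12)]
    # During joint training, the decoder would be fed `damaged`
    # instead of `latent`, with the loss rate varied across batches.
    damaged = simulate_packet_loss(latent, loss_rate=0.5, rng=rng)
    print(damaged)
```

In a joint training loop, a masking step of this kind would sit between the encoder and the decoder so that both learn to tolerate missing data. The loop itself is omitted here.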
OneAdapt: Fast Adaptation for Deep Learning Applications via Backpropagation
Du, Kuntai, Liu, Yuhan, Hao, Yitian, Zhang, Qizheng, Wang, Haodong, Huang, Yuyang, Ananthanarayanan, Ganesh, Jiang, Junchen
Deep learning inference on streaming media data, such as object detection in video or LiDAR feeds and text extraction from audio waves, is now ubiquitous. To achieve high inference accuracy, these applications typically require significant network bandwidth to gather high-fidelity data and extensive GPU resources to run deep neural networks (DNNs). While the high demand for network bandwidth and GPU resources could be substantially reduced by optimally adapting the configuration knobs, such as video resolution and frame rate, current adaptation techniques fail to meet three requirements simultaneously: adapting configurations (i) with minimal extra GPU or bandwidth overhead, (ii) with near-optimal decisions based on how the data affects the final DNN's accuracy, and (iii) across a range of configuration knobs. This paper presents OneAdapt, which meets these requirements by leveraging a gradient-ascent strategy to adapt configuration knobs. The key idea is to embrace DNNs' differentiability to quickly estimate the gradient of accuracy with respect to each configuration knob, called AccGrad. Specifically, OneAdapt estimates AccGrad by multiplying two gradients: InputGrad (i.e., how each configuration knob affects the input to the DNN) and DNNGrad (i.e., how the DNN input affects the DNN inference output). We evaluate OneAdapt across five types of configurations, four analytic tasks, and five types of input data. Compared to state-of-the-art adaptation schemes, OneAdapt cuts bandwidth and GPU usage by 15-59% while maintaining comparable accuracy, or improves accuracy by 1-5% while using equal or fewer resources.
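The product of InputGrad and DNNGrad is an application of the chain rule, which a toy sketch can make concrete. Everything below is hypothetical: `dnn_input` stands in for how a single knob shapes the DNN input, `accuracy` is a made-up differentiable proxy, and the resource cost that a real system would also weigh is left out.

```python
import math


def dnn_input(knob):
    # Hypothetical mapping from one configuration knob to DNN input features.
    return [knob, knob ** 2, math.sin(knob)]


def accuracy(x):
    # Hypothetical differentiable accuracy proxy (highest when all x_i == 1).
    return 1.0 - sum((xi - 1.0) ** 2 for xi in x) / len(x)


def dnn_grad(x):
    # DNNGrad: gradient of the accuracy proxy with respect to the DNN input.
    return [-2.0 * (xi - 1.0) / len(x) for xi in x]


def input_grad(knob, eps=1e-6):
    # InputGrad: central finite difference of the DNN input w.r.t. the knob.
    hi, lo = dnn_input(knob + eps), dnn_input(knob - eps)
    return [(h - l) / (2.0 * eps) for h, l in zip(hi, lo)]


def acc_grad(knob):
    # AccGrad: chain rule, sum over input dimensions of DNNGrad * InputGrad.
    x = dnn_input(knob)
    return sum(d * g for d, g in zip(dnn_grad(x), input_grad(knob)))


def adapt(knob, lr=0.1, steps=1):
    # Gradient ascent on the knob, clamped to an assumed valid range [0, 1].
    for _ in range(steps):
        knob = min(1.0, max(0.0, knob + lr * acc_grad(knob)))
    return knob


if __name__ == "__main__":
    k = 0.5
    print("AccGrad:", acc_grad(k))
    print("knob after one step:", adapt(k))
```

A quick sanity check is to compare `acc_grad(k)` with a direct finite difference of `accuracy(dnn_input(k))`; the two should agree closely for this toy model.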
Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers
Bhardwaj, Romil, Xia, Zhengxu, Ananthanarayanan, Ganesh, Jiang, Junchen, Karianakis, Nikolaos, Shu, Yuanchao, Hsieh, Kevin, Bahl, Victor, Stoica, Ion
Video analytics applications analyze videos on edge compute servers for bandwidth and privacy reasons. Compressed models that are deployed on the edge servers for inference suffer from data drift, where the live video data diverges from the training data. Continuous learning handles data drift by periodically retraining the models on new data. Our work addresses the challenge of jointly supporting inference and retraining tasks on edge servers, which requires navigating the fundamental tradeoff between the retrained model's accuracy and the inference accuracy. Our solution, Ekya, balances this tradeoff across multiple models and uses a micro-profiler to identify the models that will benefit the most from retraining. Ekya's accuracy gain compared to a baseline scheduler is 29% higher, and the baseline requires 4x more GPU resources to achieve the same accuracy as Ekya.
Addressing Training Bias via Automated Image Annotation
Xiao, Zhujun, Zhu, Yanzi, Chen, Yuxin, Zhao, Ben Y., Jiang, Junchen, Zheng, Haitao
Building accurate DNN models requires training on large, labeled, context-specific datasets, especially those matching the target scenario. We believe advances in wireless localization, working in unison with cameras, can produce automated annotation of targets on images and videos captured in the wild. Using pedestrian and vehicle detection as examples, we demonstrate the feasibility, benefits, and challenges of an automatic image annotation system. Our work calls for new technical development on passive localization, mobile data analytics, and error-resilient ML models, as well as attention to design issues in user privacy policies.