Wang, Guan
ModEL: A Modularized End-to-end Reinforcement Learning Framework for Autonomous Driving
Wang, Guan, Niu, Haoyi, Zhu, Desheng, Hu, Jianming, Zhan, Xianyuan, Zhou, Guyue
Heated debates continue over the best autonomous driving framework. The classic modular pipeline is widely adopted in industry owing to its great interpretability and stability, whereas the end-to-end paradigm has demonstrated considerable simplicity and learnability with the rise of deep learning. We introduce a new modularized end-to-end reinforcement learning framework (ModEL) for autonomous driving, which combines the merits of both approaches. The autonomous driving stack of ModEL is decomposed into perception, planning, and control modules, leveraging scene understanding, end-to-end reinforcement learning, and PID control respectively. Furthermore, we build a fully functional autonomous vehicle to deploy this framework. Through extensive simulation and real-world experiments, our framework shows strong generalizability to various complicated scenarios and outperforms the competing baselines.
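The control module of ModEL relies on classic PID control. As a minimal sketch of that idea (gains, time step, and the speed-tracking scenario below are illustrative assumptions, not the authors' implementation):

```python
class PID:
    """Minimal discrete PID controller (illustrative; gains are hypothetical)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# e.g. track a target speed of 10 m/s starting from 8 m/s
pid = PID(kp=1.0, ki=0.1, kd=0.05, dt=0.1)
throttle = pid.step(setpoint=10.0, measurement=8.0)
```

In a pipeline like ModEL's, the planner would emit the setpoint and the controller would close the loop against vehicle measurements at each time step.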
A Deep Learning Inference Scheme Based on Pipelined Matrix Multiplication Acceleration Design and Non-uniform Quantization
Zhang, Yuyang, Leung, Dik Hin, Guo, Min, Xiao, Yijia, Liu, Haoyue, Li, Yunfei, Zhang, Jiyuan, Wang, Guan, Chen, Zhen
Matrix multiplication is the bedrock of Deep Learning inference applications. When it comes to hardware acceleration on edge computing devices, matrix multiplication often takes up the great majority of the time. To achieve better performance in edge computing, we introduce a low-power Multi-Layer Perceptron (MLP) accelerator based on a pipelined matrix multiplication scheme and a non-uniform quantization methodology. The implementation runs on Field-Programmable Gate Array (FPGA) devices, and we tested its performance on handwritten digit classification and Q-learning tasks. Results show that our method achieves better performance with lower power consumption.
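Non-uniform quantization assigns more codebook levels where weights are dense. The abstract does not specify the exact scheme, so the logarithmic codebook below is purely an assumed illustration of the general idea:

```python
import numpy as np


def nonuniform_quantize(weights, n_levels=16):
    """Quantize weights to a non-uniform (logarithmically spaced) codebook.

    The codebook choice here is an assumption for illustration: dense
    small-magnitude levels suit the typical bell-shaped weight
    distribution of trained MLPs.
    """
    max_abs = np.max(np.abs(weights))
    # log-spaced positive levels, mirrored to negatives, plus zero
    pos = max_abs * np.logspace(-3, 0, n_levels // 2)
    codebook = np.concatenate([-pos[::-1], [0.0], pos])
    # map each weight to its nearest codebook entry
    idx = np.abs(weights[..., None] - codebook).argmin(axis=-1)
    return codebook[idx]


w = np.array([0.9, -0.05, 0.001, 0.0])
wq = nonuniform_quantize(w)
```

On hardware, the quantized weights would be stored as small codebook indices, shrinking memory traffic for the pipelined multiplier.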
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising
Hao, Xiaotian, Peng, Zhaoqing, Ma, Yi, Wang, Guan, Jin, Junqi, Hao, Jianye, Chen, Shan, Bai, Rongquan, Xie, Mingzhou, Xu, Miao, Zheng, Zhenzhe, Yu, Chuan, Li, Han, Xu, Jian, Gai, Kun
In E-commerce, advertising is essential for merchants to reach their target users. The typical objective is to maximize the advertiser's cumulative revenue over a period of time under a budget constraint. In real applications, an advertisement (ad) usually needs to be exposed to the same user multiple times until the user finally contributes revenue (e.g., places an order). However, existing advertising systems mainly focus on the immediate revenue of single ad exposures, ignoring the contribution of each exposure to the final conversion, and thus usually fall into suboptimal solutions. In this paper, we formulate sequential advertising strategy optimization as a dynamic knapsack problem. We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization problem while ensuring solution quality. To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach. Extensive offline and online experiments show the superior performance of our approaches over state-of-the-art baselines in terms of cumulative revenue.
DCAF: A Dynamic Computation Allocation Framework for Online Serving System
Jiang, Biye, Zhang, Pengye, Chen, Rihan, Dai, Binding, Luo, Xinchen, Yang, Yin, Wang, Guan, Zhou, Guorui, Zhu, Xiaoqiang, Gai, Kun
Modern large-scale systems such as recommender systems and online advertising systems are built upon computation-intensive infrastructure. The typical objective in these applications is to maximize the total revenue, e.g. GMV (Gross Merchandise Volume), under a limited computation resource. Usually, the online serving system follows a multi-stage cascade architecture, which consists of several stages including retrieval, pre-ranking, ranking, etc. Resources are usually allocated to these stages manually under specific computing-power budgets, which requires the serving configuration to be adapted accordingly. As a result, the existing system easily falls into suboptimal solutions with respect to maximizing the total revenue. The limitation is due to the fact that, although the value of traffic requests varies greatly, the online serving system still spends equal computing power on each of them. In this paper, we introduce a novel idea: the online serving system could treat each traffic request differently and allocate "personalized" computation resources based on its value. We formulate this resource allocation problem as a knapsack problem and propose a Dynamic Computation Allocation Framework (DCAF). Under some general assumptions, DCAF can theoretically guarantee that the system maximizes the total revenue within a given computation budget. DCAF brings significant improvement and has been deployed in the display advertising system of Taobao for serving the main traffic. With DCAF, we are able to maintain the same business performance with a 20% reduction in computation resources.
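The knapsack view of computation allocation can be sketched greedily: each request offers candidate actions with an expected value and a compute cost, and the system picks actions by marginal value per unit cost until the budget is spent. This is an illustrative simplification, not the deployed DCAF solver (which works from a Lagrangian formulation of the same knapsack problem):

```python
def allocate(requests, budget):
    """Greedy knapsack sketch of per-request computation allocation.

    `requests` is a list where entry j holds candidate actions for
    request j as (value, cost) pairs. Actions are chosen by value/cost
    ratio under a global computation budget. Illustrative only.
    """
    # flatten all (request, action) candidates
    candidates = []
    for j, actions in enumerate(requests):
        for value, cost in actions:
            candidates.append((value / cost, value, cost, j))
    candidates.sort(reverse=True)

    chosen, spent = {}, 0.0
    for ratio, value, cost, j in candidates:
        if j not in chosen and spent + cost <= budget:
            chosen[j] = (value, cost)
            spent += cost
    return chosen, spent


# two requests, each with a cheap and an expensive candidate action
chosen, spent = allocate([[(5, 1), (8, 4)], [(2, 1), (3, 2)]], budget=3)
```

The key departure from a fixed-budget cascade is that high-value requests can win costlier actions while low-value ones fall back to cheap ones.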
Measure Contribution of Participants in Federated Learning
Wang, Guan, Dang, Charlie Xiaoqian, Zhou, Ziye
Federated Machine Learning (FML) creates an ecosystem for multiple parties to collaborate on building models while protecting data privacy for the participants. A measure of the contribution of each party in FML enables fair credit allocation. In this paper we develop simple but powerful techniques to fairly calculate the contributions of multiple parties in FML, in the context of both horizontal FML and vertical FML. For horizontal FML we use the deletion method to calculate grouped instance influence. For vertical FML we use Shapley values to calculate grouped feature importance. Our methods open the door for research in model contribution and credit allocation in the context of federated machine learning.
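The deletion method measures a party's grouped instance influence as the drop in model quality when that party's data is removed and the model is retrained. A self-contained sketch, where `train` and `evaluate` are placeholder stand-ins for the actual FML pipeline (the toy mean-estimator "model" below is purely illustrative):

```python
def deletion_contribution(parties, train, evaluate):
    """Deletion method sketch: a party's contribution is the drop in
    model quality when its instances are removed from training."""
    full_score = evaluate(train([x for p in parties for x in p]))
    contributions = []
    for i in range(len(parties)):
        rest = [x for j, p in enumerate(parties) if j != i for x in p]
        contributions.append(full_score - evaluate(train(rest)))
    return contributions


# toy example: the "model" is the sample mean, quality is negative
# squared error against a target of 1.0
target = 1.0
train = lambda data: sum(data) / len(data)
evaluate = lambda mean: -(mean - target) ** 2
parties = [[1.0, 1.0], [0.0, 0.0]]
scores = deletion_contribution(parties, train, evaluate)
```

Here the first party's data pulls the model toward the target (positive contribution), while the second party's data pulls it away (negative contribution).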
Cooperative Lane Changing via Deep Reinforcement Learning
Wang, Guan, Hu, Jianming, Li, Zhiheng, Li, Li
In this paper, we study how to learn an appropriate lane changing strategy for autonomous vehicles by using deep reinforcement learning. We show that the reward of the system should consider the overall traffic efficiency instead of the travel efficiency of an individual vehicle. In summary, cooperation leads to a more harmonious and efficient traffic system than competition.
Interpret Federated Learning with Shapley Values
Wang, Guan
Federated Learning is introduced to protect privacy by distributing training data across multiple parties. Each party trains its own model, and a meta-model is constructed from the sub-models. In this way, the details of the data are not disclosed between the parties. In this paper we investigate model interpretation methods for Federated Learning, specifically the measurement of feature importance in vertical Federated Learning, where the feature space of the data is divided between two parties, namely host and guest. When the host party interprets a single prediction of a vertical Federated Learning model, the interpretation results, namely the feature importance, are very likely to reveal the protected data of the guest party. We propose a method to balance model interpretability and data privacy in vertical Federated Learning by using Shapley values to reveal detailed feature importance for host features and a unified importance value for the federated guest features. Our experiments indicate robust and informative results for interpreting Federated Learning models.
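The privacy-preserving trick above can be sketched with exact Shapley values over a small player set: the host features are individual players, while all guest features collapse into a single combined pseudo-player, so only one unified importance value is exposed for the guest side. The feature names and the additive toy game below are hypothetical, and `value_fn` stands in for model performance on a feature subset:

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value_fn):
    """Exact Shapley values over a small player set.

    In the vertical-FL setting, players would be the host features plus
    ONE combined pseudo-player for all guest features, keeping guest-side
    detail private. `value_fn(coalition)` is a placeholder for model
    performance using only those features.
    """
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            for coalition in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                marginal = value_fn(set(coalition) | {p}) - value_fn(set(coalition))
                phi[p] += weight * marginal
    return phi


# toy additive game: each player contributes a fixed amount of value
worth = {"host_age": 2.0, "host_income": 1.0, "guest_combined": 3.0}
value_fn = lambda coalition: sum(worth[p] for p in coalition)
phi = shapley_values(list(worth), value_fn)
```

In an additive game like this toy one, each player's Shapley value equals its individual worth; real feature interactions would spread credit across coalitions while still yielding only one number for `guest_combined`.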