Chen, Han
FedCompass: Efficient Cross-Silo Federated Learning on Heterogeneous Client Devices using a Computing Power Aware Scheduler
Li, Zilinghan, Chaturvedi, Pranshu, He, Shilan, Chen, Han, Singh, Gagandeep, Kindratenko, Volodymyr, Huerta, E. A., Kim, Kibaek, Madduri, Ravi
Cross-silo federated learning offers a promising solution to collaboratively train robust and generalized AI models without compromising the privacy of local datasets, e.g., in healthcare, finance, and scientific projects that lack a centralized data facility. Nonetheless, because of the disparity of computing resources among different clients (i.e., device heterogeneity), synchronous federated learning algorithms suffer from degraded efficiency when waiting for straggler clients. Similarly, asynchronous federated learning algorithms experience degradation in the convergence rate and final model accuracy on non-identically and independently distributed (non-IID) heterogeneous datasets due to stale local models and client drift. To address these limitations in cross-silo federated learning with heterogeneous clients and data, we propose FedCompass, an innovative semi-asynchronous federated learning algorithm with a computing power aware scheduler on the server side, which adaptively assigns varying amounts of training tasks to different clients using the knowledge of the computing power of individual clients. FedCompass ensures that multiple locally trained models from clients are received almost simultaneously as a group for aggregation, effectively reducing the staleness of local models. At the same time, the overall training process remains asynchronous, eliminating prolonged waiting periods from straggler clients. Using diverse non-IID heterogeneous distributed datasets, we demonstrate that FedCompass achieves faster convergence and higher accuracy than other asynchronous algorithms while remaining more efficient than synchronous algorithms when performing federated learning on heterogeneous clients.
Federated learning (FL) is a collaborative model training approach where multiple clients train a global model under the orchestration of a central server (Konečný et al., 2016; McMahan et al., 2017; Yang et al., 2019; Kairouz et al., 2021).
FL typically runs two steps iteratively: (i) the server distributes the global model to clients, which train it using their local data; (ii) the server collects the locally trained models and updates the global model by aggregating them. Federated Averaging (FedAvg) (McMahan et al., 2017) is the most popular FL algorithm: each client trains a model using local data for Q local steps in each training round, after which the orchestration server aggregates all local models by weighted averaging and sends the updated global model back to all clients for the next round of training. By leveraging the training data from multiple clients without explicitly sharing it, FL empowers the training of more robust and generalized models while preserving the privacy of client data.
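The server-side aggregation step of FedAvg described above can be sketched as follows. This is a minimal illustration, not the FedCompass implementation: the function name `fedavg_aggregate` is hypothetical, models are represented as flat lists of floats, and the weights are assumed to be proportional to each client's local sample count.

```python
def fedavg_aggregate(client_models, client_sizes):
    """Return the weighted average of client parameter vectors.

    client_models: list of equal-length lists of floats (one per client).
    client_sizes:  list of local sample counts, used as averaging weights
                   (an assumption; the text only says "weighted averaging").
    """
    if not client_models or len(client_models) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_models[0])
    global_model = [0.0] * dim
    for params, n_samples in zip(client_models, client_sizes):
        if len(params) != dim:
            raise ValueError("all client models must have the same length")
        weight = n_samples / total
        # Accumulate this client's contribution, scaled by its weight.
        for j, value in enumerate(params):
            global_model[j] += weight * value
    return global_model


# Two clients; the second holds three times as much data as the first.
new_global = fedavg_aggregate([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a synchronous round the server would call this once all clients have reported; the waiting that this implies is the straggler cost the abstract refers to.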
Synergistic Signal Denoising for Multimodal Time Series of Structure Vibration
Yu, Yang, Chen, Han
Structural health monitoring (SHM) has emerged as a vital field of research, geared towards preserving the longevity and safety of civil infrastructure [1]. A critical component of SHM is the analysis of vibration time series data, which offers insights into the behavior, health, and performance of structures [2]. As infrastructure, especially in urban regions, is subject to a myriad of dynamic forces, ranging from wind to traffic loads, it becomes pivotal to extract clear and meaningful data from the complex vibration signatures that these forces induce. However, one of the significant challenges facing SHM practitioners is noise interference in these vibration signals, which can distort interpretations and lead to unreliable conclusions. The dynamic response of structures is often manifested as multimodal vibrations, meaning that multiple modes or patterns of vibration coexist. These modes, each characterized by its frequency and shape, provide a fingerprint of the structure's health and dynamic properties.
Spatial-temporal Transformer-guided Diffusion based Data Augmentation for Efficient Skeleton-based Action Recognition
Jiang, Yifan, Chen, Han, Ko, Hanseok
Recently, skeleton-based human action recognition has become a hot research topic because the compact representation of human skeletons brings new blood to this research domain. As a result, researchers have begun to notice the importance of using RGB or other sensors to analyze human action by extracting skeleton information. Leveraging the rapid development of deep learning (DL), a significant number of skeleton-based human action recognition approaches with finely designed DL structures have been presented recently. However, a well-trained DL model demands sufficient high-quality data, which is hard to obtain without high expense and human labor. In this paper, we introduce a novel data augmentation method for skeleton-based action recognition tasks, which can effectively generate high-quality and diverse sequential actions. In order to obtain natural and realistic action sequences, we propose denoising diffusion probabilistic models (DDPMs) that can generate a series of synthetic action sequences, with the generation process precisely guided by a spatial-temporal transformer (ST-Trans). Experimental results show that our method outperforms the state-of-the-art (SOTA) motion generation approaches on different naturalness and diversity metrics. They also show that the high-quality synthetic data can be effectively deployed to existing action recognition models with significant performance improvement.
CSGCL: Community-Strength-Enhanced Graph Contrastive Learning
Chen, Han, Zhao, Ziwen, Li, Yuhua, Zou, Yixiong, Li, Ruixuan, Zhang, Rui
Graph Contrastive Learning (GCL) is an effective way to learn generalized graph representations in a self-supervised manner, and has grown rapidly in recent years. However, the underlying community semantics has not been well explored by most previous GCL methods. Research that attempts to leverage communities in GCL regards them as having the same influence on the graph, leading to extra representation errors. To tackle this issue, we define "community strength" to measure the difference in influence among communities. Under this premise, we propose a Community-Strength-enhanced Graph Contrastive Learning (CSGCL) framework to preserve community strength throughout the learning process. Firstly, we present two novel graph augmentation methods, Communal Attribute Voting (CAV) and Communal Edge Dropping (CED), where the perturbations of node attributes and edges are guided by community strength. Secondly, we propose a dynamic "Team-up" contrastive learning scheme, where community strength is used to progressively fine-tune the contrastive objective. We report extensive experimental results on three downstream tasks: node classification, node clustering, and link prediction. CSGCL achieves state-of-the-art performance compared with other GCL methods, validating that community strength brings effectiveness and generality to graph representations. Our code is available at https://github.com/HanChen-HUST/CSGCL.
Action Recognition with Domain Invariant Features of Skeleton Image
Chen, Han, Jiang, Yifan, Ko, Hanseok
Due to the fast processing speed and robustness it can achieve, skeleton-based action recognition has recently received the attention of the computer vision community. Recent Convolutional Neural Network (CNN)-based methods, which use a skeleton image as input to a CNN, have shown commendable performance in learning spatio-temporal representations for skeleton sequences. Since the CNN-based methods mainly encode the temporal dimension and the skeleton joints simply as rows and columns, respectively, the latent correlation among all joints may be lost in the 2D convolution. To solve this problem, we propose a novel CNN-based method with adversarial training for action recognition. We introduce two-level domain adversarial learning to align the features of skeleton images from different view angles or subjects, respectively, thus further improving generalization. We evaluated our proposed method on NTU RGB+D. It achieves competitive results compared with state-of-the-art methods, with accuracy gains of 2.4$\%$ and 1.9$\%$ over the baseline for cross-subject and cross-view evaluation, respectively.
Identification and Avoidance of Static and Dynamic Obstacles on Point Cloud for UAVs Navigation
Chen, Han, Lu, Peng
Avoiding hybrid obstacles in unknown scenarios with an efficient flight strategy is a key challenge for unmanned aerial vehicle applications. In this paper, we introduce a technique to distinguish dynamic obstacles from static ones with only point cloud input. We then propose a computationally efficient obstacle avoidance motion planning approach based on an improved relative velocity method. The approach is able to avoid both static and dynamic obstacles in the same framework. The collision check and motion constraints differ for static and dynamic obstacles, and they are integrated into one framework efficiently. In addition, we present several techniques to improve the algorithm performance and deal with the time gap between different submodules. The proposed approach is implemented to run onboard in real time and validated extensively in simulation and hardware tests. The average single-step computation time is less than 20 ms.
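The idea of handling static and dynamic obstacles with one relative-velocity check can be illustrated with a basic closest-point-of-approach test. This is a generic sketch under a constant-velocity assumption, not the paper's improved relative velocity method; the function names, the safety radius, and the look-ahead horizon are all hypothetical.

```python
import math


def min_separation(p_uav, v_uav, p_obs, v_obs, horizon):
    """Smallest UAV-obstacle distance within `horizon` seconds.

    Both bodies are assumed to move at constant velocity. A static
    obstacle is simply the special case v_obs = (0, 0, 0).
    Returns (distance, time_of_closest_approach).
    """
    # Obstacle position relative to the UAV.
    d = [po - pu for po, pu in zip(p_obs, p_uav)]
    # UAV velocity relative to the obstacle.
    v_rel = [vu - vo for vu, vo in zip(v_uav, v_obs)]

    speed_sq = sum(c * c for c in v_rel)
    if speed_sq == 0.0:
        # No relative motion: the separation never changes.
        t_star = 0.0
    else:
        # Unconstrained closest approach, clamped to [0, horizon].
        t_star = sum(a * b for a, b in zip(d, v_rel)) / speed_sq
        t_star = min(max(t_star, 0.0), horizon)

    gap = [a - b * t_star for a, b in zip(d, v_rel)]
    return math.sqrt(sum(c * c for c in gap)), t_star


def is_collision_course(p_uav, v_uav, p_obs, v_obs, safety_radius, horizon):
    """True if the predicted minimum separation violates the safety radius."""
    distance, _ = min_separation(p_uav, v_uav, p_obs, v_obs, horizon)
    return distance < safety_radius


# Static obstacle slightly off the flight path: flagged as a collision risk.
static_hit = is_collision_course((0, 0, 0), (1, 0, 0), (5, 0.5, 0), (0, 0, 0), 1.0, 10.0)
# Same obstacle moving with the UAV's velocity: the gap never closes.
moving_hit = is_collision_course((0, 0, 0), (1, 0, 0), (5, 0.5, 0), (1, 0, 0), 1.0, 10.0)
```

Working in the obstacle's frame is what lets one routine cover both cases: only `v_obs` changes between a static and a dynamic obstacle.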
Non-Convex Projected Gradient Descent for Generalized Low-Rank Tensor Regression
Chen, Han, Raskutti, Garvesh, Yuan, Ming
In this paper, we consider the problem of learning high-dimensional tensor regression problems with low-rank structure. One of the core challenges associated with learning high-dimensional models is computation, since the underlying optimization problems are often non-convex. While convex relaxations could lead to polynomial-time algorithms, they are often slow in practice. On the other hand, limited theoretical guarantees exist for non-convex methods. In this paper, we provide a general framework that provides theoretical guarantees for learning high-dimensional tensor regression models under different low-rank structural assumptions using the projected gradient descent algorithm applied to a potentially non-convex constraint set $\Theta$ in terms of its \emph{localized Gaussian width}. We juxtapose our theoretical results for non-convex projected gradient descent algorithms with previous results on regularized convex approaches. The two main differences between the convex and non-convex approaches are: (i) from a computational perspective, whether the non-convex projection operator is computable and whether the projection has desirable contraction properties, and (ii) from a statistical upper bound perspective, the non-convex approach has a superior rate for a number of examples. We provide three concrete examples of low-dimensional structure which address these issues and explain the pros and cons of the non-convex and convex approaches. We supplement our theoretical results with simulations which show that, under several common settings of generalized low-rank tensor regression, the projected gradient descent approach is superior both in terms of statistical error and run-time, provided the step-sizes of the projected descent algorithm are suitably chosen.
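For orientation, the generic projected gradient descent iteration over the constraint set $\Theta$ can be written as below. This is the textbook form of the update, given as a sketch: the step size $\eta$, the loss $f$, the projection $P_{\Theta}$, the iterate notation, and the least-squares example are notation assumed here for illustration, not taken from the paper.

```latex
% Gradient step on the loss f, followed by projection onto \Theta.
\hat{\theta}^{(k+1)} = P_{\Theta}\!\left( \hat{\theta}^{(k)} - \eta \, \nabla f\bigl(\hat{\theta}^{(k)}\bigr) \right),
\qquad
P_{\Theta}(z) \in \operatorname*{arg\,min}_{\theta \in \Theta} \, \lVert \theta - z \rVert_{F}

% Example loss (least squares over n samples with tensor covariates X_i):
f(\theta) = \frac{1}{2n} \sum_{i=1}^{n} \bigl( y_i - \langle X_i, \theta \rangle \bigr)^{2}
```

When $\Theta$ is a set of low-rank tensors it is non-convex, so $P_{\Theta}$ need not be unique or cheap to compute, which is why the computability and contraction properties of the projection matter for the guarantees described above.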