Huang, Yan
Achieving Near-Optimal Convergence for Distributed Minimax Optimization with Adaptive Stepsizes
Huang, Yan, Li, Xiang, Shen, Yipeng, He, Niao, Xu, Jinming
In this paper, we show that applying adaptive methods directly to distributed minimax problems can result in non-convergence due to inconsistency in locally computed adaptive stepsizes. To address this challenge, we propose D-AdaST, a Distributed Adaptive minimax method with Stepsize Tracking. The key strategy is to employ an adaptive stepsize tracking protocol involving the transmission of two extra (scalar) variables. This protocol ensures the consistency among stepsizes of nodes, eliminating the steady-state error due to the lack of coordination of stepsizes among nodes that commonly exists in vanilla distributed adaptive methods, and thus guarantees exact convergence. For nonconvex-strongly-concave distributed minimax problems, we characterize the specific transient times that ensure time-scale separation of stepsizes and quasi-independence of networks, leading to a near-optimal convergence rate of $\tilde{\mathcal{O}} \left( \epsilon ^{-\left( 4+\delta \right)} \right)$ for any small $\delta > 0$, matching that of the centralized counterpart. To our best knowledge, D-AdaST is the first distributed adaptive method achieving near-optimal convergence without knowing any problem-dependent parameters for nonconvex minimax problems. Extensive experiments are conducted to validate our theoretical results.
A score-based particle method for homogeneous Landau equation
Huang, Yan, Wang, Li
The Landau equation stands as one of the fundamental kinetic equations, modeling the evolution of charged particles undergoing Coulomb interaction [27]. It is particularly useful for plasmas where collision effects become non-negligible. Computing the Landau equation presents numerous challenges inherent in kinetic equations, including high dimensionality, multiple scales, and strong nonlinearity and non-locality. On the other hand, deep learning has progressively transformed the numerical computation of partial differential equations by leveraging neural networks' ability to approximate complex functions and the powerful optimization toolbox. However, straightforward application of deep learning to compute PDEs often encounters training difficulties and leads to a loss of physical fidelity. In this paper, we propose a score-based particle method that elegantly combines learning with structure-preserving particle methods. This method inherits the favorable conservative properties of deterministic particle methods while relying only on light training to dynamically obtain the score function over time. The learning component replaces the expensive density estimation in previous particle methods, drastically accelerating computation.
PrivSGP-VR: Differentially Private Variance-Reduced Stochastic Gradient Push with Tight Utility Bounds
Zhu, Zehan, Huang, Yan, Wang, Xin, Xu, Jinming
In this paper, we propose a differentially private decentralized learning method (termed PrivSGP-VR) which employs stochastic gradient push with variance reduction and guarantees $(\epsilon, \delta)$-differential privacy (DP) for each node. Our theoretical analysis shows that, under DP Gaussian noise with constant variance, PrivSGP-VR achieves a sub-linear convergence rate of $\mathcal{O}(1/\sqrt{nK})$, where $n$ and $K$ are the number of nodes and iterations, respectively, which is independent of stochastic gradient variance, and achieves a linear speedup with respect to $n$. Leveraging the moments accountant method, we further derive an optimal $K$ to maximize the model utility under certain privacy budget in decentralized settings. With this optimized $K$, PrivSGP-VR achieves a tight utility bound of $\mathcal{O}\left( \sqrt{d\log \left( \frac{1}{\delta} \right)}/(\sqrt{n}J\epsilon) \right)$, where $J$ and $d$ are the number of local samples and the dimension of decision variable, respectively, which matches that of the server-client distributed counterparts, and exhibits an extra factor of $1/\sqrt{n}$ improvement compared to that of the existing decentralized counterparts, such as A(DP)$^2$SGD. Extensive experiments corroborate our theoretical findings, especially in terms of the maximized utility with optimized $K$, in fully decentralized settings.
HCL-MTSAD: Hierarchical Contrastive Consistency Learning for Accurate Detection of Industrial Multivariate Time Series Anomalies
Sun, Haili, Huang, Yan, Han, Lansheng, Fu, Cai, Zhou, Chunjie
Multivariate Time Series (MTS) anomaly detection focuses on pinpointing samples that diverge from standard operational patterns, which is crucial for ensuring the safety and security of industrial applications. The primary challenge in this domain is to develop representations capable of discerning anomalies effectively. The prevalent methods for anomaly detection in the literature are predominantly reconstruction-based and predictive in nature. However, they typically concentrate on a single-dimensional instance level, thereby not fully harnessing the complex associations inherent in industrial MTS. To address this issue, we propose a novel self-supervised hierarchical contrastive consistency learning method for detecting anomalies in MTS, named HCL-MTSAD. It innovatively leverages data consistency at multiple levels inherent in industrial MTS, systematically capturing consistent associations across four latent levels-measurement, sample, channel, and process. By developing a multi-layer contrastive loss, HCL-MTSAD can extensively mine data consistency and spatio-temporal association, resulting in more informative representations. Subsequently, an anomaly discrimination module, grounded in self-supervised hierarchical contrastive learning, is designed to detect timestamp-level anomalies by calculating multi-scale data consistency. Extensive experiments conducted on six diverse MTS datasets retrieved from real cyber-physical systems and server machines, in comparison with 20 baselines, indicate that HCL-MTSAD's anomaly detection capability outperforms the state-of-the-art benchmark models by an average of 1.8\% in terms of F1 score.
Unsupervised Spatio-Temporal State Estimation for Fine-grained Adaptive Anomaly Diagnosis of Industrial Cyber-physical Systems
Sun, Haili, Huang, Yan, Han, Lansheng, Fu, Cai, Zhou, Chunjie
Accurate detection and diagnosis of abnormal behaviors such as network attacks from multivariate time series (MTS) are crucial for ensuring the stable and effective operation of industrial cyber-physical systems (CPS). However, existing researches pay little attention to the logical dependencies among system working states, and have difficulties in explaining the evolution mechanisms of abnormal signals. To reveal the spatio-temporal association relationships and evolution mechanisms of the working states of industrial CPS, this paper proposes a fine-grained adaptive anomaly diagnosis method (i.e. MAD-Transformer) to identify and diagnose anomalies in MTS. MAD-Transformer first constructs a temporal state matrix to characterize and estimate the change patterns of the system states in the temporal dimension. Then, to better locate the anomalies, a spatial state matrix is also constructed to capture the inter-sensor state correlation relationships within the system. Subsequently, based on these two types of state matrices, a three-branch structure of series-temporal-spatial attention module is designed to simultaneously capture the series, temporal, and space dependencies among MTS. Afterwards, three associated alignment loss functions and a reconstruction loss are constructed to jointly optimize the model. Finally, anomalies are determined and diagnosed by comparing the residual matrices with the original matrices. We conducted comparative experiments on five publicly datasets spanning three application domains (service monitoring, spatial and earth exploration, and water treatment), along with a petroleum refining simulation dataset collected by ourselves. The results demonstrate that MAD-Transformer can adaptively detect fine-grained anomalies with short duration, and outperforms the state-of-the-art baselines in terms of noise robustness and localization performance.
Dual-modal Tactile E-skin: Enabling Bidirectional Human-Robot Interaction via Integrated Tactile Perception and Feedback
Mu, Shilong, Zhao, Runze, Lin, Zenan, Huang, Yan, Li, Shoujie, Li, Chenchang, Zhang, Xiao-Ping, Ding, Wenbo
To foster an immersive and natural human-robot interaction, the implementation of tactile perception and feedback becomes imperative, effectively bridging the conventional sensory gap. In this paper, we propose a dual-modal electronic skin (e-skin) that integrates magnetic tactile sensing and vibration feedback for enhanced human-robot interaction. The dual-modal tactile e-skin offers multi-functional tactile sensing and programmable haptic feedback, underpinned by a layered structure comprised of flexible magnetic films, soft silicone, a Hall sensor and actuator array, and a microcontroller unit. The e-skin captures the magnetic field changes caused by subtle deformations through Hall sensors, employing deep learning for accurate tactile perception. Simultaneously, the actuator array generates mechanical vibrations to facilitate haptic feedback, delivering diverse mechanical stimuli. Notably, the dual-modal e-skin is capable of transmitting tactile information bidirectionally, enabling object recognition and fine-weighing operations. This bidirectional tactile interaction framework will enhance the immersion and efficiency of interactions between humans and robots.
TDeLTA: A Light-weight and Robust Table Detection Method based on Learning Text Arrangement
Fan, Yang, Wu, Xiangping, Chen, Qingcai, Li, Heng, Huang, Yan, Cai, Zhixiang, Wu, Qitian
The diversity of tables makes table detection a great challenge, leading to existing models becoming more tedious and complex. Despite achieving high performance, they often overfit to the table style in training set, and suffer from significant performance degradation when encountering out-of-distribution tables in other domains. To tackle this problem, we start from the essence of the table, which is a set of text arranged in rows and columns. Based on this, we propose a novel, light-weighted and robust Table Detection method based on Learning Text Arrangement, namely TDeLTA. TDeLTA takes the text blocks as input, and then models the arrangement of them with a sequential encoder and an attention module. To locate the tables precisely, we design a text-classification task, classifying the text blocks into 4 categories according to their semantic roles in the tables. Experiments are conducted on both the text blocks parsed from PDF and extracted by open-source OCR tools, respectively. Compared to several state-of-the-art methods, TDeLTA achieves competitive results with only 3.1M model parameters on the large-scale public datasets. Moreover, when faced with the cross-domain data under the 0-shot setting, TDeLTA outperforms baselines by a large margin of nearly 7%, which shows the strong robustness and transferability of the proposed model.
GLARE: A Dataset for Traffic Sign Detection in Sun Glare
Gray, Nicholas, Moraes, Megan, Bian, Jiang, Wang, Alex, Tian, Allen, Wilson, Kurt, Huang, Yan, Xiong, Haoyi, Guo, Zhishan
Real-time machine learning object detection algorithms are often found within autonomous vehicle technology and depend on quality datasets. It is essential that these algorithms work correctly in everyday conditions as well as under strong sun glare. Reports indicate glare is one of the two most prominent environment-related reasons for crashes. However, existing datasets, such as the Laboratory for Intelligent & Safe Automobiles Traffic Sign (LISA) Dataset and the German Traffic Sign Recognition Benchmark, do not reflect the existence of sun glare at all. This paper presents the GLARE (GLARE is available at: https://github.com/NicholasCG/GLARE_Dataset ) traffic sign dataset: a collection of images with U.S-based traffic signs under heavy visual interference by sunlight. GLARE contains 2,157 images of traffic signs with sun glare, pulled from 33 videos of dashcam footage of roads in the United States. It provides an essential enrichment to the widely used LISA Traffic Sign dataset. Our experimental study shows that although several state-of-the-art baseline architectures have demonstrated good performance on traffic sign detection in conditions without sun glare in the past, they performed poorly when tested against GLARE (e.g., average mAP0.5:0.95 of 19.4). We also notice that current architectures have better detection when trained on images of traffic signs in sun glare performance (e.g., average mAP0.5:0.95 of 39.6), and perform best when trained on a mixture of conditions (e.g., average mAP0.5:0.95 of 42.3).
MTS-DVGAN: Anomaly Detection in Cyber-Physical Systems using a Dual Variational Generative Adversarial Network
Sun, Haili, Huang, Yan, Han, Lansheng, Fu, Cai, Liu, Hongle, Long, Xiang
Deep generative models are promising in detecting novel cyber-physical attacks, mitigating the vulnerability of Cyber-physical systems (CPSs) without relying on labeled information. Nonetheless, these generative models face challenges in identifying attack behaviors that closely resemble normal data, or deviate from the normal data distribution but are in close proximity to the manifold of the normal cluster in latent space. To tackle this problem, this article proposes a novel unsupervised dual variational generative adversarial model named MST-DVGAN, to perform anomaly detection in multivariate time series data for CPS security. The central concept is to enhance the model's discriminative capability by widening the distinction between reconstructed abnormal samples and their normal counterparts. Specifically, we propose an augmented module by imposing contrastive constraints on the reconstruction process to obtain a more compact embedding. Then, by exploiting the distribution property and modeling the normal patterns of multivariate time series, a variational autoencoder is introduced to force the generative adversarial network (GAN) to generate diverse samples. Furthermore, two augmented loss functions are designed to extract essential characteristics in a self-supervised manner through mutual guidance between the augmented samples and original samples. Finally, a specific feature center loss is introduced for the generator network to enhance its stability. Empirical experiments are conducted on three public datasets, namely SWAT, WADI and NSL_KDD. Comparing with the state-of-the-art methods, the evaluation results show that the proposed MTS-DVGAN is more stable and can achieve consistent performance improvement.
Improving the Reusability of Pre-trained Language Models in Real-world Applications
Ghanbarzadeh, Somayeh, Palangi, Hamid, Huang, Yan, Moreno, Radames Cruz, Khanpour, Hamed
The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their generalization problem, where their performance drastically decreases when evaluated on examples that differ from the training dataset, known as Out-of-Distribution (OOD)/unseen examples. This limitation arises from PLMs' reliance on spurious correlations, which work well for frequent example types but not for general examples. To address this issue, we propose a training approach called Mask-tuning, which integrates Masked Language Modeling (MLM) training objectives into the fine-tuning process to enhance PLMs' generalization. Comprehensive experiments demonstrate that Mask-tuning surpasses current state-of-the-art techniques and enhances PLMs' generalization on OOD datasets while improving their performance on in-distribution datasets. The findings suggest that Mask-tuning improves the reusability of PLMs on unseen data, making them more practical and effective for real-world applications.