Zhang, Teng
Ten Challenging Problems in Federated Foundation Models
Fan, Tao, Gu, Hanlin, Cao, Xuemei, Chan, Chee Seng, Chen, Qian, Chen, Yiqiang, Feng, Yihui, Gu, Yang, Geng, Jiaxiang, Luo, Bing, Liu, Shuoling, Ong, Win Kent, Ren, Chao, Shao, Jiaqi, Sun, Chuan, Tang, Xiaoli, Tae, Hong Xi, Tong, Yongxin, Wei, Shuyue, Wu, Fan, Xi, Wei, Xu, Mingcong, Yang, He, Yang, Xin, Yan, Jiangpeng, Yu, Hao, Yu, Han, Zhang, Teng, Zhang, Yifei, Zhang, Xiaojin, Zheng, Zhenzhe, Fan, Lixin, Yang, Qiang
Federated Foundation Models (FedFMs) represent a distributed learning paradigm that fuses the general competencies of foundation models with the privacy-preserving capabilities of federated learning. This combination allows large foundation models and small local domain models at remote clients to learn from each other in a teacher-student setting. This paper provides a comprehensive summary of ten challenging problems inherent in FedFMs, encompassing foundational theory, utilization of private data, continual learning, unlearning, Non-IID and graph data, bidirectional knowledge transfer, incentive mechanism design, game mechanism design, model watermarking, and efficiency. The ten problems manifest in five pivotal aspects: ``Foundational Theory,'' which aims to establish a coherent and unifying theoretical framework for FedFMs; ``Data,'' addressing the difficulties of leveraging domain-specific knowledge from private data while maintaining privacy; ``Heterogeneity,'' examining variations in data, models, and computational resources across clients; ``Security and Privacy,'' focusing on defenses against malicious attacks and model theft; and ``Efficiency,'' highlighting the need for improvements in training, communication, and parameter efficiency. For each problem, we offer a clear mathematical definition of the objective function, analyze existing methods, and discuss key challenges and potential solutions. This in-depth exploration aims to advance the theoretical foundations of FedFMs, guide practical implementations, and inspire future research to overcome these obstacles, thereby enabling robust, efficient, and privacy-preserving FedFMs in real-world applications.
Multi-Class Segmentation of Aortic Branches and Zones in Computed Tomography Angiography: The AortaSeg24 Challenge
Imran, Muhammad, Krebs, Jonathan R., Sivaraman, Vishal Balaji, Zhang, Teng, Kumar, Amarjeet, Ueland, Walker R., Fassler, Michael J., Huang, Jinlong, Sun, Xiao, Wang, Lisheng, Shi, Pengcheng, Rokuss, Maximilian, Baumgartner, Michael, Kirchhof, Yannick, Maier-Hein, Klaus H., Isensee, Fabian, Liu, Shuolin, Han, Bing, Nguyen, Bong Thanh, Shin, Dong-jin, Ji-Woo, Park, Choi, Mathew, Uhm, Kwang-Hyun, Ko, Sung-Jea, Lee, Chanwoong, Chun, Jaehee, Kim, Jin Sung, Zhang, Minghui, Zhang, Hanxiao, You, Xin, Gu, Yun, Pan, Zhaohong, Liu, Xuan, Liang, Xiaokun, Tiefenthaler, Markus, Almar-Munoz, Enrique, Schwab, Matthias, Kotyushev, Mikhail, Epifanov, Rostislav, Wodzinski, Marek, Muller, Henning, Qayyum, Abdul, Mazher, Moona, Niederer, Steven A., Wang, Zhiwei, Yang, Kaixiang, Ren, Jintao, Korreman, Stine Sofia, Gao, Yuchong, Zeng, Hongye, Zheng, Haoyu, Zheng, Rui, Yue, Jinghua, Zhou, Fugen, Liu, Bo, Cosman, Alexander, Liang, Muxuan, Zhao, Chang, Upchurch, Gilbert R. Jr., Ma, Jun, Zhou, Yuyin, Cooper, Michol A., Shao, Wei
Multi-class segmentation of the aorta in computed tomography angiography (CTA) scans is essential for diagnosing and planning complex endovascular treatments for patients with aortic dissections. However, existing methods reduce aortic segmentation to a binary problem, limiting their ability to measure diameters across different branches and zones. Furthermore, no open-source dataset is currently available to support the development of multi-class aortic segmentation methods. To address this gap, we organized the AortaSeg24 MICCAI Challenge, introducing the first dataset of 100 CTA volumes annotated for 23 clinically relevant aortic branches and zones. This dataset was designed to facilitate both model development and validation. The challenge attracted 121 teams worldwide, with participants leveraging state-of-the-art frameworks such as nnU-Net and exploring novel techniques, including cascaded models, data augmentation strategies, and custom loss functions. We evaluated the submitted algorithms using the Dice Similarity Coefficient (DSC) and Normalized Surface Distance (NSD), highlighting the approaches adopted by the top five performing teams. This paper presents the challenge design, dataset details, evaluation metrics, and an in-depth analysis of the top-performing algorithms. The annotated dataset, evaluation code, and implementations of the leading methods are publicly available to support further research. All resources can be accessed at https://aortaseg24.grand-challenge.org.
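As an illustration of the evaluation protocol, the following is a minimal sketch of the per-class Dice Similarity Coefficient averaged over the 23 annotated labels; the array shapes and the label range 1..23 are assumptions for illustration, not the challenge's official evaluation code (which is available at the URL above).

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, gt: np.ndarray, label: int) -> float:
    """Per-class DSC between predicted and ground-truth label volumes."""
    pred_mask = pred == label
    gt_mask = gt == label
    denom = pred_mask.sum() + gt_mask.sum()
    if denom == 0:  # class absent in both volumes: count as a perfect match
        return 1.0
    return 2.0 * np.logical_and(pred_mask, gt_mask).sum() / denom

def mean_dice(pred: np.ndarray, gt: np.ndarray, num_classes: int = 23) -> float:
    """Average DSC over the aortic branch/zone labels, assumed to be 1..23."""
    return float(np.mean([dice_coefficient(pred, gt, c)
                          for c in range(1, num_classes + 1)]))
```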
Outlier Synthesis via Hamiltonian Monte Carlo for Out-of-Distribution Detection
Li, Hengzhuang, Zhang, Teng
Out-of-distribution (OOD) detection is crucial for developing trustworthy and reliable machine learning systems. Recent advances in training with auxiliary OOD data demonstrate efficacy in enhancing detection capabilities. Nonetheless, these methods heavily rely on acquiring a large pool of high-quality natural outliers. Some prior methods try to alleviate this problem by synthesizing virtual outliers, but they suffer from either poor quality or high cost due to monotonous sampling strategies and heavily parameterized generative models. In this paper, we address these problems by proposing the Hamiltonian Monte Carlo Outlier Synthesis (HamOS) framework, which views the synthesis process as sampling from Markov chains. Based solely on the in-distribution data, the Markov chains can extensively traverse the feature space and generate diverse and representative outliers, thereby exposing the model to a wide range of potential OOD scenarios. Because Hamiltonian Monte Carlo attains an acceptance rate close to 1, our framework is also highly efficient. Through empirical comparison with SOTA baselines on both standard and large-scale benchmarks, we verify the efficacy and efficiency of the proposed HamOS.
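To make the sampling mechanism concrete, here is a minimal sketch of a single Hamiltonian Monte Carlo transition with leapfrog integration; the target log-density, step size, and trajectory length are placeholders, since HamOS's actual target density over the feature space is not specified in the abstract.

```python
import numpy as np

def hmc_step(x, log_prob, grad_log_prob, step_size=0.05, n_leapfrog=10,
             rng=np.random):
    """One Hamiltonian Monte Carlo transition targeting exp(log_prob)."""
    p = rng.standard_normal(x.shape)  # resample auxiliary momentum
    x_new, p_new = x.copy(), p.copy()
    # Leapfrog integration of the Hamiltonian dynamics.
    p_new += 0.5 * step_size * grad_log_prob(x_new)
    for _ in range(n_leapfrog - 1):
        x_new += step_size * p_new
        p_new += step_size * grad_log_prob(x_new)
    x_new += step_size * p_new
    p_new += 0.5 * step_size * grad_log_prob(x_new)
    # Metropolis acceptance on the Hamiltonian H = -log_prob + kinetic energy.
    current_h = -log_prob(x) + 0.5 * np.dot(p, p)
    proposed_h = -log_prob(x_new) + 0.5 * np.dot(p_new, p_new)
    if np.log(rng.uniform()) < current_h - proposed_h:
        return x_new
    return x
```

With a well-tuned step size the leapfrog integrator nearly conserves the Hamiltonian, which is why the acceptance rate stays close to 1.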
Hyperbolic Hypergraph Neural Networks for Multi-Relational Knowledge Hypergraph Representation
Li, Mengfan, Shi, Xuanhua, Qiao, Chenqi, Zhang, Teng, Jin, Hai
Knowledge hypergraphs generalize knowledge graphs by using hyperedges to connect multiple entities and depict complicated relations. Existing methods either transform hyperedges into an easier-to-handle set of binary relations or treat hyperedges as isolated, ignoring their adjacencies. Both approaches incur information loss and can lead to sub-optimal models. To address these issues, we propose the Hyperbolic Hypergraph Neural Network (H2GNN), whose essential component is hyper-star message passing, a novel scheme motivated by a lossless expansion of hyperedges into hierarchies. It implements a direct embedding that explicitly incorporates adjacent entities, hyper-relations, and entity position-aware information. As the name suggests, H2GNN operates in hyperbolic space, which is better suited to capturing tree-like hierarchies. We compare H2GNN with 15 baselines on knowledge hypergraphs, and it outperforms state-of-the-art approaches in both node classification and link prediction tasks.
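For intuition about message passing over hyperedges, below is a minimal two-stage aggregation sketch based on an incidence matrix; it illustrates only the generic node-to-hyperedge-to-node pattern, not H2GNN's hyper-star scheme with hyper-relations, positional information, and hyperbolic operations.

```python
import numpy as np

def hyperedge_message_passing(X: np.ndarray, H: np.ndarray) -> np.ndarray:
    """One round of node -> hyperedge -> node mean aggregation.

    X: (n_nodes, d) node features.
    H: (n_nodes, n_edges) binary incidence matrix (H[i, e] = 1 iff node i
       belongs to hyperedge e).
    """
    edge_deg = H.sum(axis=0, keepdims=True)    # nodes per hyperedge
    node_deg = H.sum(axis=1, keepdims=True)    # hyperedges per node
    E = (H.T @ X) / np.maximum(edge_deg.T, 1)  # hyperedge messages
    return (H @ E) / np.maximum(node_deg, 1)   # aggregate back to nodes
```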
Preliminary Investigation into Data Scaling Laws for Imitation Learning-Based End-to-End Autonomous Driving
Zheng, Yupeng, Xia, Zhongpu, Zhang, Qichao, Zhang, Teng, Lu, Ben, Huo, Xiaochuang, Han, Chao, Li, Yixian, Yu, Mengjie, Jin, Bu, Yang, Pengxuan, Zheng, Yuhang, Yuan, Haifeng, Jiang, Ke, Jia, Peng, Lang, Xianpeng, Zhao, Dongbin
The end-to-end autonomous driving paradigm has recently attracted considerable attention due to its scalability. However, existing methods are constrained by the limited scale of real-world data, which hinders a comprehensive exploration of the scaling laws associated with end-to-end autonomous driving. To address this issue, we collected substantial data from various driving scenarios and behaviors and conducted an extensive study of the scaling laws of existing imitation learning-based end-to-end autonomous driving paradigms. Specifically, approximately 4 million demonstrations from 23 scenario types were gathered, amounting to over 30,000 hours of driving demonstrations. We performed open-loop and closed-loop simulation evaluations on 1,400 diverse driving demonstrations (1,300 for open-loop and 100 for closed-loop) under stringent assessment conditions. Through experimental analysis, we discovered that (1) the performance of the driving model exhibits a power-law relationship with the amount of training data; (2) a small increase in the quantity of long-tailed data can significantly improve performance in the corresponding scenarios; and (3) appropriate scaling of data enables the model to achieve combinatorial generalization in novel scenes and actions. Our results highlight the critical role of data scaling in improving the generalizability of models across diverse autonomous driving scenarios, supporting safe deployment in the real world. Project repository: https://github.com/ucaszyp/Driving-Scaling-Law
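A power law of the form error = a * N^b is linear in log-log space, so estimating the relationship in finding (1) amounts to a straight-line fit; the sketch below uses hypothetical error/data-scale pairs, not the paper's measurements.

```python
import numpy as np

# Hypothetical (number of demonstrations, error) pairs; illustrative only.
n_demos = np.array([1e4, 1e5, 1e6, 4e6])
errors = np.array([0.80, 0.45, 0.25, 0.17])

# Fit log(error) = b * log(N) + log(a); polyfit returns [slope, intercept].
b, log_a = np.polyfit(np.log(n_demos), np.log(errors), 1)
a = np.exp(log_a)
print(f"fitted power law: error ~ {a:.3f} * N^({b:.3f})")
```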
Chain-of-Thought (CoT) prompting strategies for medical error detection and correction
Wu, Zhaolong, Hasan, Abul, Wu, Jinge, Kim, Yunsoo, Cheung, Jason P. Y., Zhang, Teng, Wu, Honghan
This paper describes our submission to the MEDIQA-CORR 2024 shared task on automatically detecting and correcting medical errors in clinical notes. We report results for three methods of few-shot In-Context Learning (ICL) augmented with Chain-of-Thought (CoT) and reason prompts using a large language model (LLM). In the first method, we manually analyse a subset of the training and validation datasets to infer three CoT prompts by examining error types in the clinical notes. In the second method, we use the training dataset to prompt the LLM to deduce reasons for the correctness or incorrectness of each note. The constructed CoTs and reasons are then combined with ICL examples to solve the tasks of error detection, span identification, and error correction. Finally, we combine the two methods using a rule-based ensemble. Across the three sub-tasks, our ensemble method ranks 3rd on sub-tasks 1 and 2 and 7th on sub-task 3 among all submissions.
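The abstract does not spell out the ensemble rules, so the following is only a hypothetical sketch of how a rule-based combination of two methods' outputs might look; the prediction schema and the precedence rules are assumptions, not the submission's actual logic.

```python
def ensemble(pred_cot: dict, pred_reason: dict) -> dict:
    """Combine two methods' outputs with simple precedence rules.

    Each prediction is assumed to have the (hypothetical) schema:
    {'has_error': bool, 'span': str | None, 'correction': str | None}.
    """
    if pred_cot["has_error"] == pred_reason["has_error"]:
        # Methods agree on detection; defer to the CoT method's span/correction.
        return pred_cot
    # On disagreement, favor the method that flags an error (recall-oriented rule).
    return pred_cot if pred_cot["has_error"] else pred_reason
```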
Theoretical Guarantees for the Subspace-Constrained Tyler's Estimator
Lerman, Gilad, Yu, Feng, Zhang, Teng
This work analyzes the subspace-constrained Tyler's estimator (STE) [12], designed to recover a low-dimensional subspace within a dataset that may be highly corrupted with outliers. It assumes a weak inlier-outlier model and allows the fraction of inliers to be below the threshold at which the robust subspace recovery problem becomes computationally hard. It shows that in this setting, if the initialization of STE, which is an iterative algorithm, satisfies a certain condition, then STE can effectively recover the underlying subspace. It further shows that under the generalized haystack model, STE initialized by Tyler's M-estimator (TME) can recover the subspace even when the fraction of inliers is too small for TME to handle.
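For reference, the classical Tyler's M-estimator used to initialize STE is itself a simple fixed-point iteration; the sketch below shows that standard iteration (STE's additional subspace constraint is not included, and the iteration count and tolerance are illustrative).

```python
import numpy as np

def tyler_m_estimator(X: np.ndarray, n_iter: int = 100, tol: float = 1e-8):
    """Fixed-point iteration for Tyler's M-estimator of scatter.

    X: (n, d) centered data with no zero rows. Returns a (d, d) scatter
    matrix normalized to have trace d.
    """
    n, d = X.shape
    sigma = np.eye(d)
    for _ in range(n_iter):
        inv = np.linalg.inv(sigma)
        # Weights 1 / (x_i^T Sigma^{-1} x_i) down-weight outlying directions.
        w = 1.0 / np.einsum("ni,ij,nj->n", X, inv, X)
        sigma_new = (d / n) * (X.T * w) @ X      # sum_i w_i x_i x_i^T
        sigma_new *= d / np.trace(sigma_new)      # fix the scale ambiguity
        if np.linalg.norm(sigma_new - sigma) < tol:
            return sigma_new
        sigma = sigma_new
    return sigma
```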
Differentially Private Pre-Trained Model Fusion using Decentralized Federated Graph Matching
Chen, Qian, Chen, Yiqiang, Jiang, Xinlong, Zhang, Teng, Dai, Weiwei, Huang, Wuliang, Yan, Zhen, Ye, Bo
Model fusion is becoming a crucial component in the context of model-as-a-service scenarios, enabling the delivery of high-quality model services to local users. However, this approach introduces privacy risks and imposes certain limitations on its applications. Ensuring secure model exchange and knowledge fusion among users becomes a significant challenge in this setting. To tackle this issue, we propose PrivFusion, a novel architecture that preserves privacy while facilitating model fusion under the constraints of local differential privacy. PrivFusion leverages a graph-based structure, enabling the fusion of models from multiple parties without necessitating retraining. By employing randomized mechanisms, PrivFusion ensures privacy guarantees throughout the fusion process. To enhance model privacy, our approach incorporates a hybrid local differentially private mechanism and decentralized federated graph matching, effectively protecting both activation values and weights. Additionally, we introduce a perturbation filter adapter to alleviate the impact of randomized noise, thereby preserving the utility of the fused model. Through extensive experiments conducted on diverse image datasets and real-world healthcare applications, we provide empirical evidence showcasing the effectiveness of PrivFusion in maintaining model performance while preserving privacy. Our contributions offer valuable insights and practical solutions for secure and collaborative data analysis within the domain of privacy-preserving model fusion.
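As a point of reference for the randomized mechanisms involved, here is a minimal sketch of the standard Laplace mechanism for releasing clipped model weights under local differential privacy; PrivFusion's hybrid mechanism, decentralized graph matching, and perturbation filter adapter are not reproduced here.

```python
import numpy as np

def laplace_mechanism(weights: np.ndarray, sensitivity: float, epsilon: float,
                      rng=np.random.default_rng()):
    """Release values under epsilon-local differential privacy.

    Assumes the weights have been pre-clipped so that `sensitivity` bounds
    the maximum change any single record can induce.
    """
    scale = sensitivity / epsilon  # noise scale grows as the budget shrinks
    return weights + rng.laplace(loc=0.0, scale=scale, size=weights.shape)
```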
Improved Convergence Rates of Anderson Acceleration for a Large Class of Fixed-Point Iterations
Garner, Casey, Lerman, Gilad, Zhang, Teng
This paper studies Anderson acceleration (AA) for fixed-point methods $x^{(k+1)} = q(x^{(k)})$. It provides the first proof that when the operator $q$ is linear and symmetric, AA improves the root-linear convergence factor over the fixed-point iterations. When $q$ is nonlinear yet has a symmetric Jacobian at the solution, a slightly modified AA algorithm is proved to achieve an analogous improvement in the root-linear convergence factor over fixed-point iterations. Simulations verify these observations. Furthermore, experiments with different data models demonstrate that AA is significantly superior to standard fixed-point methods for Tyler's M-estimation.
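To fix notation, a minimal sketch of windowed (Type-II) Anderson acceleration for $x^{(k+1)} = q(x^{(k)})$ follows; the window size, damping (here $\beta = 1$), and iteration count are illustrative choices, and the paper's modified algorithm for the symmetric-Jacobian case is not included.

```python
import numpy as np

def anderson_acceleration(q, x0: np.ndarray, m: int = 5, n_iter: int = 50):
    """Type-II Anderson acceleration of the fixed-point iteration x <- q(x)."""
    x = x0.copy()
    X_hist, F_hist = [], []  # recent iterates and residuals f = q(x) - x
    for _ in range(n_iter):
        fx = q(x)
        f = fx - x
        X_hist.append(x)
        F_hist.append(f)
        if len(X_hist) > m + 1:  # keep a sliding window of m + 1 entries
            X_hist.pop(0)
            F_hist.pop(0)
        k = len(X_hist)
        if k == 1:
            x = fx  # plain fixed-point step until history accumulates
            continue
        # Differences of residuals/iterates over the window.
        dF = np.column_stack([F_hist[i + 1] - F_hist[i] for i in range(k - 1)])
        dX = np.column_stack([X_hist[i + 1] - X_hist[i] for i in range(k - 1)])
        # Least-squares mixing coefficients: min_gamma ||f - dF @ gamma||.
        gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
        x = fx - (dX + dF) @ gamma  # Anderson mixing step (beta = 1)
    return x
```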
FedBone: Towards Large-Scale Federated Multi-Task Learning
Chen, Yiqiang, Zhang, Teng, Jiang, Xinlong, Chen, Qian, Gao, Chenlong, Huang, Wuliang
Heterogeneous federated multi-task learning (HFMTL) is a federated learning technique that combines the heterogeneous tasks of different clients to achieve more accurate, comprehensive predictions. In real-world applications, visual and natural language tasks typically require large-scale models to extract high-level abstract features. However, large-scale models cannot be directly applied to existing federated multi-task learning methods. Existing HFMTL methods also disregard the impact of gradient conflicts on multi-task optimization during the federated aggregation process. In this work, we propose an innovative framework called FedBone, which enables the construction of large-scale models with better generalization from the perspective of server-client split learning and gradient projection. We split the entire model into two components: a large-scale general model on the cloud server (the general model) and multiple task-specific models on edge clients (the client models), addressing the problem of insufficient computing power on edge clients. A conflicting-gradient projection technique is used to enhance the generalization of the large-scale general model across different tasks. The proposed framework is evaluated on two benchmark datasets and a real ophthalmic dataset. Comprehensive results demonstrate that FedBone efficiently adapts to the heterogeneous local tasks of each client and outperforms existing federated learning algorithms in most dense prediction and classification tasks with off-the-shelf computational resources on the client side.
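The abstract does not detail FedBone's projection rule, so the sketch below uses the well-known PCGrad-style projection as a representative way to resolve conflicting task gradients before aggregation; treat it as an illustration under that assumption, not FedBone's implementation.

```python
import numpy as np

def project_conflicting(grads: list[np.ndarray]) -> np.ndarray:
    """PCGrad-style resolution of conflicting task gradients.

    If two task gradients conflict (negative inner product), project one
    onto the normal plane of the other, then average for the shared model.
    """
    projected = [g.copy() for g in grads]
    for i, g_i in enumerate(projected):
        for j, g_j in enumerate(grads):
            if i == j:
                continue
            dot = g_i @ g_j
            if dot < 0:  # gradients of tasks i and j conflict
                g_i -= dot / (g_j @ g_j) * g_j  # remove the conflicting component
    return sum(projected) / len(projected)
```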