- Europe > Germany > Baden-Württemberg > Freiburg (0.04)
- Oceania > New Zealand > North Island > Waikato (0.04)
- North America > United States > Wisconsin (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Banking & Finance (0.68)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)
- North America > United States > North Carolina > Orange County > Chapel Hill (0.05)
- North America > Canada (0.05)
05311655a15b75fab86956663e1819cd-Supplemental.pdf
In what follows we will call each experiment by its corresponding figure or table number for convenience. For the rotated/shifted MNIST images (Figures 8, 9), we use the Affine transformation function in the TorchVision library. In experiments (Tables 2, 3, 4, 5), we use either or both of the Large (L) and Small (S) datasets for the standard benchmark vision data: MNIST, FMNIST, KMNIST, Omniglot, SVHN, CIFAR10, CIFAR100, CELEBA. For Figure 10, Table 3, the regularization coefficients for CAE, WAE are searched around 0.01, 0.001, the noise level used in DAE is searched around 0.1, 0.01, and the regularization coefficient and λ for SPAE and NRAE are searched around 0.001. On the other hand, the runtimes of our algorithms are comparable with other existing methods.
Nesterov-Accelerated Robust Federated Learning Over Byzantine Adversaries
Xu, Lihan, Dong, Yanjie, Wang, Gang, Zeng, Runhao, Fan, Xiaoyi, Hu, Xiping
Abstract--We investigate robust federated learning, where a group of workers collaboratively train a shared model under the orchestration of a central server in the presence of Byzantine adversaries capable of arbitrary and potentially malicious behaviors. To simultaneously enhance communication efficiency and robustness against such adversaries, we propose a Byzantine-resilient Nesterov-Accelerated Federated Learning (Byrd-NAFL) algorithm. Byrd-NAFL integrates Nesterov's momentum into the federated learning process alongside Byzantine-resilient aggregation rules to achieve fast convergence that is safeguarded against gradient corruption. We establish a finite-time convergence guarantee for Byrd-NAFL under non-convex and smooth loss functions with a relaxed assumption on the aggregated gradients. Extensive numerical experiments validate the effectiveness of Byrd-NAFL and demonstrate its superiority over existing benchmarks in terms of convergence speed, accuracy, and resilience to diverse Byzantine attack strategies. As a promising paradigm for privacy-preserving distributed learning, federated learning (FL) leverages the parallel computational capabilities of user terminals to learn from decentralized data under the orchestration of a central server. Since its inception [1], [2], FL has been proliferating across diverse application scenarios, e.g., healthcare [3], [4], mobile edge [5], [6], and autonomous driving [7], [8]. Despite its merits in preserving user privacy, the vanilla FL paradigm still faces two major challenges, namely, Byzantine resilience [9], [10] and communication efficiency [11]. To robustify the FL paradigm, Byzantine-resilient aggregation rules, e.g., Krum [10], the component-wise median (CwMed) [15], Bulyan [16], and geometric median (GeoMed) [17], are designed to enhance the trustworthiness and reliability of the FL paradigm. Another major challenge in FL lies in enhancing communication efficiency.
Current communication-efficient FL algorithms can be broadly classified into three categories: (i) communication frequency reduction [18], [19], [20], [21], [22], [12], (ii) exchanged information compression [23], [24], [25], [6], and (iii) iteration reduction [20], [26], [27], [28].
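As a rough illustration of how a Byzantine-resilient aggregation rule can be combined with Nesterov's momentum, the sketch below uses the component-wise median (CwMed) named above on a toy quadratic loss. It is not the Byrd-NAFL algorithm itself, whose update rule is not given in this excerpt: the function names (`cw_median`, `nesterov_round`, `toy_worker_grads`, `run`), the placement of the momentum on the server side, the step size, and the momentum coefficient are all assumptions made for illustration.

```python
from statistics import median


def cw_median(vectors):
    """Component-wise median (CwMed) of equal-length gradient vectors."""
    return [median(column) for column in zip(*vectors)]


def toy_worker_grads(point, n_honest=5, n_byzantine=2):
    """Hypothetical workers on the toy loss f(x) = 0.5 * ||x||^2.

    Honest workers return the true gradient (the point itself);
    Byzantine workers return an arbitrary large vector.
    """
    honest = [list(point) for _ in range(n_honest)]
    byzantine = [[1e6] * len(point) for _ in range(n_byzantine)]
    return honest + byzantine


def nesterov_round(params, velocity, grads_fn, lr=0.1, beta=0.9):
    """One round: evaluate at the look-ahead point, aggregate, update."""
    lookahead = [p + beta * v for p, v in zip(params, velocity)]
    aggregated = cw_median(grads_fn(lookahead))
    new_velocity = [beta * v - lr * g for v, g in zip(velocity, aggregated)]
    new_params = [p + nv for p, nv in zip(params, new_velocity)]
    return new_params, new_velocity


def run(rounds=200):
    """Iterate the toy problem from a fixed starting point (assumed values)."""
    params, velocity = [5.0, -3.0], [0.0, 0.0]
    for _ in range(rounds):
        params, velocity = nesterov_round(params, velocity, toy_worker_grads)
    return params
```

Because honest workers form the majority in this toy setup, the median in every coordinate equals the honest gradient, so the corrupted vectors do not affect the update. A plain average in place of `cw_median` would instead be dominated by the Byzantine entries.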
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- North America > United States > Hawaii (0.04)
- Information Technology > Security & Privacy (0.66)
- Government > Military (0.66)
BiMax: Bidirectional MaxSim Score for Document-Level Alignment
Wang, Xiaotian, Utsuro, Takehito, Nagata, Masaaki
Document alignment is necessary for hierarchical mining (Bañón et al., 2020; Morishita et al., 2022), which aligns documents across source and target languages within the same web domain. Several high-precision sentence embedding-based methods have been developed, such as TK-PERT (Thompson and Koehn, 2020) and Optimal Transport (OT) (Clark et al., 2019; El-Kishky and Guzmán, 2020). However, given the massive scale of web mining data, both accuracy and speed must be considered. In this paper, we propose a cross-lingual Bidirectional MaxSim score (BiMax) for computing doc-to-doc similarity, to improve efficiency compared to the OT method. Consequently, on the WMT16 bilingual document alignment task, BiMax attains accuracy comparable to OT with an approximate 100-fold speed increase. Meanwhile, we also conduct a comprehensive analysis to investigate the performance of current state-of-the-art multilingual sentence embedding models. All the alignment methods in this paper are publicly available as a tool called EmbDA (https://github.com/EternalEdenn/EmbDA).
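To make the idea of a bidirectional MaxSim score concrete, here is a minimal sketch. It assumes each document is a list of sentence embeddings, that MaxSim in one direction averages, over the sentences of one document, the maximum cosine similarity to any sentence of the other, and that the bidirectional score is the mean of the two directions. The abstract does not state the exact formula, so the normalization and the function names (`cosine`, `max_sim`, `bimax`) are illustrative assumptions and may differ from the EmbDA implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def max_sim(src, tgt):
    """Average, over source sentences, of the best match among target sentences."""
    return sum(max(cosine(s, t) for t in tgt) for s in src) / len(src)


def bimax(src, tgt):
    """Assumed bidirectional score: mean of the two one-directional MaxSim scores."""
    return 0.5 * (max_sim(src, tgt) + max_sim(tgt, src))


# Toy 2-dimensional "sentence embeddings" (made-up values).
doc_a = [[1.0, 0.0]]
doc_b = [[1.0, 0.0], [0.0, 1.0]]
score = bimax(doc_a, doc_b)
```

In this toy case the single sentence of `doc_a` is perfectly matched in `doc_b`, but the second sentence of `doc_b` has no counterpart, so the reverse direction pulls the score below 1. Each direction needs only a max over pairwise similarities, with no transport problem to solve, which is consistent with the efficiency argument made above.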
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.67)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
- Information Technology > Data Science (0.88)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
- Europe > Germany > Baden-Württemberg > Freiburg (0.04)
- Oceania > New Zealand > North Island > Waikato (0.04)
- North America > United States > Wisconsin (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Banking & Finance (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Deep Survival Analysis for Competing Risk Modeling with Functional Covariates and Missing Data Imputation
Gao, Penglei, Zou, Yan, Duggal, Abhijit, Huang, Shuaiqi, Liang, Faming, Wang, Xiaofeng
We introduce the Functional Competing Risk Net (FCRN), a unified deep-learning framework for discrete-time survival analysis under competing risks, which seamlessly integrates functional covariates and handles missing data within an end-to-end model. By combining a micro-network Basis Layer for functional data representation with a gradient-based imputation module, FCRN simultaneously learns to impute missing values and predict event-specific hazards. Evaluated on multiple simulated datasets and a real-world ICU case study using the MIMIC-IV and Cleveland Clinic datasets, FCRN demonstrates substantial improvements in prediction accuracy over random survival forests and traditional competing risks models. This approach advances prognostic modeling in critical care by more effectively capturing dynamic risk factors and static predictors while accommodating irregular and incomplete data.
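For readers unfamiliar with discrete-time competing risks, the sketch below shows the standard way event-specific hazards translate into overall survival and cumulative incidence curves. It is not FCRN: the network, Basis Layer, and imputation module are not reproduced, and the hazard values and the function name `cumulative_incidence` are made up for illustration.

```python
def cumulative_incidence(hazards):
    """Convert discrete-time event-specific hazards into survival and CIF curves.

    hazards[t][k] is the probability of event k in interval t, given that
    no event has occurred before interval t.
    Returns a list of (survival, [cif_1, ..., cif_K]) tuples, one per interval.
    """
    n_risks = len(hazards[0])
    survival = 1.0              # probability of being event-free so far
    cif = [0.0] * n_risks       # cumulative incidence per event type
    curve = []
    for h_t in hazards:
        for k, h in enumerate(h_t):
            cif[k] += survival * h      # event k occurs in this interval
        survival *= 1.0 - sum(h_t)      # no event of any type occurs
        curve.append((survival, list(cif)))
    return curve


# Two intervals, two competing events, constant hazards (made-up values).
curve = cumulative_incidence([[0.1, 0.2], [0.1, 0.2]])
final_survival, final_cif = curve[-1]
```

At every interval the survival probability and the cumulative incidences sum to one, which is a quick sanity check on predicted hazards.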
- North America > United States > Ohio > Cuyahoga County > Cleveland (0.04)
- Asia > Middle East > Israel (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- North America > United States > New York > New York County > New York City (0.07)
- North America > United States > North Carolina > Orange County > Chapel Hill (0.05)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
Beyond Sensor Data: Foundation Models of Behavioral Data from Wearables Improve Health Predictions
Erturk, Eray, Kamran, Fahad, Abbaspourazad, Salar, Jewell, Sean, Sharma, Harsh, Li, Yujie, Williamson, Sinead, Foti, Nicholas J, Futoma, Joseph
Wearable devices record physiological and behavioral signals that can improve health predictions. While foundation models are increasingly used for such predictions, they have been primarily applied to low-level sensor data, despite behavioral data often being more informative due to their alignment with physiologically relevant timescales and quantities. We develop foundation models of such behavioral signals using over 2.5B hours of wearable data from 162K individuals, systematically optimizing architectures and tokenization strategies for this unique dataset. Evaluated on 57 health-related tasks, our model shows strong performance across diverse real-world applications including individual-level classification and time-varying health state prediction. The model excels in behavior-driven tasks like sleep prediction, and improves further when combined with representations of raw sensor data. These results underscore the importance of tailoring foundation model design to wearables and demonstrate the potential to enable new health applications.
- North America > United States > Massachusetts (0.04)
- North America > Canada (0.04)