DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation
Rajput, Shashank, Wang, Hongyi, Charles, Zachary, Papailiopoulos, Dimitris
To improve the resilience of distributed training to worst-case, or Byzantine, node failures, several recent methods have replaced gradient averaging with robust aggregation. Such techniques can have high computational costs, often quadratic in the number of compute nodes, and offer only limited robustness guarantees. Other methods have instead used redundancy to guarantee robustness, but can tolerate only a limited number of Byzantine failures. In this work, we present DETOX, a Byzantine-resilient distributed training framework that combines algorithmic redundancy with robust aggregation. DETOX operates in two steps: a filtering step that uses limited redundancy to significantly reduce the effect of Byzantine nodes, and a hierarchical aggregation step that can be used in tandem with any state-of-the-art robust aggregation method. We show theoretically that this leads to a substantial increase in robustness, and has a per-iteration runtime that can be nearly linear in the number of compute nodes. We provide extensive experiments over real distributed setups across a variety of large-scale machine learning tasks, showing that DETOX yields orders-of-magnitude improvements in accuracy and speed over many state-of-the-art Byzantine-resilient approaches.
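The two-step pipeline described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the layout of r-fold replicated worker groups, and the use of a coordinate-wise median as a stand-in for the robust aggregator are all assumptions made for illustration.

```python
import numpy as np

def detox_aggregate(gradients, r):
    """Sketch of redundancy-based Byzantine-resilient aggregation.

    `gradients` is a list of p gradient vectors where each consecutive
    group of r workers computed the SAME mini-batch (r-fold redundancy).
    Step 1 majority-filters each group; step 2 robustly aggregates the
    filtered gradients (here: a coordinate-wise median as a stand-in).
    """
    p = len(gradients)
    assert p % r == 0, "workers must split evenly into groups of r"
    filtered = []
    for start in range(0, p, r):
        group = gradients[start:start + r]
        # Step 1 (filtering): honest workers in a group return identical
        # gradients, so a value appearing more than r/2 times is the truth.
        votes = {}
        for g in group:
            key = g.tobytes()
            votes[key] = votes.get(key, 0) + 1
        majority_key = max(votes, key=votes.get)
        filtered.append(np.frombuffer(majority_key, dtype=group[0].dtype))
    # Step 2 (aggregation): any robust aggregator can be plugged in here.
    return np.median(np.stack(filtered), axis=0)
```

With one Byzantine worker per group of three, the majority vote discards the corrupted copies before aggregation ever sees them, which is what drives the robustness amplification the abstract claims.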
DeTox: Toxic Subspace Projection for Model Editing
Uppaal, Rheeya, Dey, Apratim, He, Yiting, Zhong, Yiqiao, Hu, Junjie
Recent alignment algorithms such as direct preference optimization (DPO) improve the safety of large language models (LLMs) by training them to match human behaviors exemplified by preference data. However, these methods are computationally intensive and lack controllability and transparency, making them prone to jailbreaking and inhibiting their widespread use. Furthermore, these tuning-based methods require large-scale preference data for training and are susceptible to noisy preference data. In this paper, we introduce a tuning-free alignment alternative (DeTox) and demonstrate its effectiveness on the use case of toxicity reduction. Grounded in theory from factor analysis, DeTox is a sample-efficient model editing approach that identifies a toxic subspace in the model parameter space and reduces model toxicity by projecting away the detected subspace. The toxic subspace is identified by extracting preference-data embeddings from the language model and removing non-toxic information from these embeddings. We show that DeTox is more sample-efficient than DPO and more robust to noisy data. Finally, we establish both theoretical and empirical connections between DeTox and DPO, showing that DeTox can be interpreted as a denoised version of a single DPO step.
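The subspace-identification idea in the abstract can be sketched numerically. Everything below is an illustrative assumption, not the paper's API: the function names, the use of plain SVD on paired embedding differences, and the single weight-matrix edit are a simplified stand-in for the method described.

```python
import numpy as np

def find_toxic_subspace(toxic_emb, nontoxic_emb, k_general=1, k_toxic=1):
    """Estimate a toxic subspace from paired preference-data embeddings.

    Rows of `toxic_emb` / `nontoxic_emb` are embeddings of toxic and
    non-toxic continuations of the same prompts. We difference the pairs,
    strip the leading directions of the non-toxic side (general language
    information, the "denoising" step), and take the leading singular
    directions of what remains as the toxic subspace.
    """
    diff = toxic_emb - nontoxic_emb
    # General-language directions, estimated from the non-toxic embeddings.
    _, _, Vg = np.linalg.svd(nontoxic_emb - nontoxic_emb.mean(axis=0),
                             full_matrices=False)
    G = Vg[:k_general]                 # (k_general, d)
    diff = diff - diff @ G.T @ G       # project out general information
    # Leading directions of the denoised differences span the toxic subspace.
    _, _, Vt = np.linalg.svd(diff, full_matrices=False)
    return Vt[:k_toxic]                # (k_toxic, d), orthonormal rows

def project_away(W, V):
    """Edit a weight matrix by removing the detected subspace: W (I - V^T V)."""
    return W - W @ V.T @ V
```

On synthetic data where toxic embeddings differ from their non-toxic pairs along a single known direction, the recovered subspace aligns with that direction, and the edited matrix annihilates it.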
AI chatbots reveal 10 unconventional - but achievable - New Year's resolutions for 2024... and the results are surprising
People turn to AI for career advice, meal ideas and travel tips. With the new year underway, research cited by DailyMail.com shows that 23 percent of Americans ditch their New Year's resolutions by the end of the first week, and 43 percent quit by the end of January. The failures stem from people 'thinking too big,' but Google's Bard, Microsoft's Bing and ChatGPT have curated 10 realistic but unconventional resolutions for 2024. These included hosting a monthly themed dinner party to build stronger bonds with others, eating a vegan meal each week to improve health and starting a digital detox. Google's Bard said: 'Host a themed dinner party every month.
How To Create A Fully Automated AI Based Trading System With Python
A couple of weeks ago I was casually chatting with a friend, masks on, social distance, the usual stuff. He was telling me how he was trying to, and I quote, detox from the broker app he was using. I asked him about the meaning of the word detox in this particular context, worrying that he might go broke, but nah: he told me that he was constantly trading. "If a particular stock has been going up for more than one hour or so and I'm already over the 1% profit threshold, then I sell", he said, "among other personal rules I've been following". Leaving aside the slightly pseudoscientific aspect of those rules, I understood what he meant by detox: following them implied checking the phone an astronomically high number of times.
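For the curious, my friend's rule is easy to encode, which is exactly why a machine should be checking it instead of his thumbs. A minimal sketch (the function name, the per-minute price list and the default thresholds are my own illustration, not any broker API):

```python
def should_sell(prices, entry_price, rising_minutes=60, profit_threshold=0.01):
    """The friend's rule: sell if the price has risen every minute for more
    than `rising_minutes` AND unrealized profit exceeds `profit_threshold`.

    `prices` is a list of per-minute prices, most recent last;
    `entry_price` is what we originally paid per share.
    """
    if len(prices) <= rising_minutes:
        return False  # not enough history to say it's been rising an hour
    recent = prices[-(rising_minutes + 1):]
    rising = all(later > earlier for earlier, later in zip(recent, recent[1:]))
    profit = (prices[-1] - entry_price) / entry_price
    return rising and profit > profit_threshold
```

Running this once a minute in a loop is all the "detox" requires: the rule fires on its own, and the phone stays in the pocket.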