AITopics | Sastry, Chandramouli Shama

Collaborating Authors

Sastry, Chandramouli Shama

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Accelerating Neural Network Training: An Analysis of the AlgoPerf Competition

Kasimbeg, Priya, Schneider, Frank, Eschenhagen, Runa, Bae, Juhan, Sastry, Chandramouli Shama, Saroufim, Mark, Feng, Boyuan, Wright, Less, Yang, Edward Z., Nado, Zachary, Medapati, Sourabh, Hennig, Philipp, Rabbat, Michael, Dahl, George E.

arXiv.org Machine LearningFeb-20-2025

The goal of the AlgoPerf: Training Algorithms competition is to evaluate practical speed-ups in neural network training achieved solely by improving the underlying training algorithms. In the external tuning ruleset, submissions must provide workload-agnostic hyperparameter search spaces, while in the self-tuning ruleset they must be completely hyperparameter-free. In both rulesets, submissions are compared on time-to-result across multiple deep learning workloads, training on fixed hardware. This paper presents the inaugural AlgoPerf competition's results, which drew 18 diverse submissions from 10 teams. Our investigation reveals several key findings: (1) The winning submission in the external tuning ruleset, using Distributed Shampoo, demonstrates the effectiveness of non-diagonal preconditioning over popular methods like Adam, even when compared on wall-clock runtime. (2) The winning submission in the self-tuning ruleset, based on the Schedule Free AdamW algorithm, demonstrates a new level of effectiveness for completely hyperparameter-free training algorithms. (3) The top-scoring submissions were surprisingly robust to workload changes. We also discuss the engineering challenges encountered in ensuring a fair comparison between different training algorithms. These results highlight both the significant progress so far, and the considerable room for further improvements.

artificial intelligence, machine learning, submission, (17 more...)

arXiv.org Machine Learning

2502.15015

Country:

Europe > Germany > Baden-Württemberg (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Test-Time Training for Depression Detection

Dumpala, Sri Harsha, Sastry, Chandramouli Shama, Uher, Rudolf, Oore, Sageev

arXiv.org Artificial IntelligenceApr-7-2024

Previous works on depression detection use datasets collected in similar environments to train and test the models. In practice, however, the train and test distributions cannot be guaranteed to be identical. Distribution shifts can be introduced due to variations such as recording environment (e.g., background noise) and demographics (e.g., gender, age, etc). Such distributional shifts can surprisingly lead to severe performance degradation of the depression detection models. In this paper, we analyze the application of test-time training (TTT) to improve robustness of models trained for depression detection. When compared to regular testing of the models, we find TTT can significantly improve the robustness of the model under a variety of distributional shifts introduced due to: (a) background-noise, (b) gender-bias, and (c) data collection and curation procedure (i.e., train and test samples are from separate datasets).

artificial intelligence, depression detection, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2404.05071

Country:

North America > Canada (0.14)
Europe (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Benchmarking Neural Network Training Algorithms

Dahl, George E., Schneider, Frank, Nado, Zachary, Agarwal, Naman, Sastry, Chandramouli Shama, Hennig, Philipp, Medapati, Sourabh, Eschenhagen, Runa, Kasimbeg, Priya, Suo, Daniel, Bae, Juhan, Gilmer, Justin, Peirson, Abel L., Khan, Bilal, Anil, Rohan, Rabbat, Mike, Krishnan, Shankar, Snider, Daniel, Amid, Ehsan, Chen, Kongtao, Maddison, Chris J., Vasudev, Rakshith, Badura, Michal, Garg, Ankush, Mattson, Peter

arXiv.org Artificial IntelligenceJun-12-2023

Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate schedules, or data selection schemes) could save time, save computational resources, and lead to better, more accurate, models. Unfortunately, as a community, we are currently unable to reliably identify training algorithm improvements, or even determine the state-of-the-art training algorithm. In this work, using concrete experiments, we argue that real progress in speeding up training requires new benchmarks that resolve three basic challenges faced by empirical comparisons of training algorithms: (1) how to decide when training is complete and precisely measure training time, (2) how to handle the sensitivity of measurements to exact workload details, and (3) how to fairly compare algorithms that require hyperparameter tuning. In order to address these challenges, we introduce a new, competitive, time-to-result benchmark using multiple workloads running on fixed hardware, the AlgoPerf: Training Algorithms benchmark. Our benchmark includes a set of workload variants that make it possible to detect benchmark submissions that are more robust to workload changes than current widely-used methods. Finally, we evaluate baseline submissions constructed using various optimizers that represent current practice, as well as other optimizers that have recently received attention in the literature. These baseline results collectively demonstrate the feasibility of our benchmark, show that non-trivial gaps between methods exist, and set a provisional state-of-the-art for future benchmark submissions to try and surpass.

artificial intelligence, machine learning, training algorithm benchmark, (20 more...)

arXiv.org Artificial Intelligence

2306.07179

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Education (0.92)
Information Technology (0.92)
Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback