Tantia, Vinayak
Digital Twin Ecosystem for Oncology Clinical Operations
Pandey, Himanshu, Amod, Akhil, Shivang, Jaggi, Kshitij, Garg, Ruchi, Jain, Abheet, Tantia, Vinayak
Artificial Intelligence (AI) and Large Language Models (LLMs) hold significant promise in revolutionizing healthcare, especially in clinical applications. Simultaneously, Digital Twin technology, which models and simulates complex systems, has gained traction in enhancing patient care. However, despite the advances in experimental clinical settings, the potential of AI and digital twins to streamline clinical operations remains largely untapped. This paper introduces a novel digital twin framework specifically designed to enhance oncology clinical operations. We propose the integration of multiple specialized digital twins, such as the Medical Necessity Twin, Care Navigator Twin, and Clinical History Twin, to enhance workflow efficiency and personalize care for each patient based on their unique data. Furthermore, by synthesizing multiple data sources and aligning them with the National Comprehensive Cancer Network (NCCN) guidelines, we create a dynamic Cancer Care Path, a continuously evolving knowledge base that enables these digital twins to provide precise, tailored clinical recommendations.
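As a purely illustrative sketch of how such an ecosystem might be wired together (not code from the paper; every class, field, and twin interface below is hypothetical), specialized twins could be composed behind a single care-path object that aggregates their per-patient recommendations:

from dataclasses import dataclass, field

@dataclass
class TwinRecommendation:
    twin_name: str
    recommendation: str

@dataclass
class SpecializedTwin:
    name: str
    def recommend(self, patient_record: dict) -> TwinRecommendation:
        # A real twin would reason over the patient record and NCCN-aligned
        # guidance; this placeholder only shows the interface.
        return TwinRecommendation(self.name, f"review inputs for {self.name}")

@dataclass
class CancerCarePath:
    twins: list = field(default_factory=list)
    def plan(self, patient_record: dict) -> list:
        # Aggregate each twin's recommendation into one evolving care path.
        return [twin.recommend(patient_record) for twin in self.twins]

path = CancerCarePath(twins=[SpecializedTwin("medical_necessity"),
                             SpecializedTwin("care_navigator"),
                             SpecializedTwin("clinical_history")])
print(path.plan({"diagnosis": "example"}))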
SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum
Wang, Jianyu, Tantia, Vinayak, Ballas, Nicolas, Rabbat, Michael
Distributed optimization is essential for training large models on large datasets. Multiple approaches have been proposed to reduce the communication overhead in distributed training, such as synchronizing only after performing multiple local SGD steps, and decentralized methods (e.g., using gossip algorithms) to decouple communication among workers. Although these methods run faster than AllReduce-based methods, which use blocking communication before every update, the resulting models may be less accurate after the same number of updates. Inspired by the BMUF method of Chen & Huo (2016), we propose a slow momentum (SlowMo) framework, where workers periodically synchronize and perform a momentum update after multiple iterations of a base optimization algorithm. Experiments on image classification and machine translation tasks demonstrate that SlowMo consistently yields improvements in optimization and generalization performance relative to the base optimizer, even when the additional overhead is amortized over many updates so that the SlowMo runtime is on par with that of the base optimizer. We provide theoretical convergence guarantees showing that SlowMo converges to a stationary point of smooth non-convex losses. Since BMUF is a particular instance of the SlowMo framework, our results also correspond to the first theoretical convergence guarantees for BMUF.

Distributed optimization (Chen et al., 2016; Goyal et al., 2017) is essential for training large models on large datasets (Radford et al., 2019; Liu et al., 2019; Mahajan et al., 2018b). Currently, the most widely used approaches have workers compute small mini-batch gradients locally, in parallel, and then aggregate these using a blocking communication primitive, AllReduce, before taking an optimizer step. Communication overhead is a major issue limiting the scaling of this approach, since AllReduce must complete before every step and blocking communications are sensitive to stragglers (Dutta et al., 2018; Ferdinand et al., 2019). Multiple complementary approaches have recently been investigated to reduce or hide communication overhead. Decentralized training (Jiang et al., 2017; Lian et al., 2017; 2018; Assran et al., 2019) reduces idling due to blocking and stragglers by employing approximate gradient aggregation (e.g., via gossip or distributed averaging). Approaches such as Local SGD reduce the frequency of communication by having workers perform multiple updates between each round of communication (McDonald et al., 2010; McMahan et al., 2017; Zhou & Cong, 2018; Stich, 2019; Yu et al., 2019b). It is also possible to combine decentralized algorithms with Local SGD (Wang & Joshi, 2018).
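To make the outer loop concrete, here is a minimal single-process sketch of the slow-momentum idea on a toy quadratic problem, assuming an outer update of the form u <- beta*u + (x_prev - x_avg)/lr followed by x <- x_prev - alpha*lr*u; the objective, hyperparameters, and exact averaging (standing in for AllReduce or gossip) are illustrative stand-ins, not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)
dim, n_workers, tau = 10, 4, 5           # tau = local base-optimizer steps between syncs
lr, alpha, beta = 0.01, 1.0, 0.9         # base learning rate, slow learning rate, slow momentum

# Each worker gets its own quadratic objective 0.5 * ||A_w x - b_w||^2.
A = [rng.standard_normal((dim, dim)) for _ in range(n_workers)]
b = [rng.standard_normal(dim) for _ in range(n_workers)]

def grad(w, x):
    return A[w].T @ (A[w] @ x - b[w])

x = np.zeros(dim)                        # synchronized parameters
u = np.zeros(dim)                        # slow momentum buffer

for outer_step in range(50):
    x_prev = x.copy()
    local_x = [x.copy() for _ in range(n_workers)]
    for _ in range(tau):                 # base optimizer (plain SGD) runs locally on each worker
        for w in range(n_workers):
            local_x[w] -= lr * grad(w, local_x[w])
    x_avg = np.mean(local_x, axis=0)     # exact average, standing in for AllReduce/gossip
    u = beta * u + (x_prev - x_avg) / lr # slow momentum update
    x = x_prev - alpha * lr * u          # outer (slow) step

avg_loss = np.mean([0.5 * np.sum((A[w] @ x - b[w]) ** 2) for w in range(n_workers)])
print(f"average loss after training: {avg_loss:.4f}")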
A Modern Take on the Bias-Variance Tradeoff in Neural Networks
Neal, Brady, Mittal, Sarthak, Baratin, Aristide, Tantia, Vinayak, Scicluna, Matthew, Lacoste-Julien, Simon, Mitliagkas, Ioannis
We revisit the bias-variance tradeoff for neural networks in light of modern empirical findings. The traditional bias-variance tradeoff in machine learning suggests that as model complexity grows, variance increases. Classical bounds in statistical learning theory point to the number of parameters in a model as a measure of model complexity, which means the tradeoff would indicate that variance increases with the size of neural networks. However, we empirically find that variance due to training set sampling is roughly constant (with both width and depth) in practice. Variance caused by the non-convexity of the loss landscape behaves differently: in our setting, it decreases with width and increases with depth. We provide theoretical analysis, in a simplified setting inspired by linear models, that is consistent with our empirical findings for width. We view bias-variance as a useful lens through which to study generalization and encourage further theoretical explanation from this perspective.

The traditional view in machine learning is that increasingly complex models achieve lower bias at the expense of higher variance. This balance between underfitting (high bias) and overfitting (high variance) is commonly known as the bias-variance tradeoff. In their landmark work that initially highlighted this bias-variance dilemma in machine learning, Geman et al. (1992) suggest that larger neural networks suffer from higher variance.
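The two variance estimates discussed above can be approximated by holding one source of randomness fixed while resampling the other: resample the training set with a fixed seed to estimate variance from data sampling, and fix the data while varying the seed to estimate variance from initialization and non-convex optimization. The sketch below illustrates that protocol on synthetic data; the model, dataset, and run counts are stand-ins, not the paper's experimental setup.

import numpy as np
from sklearn.neural_network import MLPRegressor

def make_data(n, seed):
    # Noisy 1-D regression problem; each seed yields a fresh training sample.
    r = np.random.default_rng(seed)
    X = r.uniform(-1, 1, size=(n, 1))
    y = np.sin(3 * X[:, 0]) + 0.1 * r.standard_normal(n)
    return X, y

X_test = np.linspace(-1, 1, 200).reshape(-1, 1)

def predictions(data_seed, init_seed, width=64):
    X, y = make_data(200, data_seed)
    net = MLPRegressor(hidden_layer_sizes=(width,), max_iter=2000,
                       random_state=init_seed).fit(X, y)
    return net.predict(X_test)

# Variance from training-set sampling: new data each run, fixed initialization seed.
preds_data = np.stack([predictions(data_seed=s, init_seed=0) for s in range(10)])
var_sampling = preds_data.var(axis=0).mean()

# Variance from initialization/optimization: fixed data, new seed each run.
preds_init = np.stack([predictions(data_seed=0, init_seed=s) for s in range(10)])
var_optim = preds_init.var(axis=0).mean()

print(f"variance (data sampling):  {var_sampling:.4f}")
print(f"variance (initialization): {var_optim:.4f}")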