Gupta, Ritwik
Enough Coin Flips Can Make LLMs Act Bayesian
Gupta, Ritwik, Corona, Rodolfo, Ge, Jiaxin, Wang, Eric, Klein, Dan, Darrell, Trevor, Chan, David M.
Large language models (LLMs) exhibit the ability to generalize given few-shot examples in their input prompt, an emergent capability known as in-context learning (ICL). We investigate whether LLMs utilize ICL to perform structured reasoning in ways that are consistent with a Bayesian framework or rely on pattern matching. Using a controlled setting of biased coin flips, we find that: (1) LLMs often possess biased priors, causing initial divergence in zero-shot settings, (2) in-context evidence outweighs explicit bias instructions, (3) LLMs broadly follow Bayesian posterior updates, with deviations primarily due to miscalibrated priors rather than flawed updates, and (4) attention magnitude has negligible effect on Bayesian inference. With sufficient demonstrations of biased coin flips via ICL, LLMs update their priors in a Bayesian manner.
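The Bayesian posterior updating that this paper probes can be illustrated with a minimal sketch. This is not the paper's code; the conjugate Beta-Bernoulli model and the specific prior and bias values are illustrative assumptions.

```python
def update_beta(alpha, beta, flips):
    """Conjugate Beta-Bernoulli update: each heads (1) increments alpha,
    each tails (0) increments beta."""
    heads = sum(flips)
    tails = len(flips) - heads
    return alpha + heads, beta + tails

def posterior_mean(alpha, beta):
    """Posterior mean of the heads probability under Beta(alpha, beta)."""
    return alpha / (alpha + beta)

# A miscalibrated prior that believes the coin is fair: Beta(10, 10).
alpha, beta = 10.0, 10.0

# In-context evidence: 30 flips from a coin biased 80% toward heads.
flips = [1] * 24 + [0] * 6

alpha, beta = update_beta(alpha, beta, flips)
print(posterior_mean(alpha, beta))  # -> 0.68, pulled from 0.5 toward 0.8
```

With enough demonstrations, the evidence term dominates the prior, which mirrors the paper's finding that in-context evidence outweighs both miscalibrated priors and explicit bias instructions.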
Whack-a-Chip: The Futility of Hardware-Centric Export Controls
Gupta, Ritwik, Walker, Leah, Reddie, Andrew W.
U.S. export controls on semiconductors are widely known to be permeable, with the People's Republic of China (PRC) steadily creating state-of-the-art artificial intelligence (AI) models with exfiltrated chips. This paper presents the first concrete, public evidence of how leading PRC AI labs evade and circumvent U.S. export controls. We examine how Chinese companies, notably Tencent, are not only using chips that are restricted under U.S. export controls but are also finding ways to circumvent these regulations through software and modeling techniques that maximize the performance of less capable hardware. Specifically, we argue that Tencent's ability to power its Hunyuan-Large model with non-export-controlled NVIDIA H20s exemplifies broader gains in machine learning efficiency that have eroded the moat the United States initially built via its existing export controls. Finally, we examine the implications of this finding for the future of the United States' export control strategy.

Data-Centric AI Governance: Addressing the Limitations of Model-Focused Policies
Gupta, Ritwik, Walker, Leah, Corona, Rodolfo, Fu, Stephanie, Petryk, Suzanne, Napolitano, Janet, Darrell, Trevor, Reddie, Andrew W.
Current regulations on powerful AI capabilities are narrowly focused on "foundation" or "frontier" models. However, these terms are vague and inconsistently defined, leading to an unstable foundation for governance efforts. Critically, policy debates often fail to consider the data used with these models, despite the clear link between data and model performance. Even (relatively) "small" models that fall outside the typical definitions of foundation and frontier models can achieve equivalent outcomes when exposed to sufficiently specific datasets. In this work, we illustrate the importance of considering dataset size and content as essential factors in assessing the risks posed by models both today and in the future. More broadly, we emphasize the risk posed by over-regulating reactively and provide a path towards careful, quantitative evaluation of capabilities that can lead to a simplified regulatory environment.
xT: Nested Tokenization for Larger Context in Large Images
Gupta, Ritwik, Li, Shufan, Zhu, Tyler, Malik, Jitendra, Darrell, Trevor, Mangalam, Karttikeya
Modern computer vision pipelines handle large images in one of two sub-optimal ways: down-sampling or cropping. Both methods incur significant losses in the amount of information and context present in an image. There are many downstream applications in which global context matters as much as high-frequency details, such as in real-world satellite imagery; in such cases, researchers have to make the uncomfortable choice of which information to discard. We introduce xT, a simple framework for vision transformers which effectively aggregates global context with local details and can model large images end-to-end on contemporary GPUs. We select a set of benchmark datasets across classic vision tasks which accurately reflect a vision model's ability to understand truly large images and incorporate fine details over large scales, and assess our method's improvement on them. By introducing a nested tokenization scheme for large images, in conjunction with long-sequence models normally used for natural language processing, we are able to increase accuracy by up to 8.6% on challenging classification tasks and $F_1$ score by 11.6 on context-dependent segmentation of large images.
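The two-level idea behind nested tokenization can be sketched roughly as follows: split a large image into regions, tokenize each region into patches, and flatten the result into one long token sequence. This is a rough illustration of the concept under assumed sizes, not the xT implementation; the function name and parameters are made up for the sketch.

```python
import numpy as np

def nested_tokenize(image, region_size, patch_size):
    """Illustrative two-level tokenization: split the image into regions,
    then split each region into patches, yielding one long token sequence.
    (A sketch of the idea, not the actual xT code.)"""
    h, w, c = image.shape
    tokens = []
    for ry in range(0, h, region_size):
        for rx in range(0, w, region_size):
            region = image[ry:ry + region_size, rx:rx + region_size]
            for py in range(0, region_size, patch_size):
                for px in range(0, region_size, patch_size):
                    patch = region[py:py + patch_size, px:px + patch_size]
                    tokens.append(patch.reshape(-1))  # flatten patch to a token vector
    return np.stack(tokens)

# A 512x512 RGB image with 256x256 regions and 16x16 patches yields
# (512/256)^2 regions * (256/16)^2 patches = 4 * 256 = 1024 tokens,
# each of dimension 16 * 16 * 3 = 768.
img = np.zeros((512, 512, 3), dtype=np.float32)
seq = nested_tokenize(img, region_size=256, patch_size=16)
print(seq.shape)  # -> (1024, 768)
```

The resulting long sequence is what motivates pairing the scheme with long-sequence models from NLP: even a modestly sized satellite image produces far more tokens than a standard vision transformer's context window accommodates.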
ClimSim: A large multi-scale dataset for hybrid physics-ML climate emulation
Yu, Sungduk, Hannah, Walter, Peng, Liran, Lin, Jerry, Bhouri, Mohamed Aziz, Gupta, Ritwik, Lütjens, Björn, Will, Justus Christopher, Behrens, Gunnar, Busecke, Julius, Loose, Nora, Stern, Charles I, Beucler, Tom, Harrop, Bryce, Hillman, Benjamin R, Jenney, Andrea, Ferretti, Savannah, Liu, Nana, Anandkumar, Anima, Brenowitz, Noah D, Eyring, Veronika, Geneva, Nicholas, Gentine, Pierre, Mandt, Stephan, Pathak, Jaideep, Subramaniam, Akshay, Vondrick, Carl, Yu, Rose, Zanna, Laure, Zheng, Tian, Abernathey, Ryan, Ahmed, Fiaz, Bader, David C, Baldi, Pierre, Barnes, Elizabeth, Bretherton, Christopher, Caldwell, Peter, Chuang, Wayne, Han, Yilun, Huang, Yu, Iglesias-Suarez, Fernando, Jantre, Sanket, Kashinath, Karthik, Khairoutdinov, Marat, Kurth, Thorsten, Lutsko, Nicholas, Ma, Po-Lun, Mooers, Griffin, Neelin, J. David, Randall, David, Shamekh, Sara, Taylor, Mark A, Urban, Nathan, Yuval, Janni, Zhang, Guang, Pritchard, Michael
Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher-fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of a lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state. The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and to provide reference scores.