AITopics | Vemprala, Sai

Vemprala, Sai

MatMamba: A Matryoshka State Space Model

Shukla, Abhinav, Vemprala, Sai, Kusupati, Aditya, Kapoor, Ashish

arXiv.org Artificial Intelligence, Oct-9-2024

State Space Models (SSMs) like Mamba2 are a promising alternative to Transformers, with faster theoretical training and inference times, especially for long context lengths. Recent work on Matryoshka Representation Learning, and its application to Transformer backbones in works like MatFormer, showed how to introduce nested granularities of smaller submodels in one universal elastic model. In this work, we present MatMamba: a state space model which combines Matryoshka-style learning with Mamba2, by modifying the block to contain nested dimensions to enable joint training and adaptive inference. MatMamba allows for efficient and adaptive deployment across various model sizes. We train a single large MatMamba model and are able to get a number of smaller nested models for free, while maintaining or improving upon the performance of a baseline smaller model trained from scratch. We train language and image models at a variety of parameter sizes from 35M to 1.4B. Our results on ImageNet and FineWeb show that MatMamba models scale comparably to Transformers, while having more efficient inference characteristics. This makes MatMamba a practically viable option for deploying large-scale models in an elastic way based on the available inference compute. Code and models are open sourced at https://github.com/ScaledFoundations/MatMamba.
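To make the idea of nested dimensions concrete, here is a minimal sketch of Matryoshka-style nesting on a toy two-layer network. It is not the MatMamba block or the authors' code. The function names (`init_mlp`, `forward`, `joint_loss`), the ReLU network, and the squared-error loss are all assumptions chosen for illustration. The point it shows is that a smaller submodel is just a prefix slice of the larger model's hidden units, and that a joint loss can average over several widths at once.

```python
import random


def init_mlp(d_in, d_hidden, d_out, seed=0):
    """Create random weights for a toy two-layer network (illustrative only)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(d_hidden)] for _ in range(d_out)]
    return w1, w2


def forward(w1, w2, x, width):
    """Run the network using only the first `width` hidden units.

    Every smaller submodel shares its parameters with the full model:
    it is the prefix slice w1[:width] and the matching columns of w2.
    """
    if not 1 <= width <= len(w1):
        raise ValueError("width must be between 1 and the full hidden size")
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1[:width]]
    return [sum(w * h for w, h in zip(row[:width], hidden)) for row in w2]


def joint_loss(w1, w2, x, target, widths):
    """Average the squared error over all nested widths (joint training objective)."""
    total = 0.0
    for m in widths:
        y = forward(w1, w2, x, m)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, target))
    return total / len(widths)


if __name__ == "__main__":
    w1, w2 = init_mlp(d_in=4, d_hidden=8, d_out=2)
    x = [0.5, -0.2, 0.1, 0.9]
    for m in (2, 4, 8):
        print(m, forward(w1, w2, x, m))
    print("joint loss:", joint_loss(w1, w2, x, [0.0, 1.0], widths=(2, 4, 8)))
```

In this sketch, choosing `width` at inference time plays the role of picking a nested submodel to fit the available compute. Extracting a standalone small model amounts to copying the sliced weights.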

arxiv preprint arxiv, large language model, machine learning, (18 more...)

arXiv:2410.06718

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Is Imitation All You Need? Generalized Decision-Making with Dual-Phase Training

Wei, Yao, Sun, Yanchao, Zheng, Ruijie, Vemprala, Sai, Bonatti, Rogerio, Chen, Shuhang, Madaan, Ratnesh, Ba, Zhongjie, Kapoor, Ashish, Ma, Shuang

arXiv.org Artificial Intelligence, Oct-9-2023

We introduce DualMind, a generalist agent designed to tackle various decision-making tasks that addresses challenges posed by current methods, such as overfitting behaviors and dependence on task-specific fine-tuning. DualMind uses a novel "Dual-phase" training strategy that emulates how humans learn to act in the world. The model first learns fundamental common knowledge through a self-supervised objective tailored for control tasks and then learns how to make decisions based on different contexts through imitating behaviors conditioned on given prompts. DualMind can handle tasks across domains, scenes, and embodiments using just a single set of model weights and can execute zero-shot prompting without requiring task-specific fine-tuning. We evaluate DualMind on MetaWorld and Habitat through extensive experiments and demonstrate its superior generalizability compared to previous techniques, outperforming other generalist agents by over 50% and 70% on Habitat and MetaWorld, respectively. On the 45 tasks in MetaWorld, DualMind achieves over 30 tasks at a 90% success rate.

large language model, machine learning, natural language, (20 more...)

arXiv:2307.07909

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Maryland (0.14)
Asia > Middle East > Israel (0.14)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

GRID: A Platform for General Robot Intelligence Development

Vemprala, Sai, Chen, Shuhang, Shukla, Abhinav, Narayanan, Dinesh, Kapoor, Ashish

arXiv.org Artificial Intelligence, Oct-7-2023

Developing machine intelligence abilities in robots and autonomous systems is an expensive and time-consuming process. Existing solutions are tailored to specific applications and are hard to generalize. Furthermore, scarcity of training data adds a layer of complexity in deploying deep machine learning models. We present a new platform for General Robot Intelligence Development (GRID) to address both of these issues. The platform enables robots to learn, compose and adapt skills to their physical capabilities, environmental constraints and goals. The platform addresses AI problems in robotics via foundation models that know the physical world. GRID is designed from the ground up to be extensible to accommodate new types of robots, vehicles, hardware platforms and software protocols. In addition, the modular design enables various deep ML components and existing foundation models to be easily usable in a wider variety of robot-centric problems. We demonstrate the platform in various aerial robotics scenarios and show how it dramatically accelerates development of machine intelligent robots.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv:2310.00887

Country: Africa > Rwanda (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology (1.00)
Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ChatGPT for Robotics: Design Principles and Model Abilities

Vemprala, Sai, Bonatti, Rogerio, Bucker, Arthur, Kapoor, Ashish

arXiv.org Artificial Intelligence, Jul-19-2023

The rapid advancement in natural language processing (NLP) has led to the development of large language models (LLMs), such as BERT [2], GPT-3 [3], and Codex [4], that are revolutionizing a wide range of applications. These models have achieved remarkable results in various tasks such as text generation, machine translation, and code synthesis, among others. A recent addition to this collection of models was the OpenAI ChatGPT [1], a pretrained generative text model which was finetuned using human feedback. Unlike previous models which operate mostly upon a single prompt, ChatGPT provides particularly impressive interaction skills through dialog, combining text generation with code synthesis. Our goal in this paper is to investigate if and how the abilities of ChatGPT can generalize to the domain of robotics. Robotics systems, unlike text-only applications, require a deep understanding of real-world physics, environmental context, and the ability to perform physical actions. A generative robotics model needs to have robust commonsense knowledge, a sophisticated world model, and the ability to interact with users to interpret and execute commands in ways that are physically possible and that make sense in the real world. These challenges fall beyond the original scope of language models, as they must not only understand the meaning of a given text, but also translate the intent into a logical sequence of physical actions. In recent years there have been different attempts to incorporate language into robotics systems.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv:2306.17582

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ConBaT: Control Barrier Transformer for Safe Policy Learning

Meng, Yue, Vemprala, Sai, Bonatti, Rogerio, Fan, Chuchu, Kapoor, Ashish

arXiv.org Artificial Intelligence, Mar-7-2023

Large-scale self-supervised models have recently revolutionized our ability to perform a variety of tasks within the vision and language domains. However, using such models for autonomous systems is challenging because of safety requirements: besides executing correct actions, an autonomous agent must also avoid high-cost and potentially fatal critical mistakes. Traditionally, self-supervised training mainly focuses on imitating previously observed behaviors, and the training demonstrations carry no notion of which behaviors should be explicitly avoided. In this work, we propose Control Barrier Transformer (ConBaT), an approach that learns safe behaviors from demonstrations in a self-supervised fashion. ConBaT is inspired by the concept of control barrier functions in control theory and uses a causal transformer that learns to predict safe robot actions autoregressively using a critic that requires minimal safety data labeling. During deployment, we employ a lightweight online optimization to find actions that ensure future states lie within the learned safe set. We apply our approach to different simulated control tasks and show that our method results in safer control policies compared to other classical and learning-based methods such as imitation learning, reinforcement learning, and model predictive control.
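The control barrier function idea that the abstract refers to can be illustrated with a small sketch. This is not ConBaT itself: the paper learns its safe set with a critic and uses a transformer policy, whereas the sketch below assumes a hand-written barrier `h(x) = 1 - x^2` on a one-dimensional state, known dynamics `x_next = x + dt * u`, and a brute-force search over candidate actions in place of the online optimization. All names and constants are hypothetical. It uses the common discrete-time condition `h(x_next) >= (1 - alpha) * h(x)`, which keeps the state inside the set where `h >= 0`.

```python
def barrier(x):
    """Hand-written barrier: non-negative exactly when |x| <= 1 (the assumed safe set)."""
    return 1.0 - x * x


def step(x, u, dt=0.1):
    """Assumed toy dynamics: a single integrator."""
    return x + dt * u


def safe_action(x, u_nominal, alpha=0.5, dt=0.1, n_candidates=41):
    """Return the candidate action closest to u_nominal that satisfies the
    discrete-time barrier condition, or None if no candidate does."""
    # Evenly spaced candidate actions in [-1, 1].
    candidates = [-1.0 + 2.0 * i / (n_candidates - 1) for i in range(n_candidates)]
    feasible = [
        u for u in candidates
        if barrier(step(x, u, dt)) >= (1.0 - alpha) * barrier(x)
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda u: abs(u - u_nominal))


if __name__ == "__main__":
    x = 0.0
    for _ in range(100):
        u = safe_action(x, u_nominal=1.0)  # the nominal policy always pushes right
        x = step(x, u)
    print("final state:", x, "barrier:", barrier(x))
```

Far from the boundary the filter passes the nominal action through unchanged. Near the boundary it scales the action back so the barrier value cannot shrink faster than the factor `1 - alpha` per step.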

artificial intelligence, machine learning, trajectory, (19 more...)

arXiv:2303.04212

Genre: Research Report (0.50)

Industry:

Transportation (0.93)
Leisure & Entertainment > Sports > Motorsports (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning

McDuff, Daniel, Song, Yale, Lee, Jiyoung, Vineet, Vibhav, Vemprala, Sai, Gyde, Nicholas, Salman, Hadi, Ma, Shuang, Sohn, Kwanghoon, Kapoor, Ashish

arXiv.org Artificial Intelligence, Jun-24-2021

The ability to perform causal and counterfactual reasoning is a central property of human intelligence. Decision-making systems that can perform these types of reasoning have the potential to be more generalizable and interpretable. Simulations have helped advance the state-of-the-art in this domain, by providing the ability to systematically vary parameters (e.g., confounders) and generate examples of the outcomes in the case of counterfactual scenarios. However, simulating complex temporal causal events in multi-agent scenarios, such as those that exist in driving and vehicle navigation, is challenging. To help address this, we present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning in the safety-critical context. A core component of our work is to introduce agency, such that it is simple to define and create complex scenarios using high-level definitions. The vehicles then operate with agency to complete these objectives, meaning low-level behaviors need only be controlled if necessary. We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment. Finally, we highlight challenges and opportunities for future work.

artificial intelligence, causal discovery and reasoning, complex simulation, (1 more...)

arXiv:2106.13364

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence (0.53)

3DB: A Framework for Debugging Computer Vision Models

Leclerc, Guillaume, Salman, Hadi, Ilyas, Andrew, Vemprala, Sai, Engstrom, Logan, Vineet, Vibhav, Xiao, Kai, Zhang, Pengchuan, Santurkar, Shibani, Yang, Greg, Kapoor, Ashish, Madry, Aleksander

arXiv.org Machine Learning, Jun-7-2021

We introduce 3DB: an extendable, unified framework for testing and debugging vision models using photorealistic simulation. We demonstrate, through a wide range of use cases, that 3DB allows users to discover vulnerabilities in computer vision systems and gain insights into how models make decisions. 3DB captures and generalizes many robustness analyses from prior work, and enables one to study their interplay. Finally, we find that the insights generated by the system transfer to the physical world. We are releasing 3DB as a library (https://github.com/3db/3db) alongside a set of example analyses, guides, and documentation: https://3db.github.io/3db/.

deep learning, neural network, us government, (19 more...)

arXiv:2106.03805

Country: North America > United States (0.94)

Genre: Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment (0.94)
Government > Military (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
