AITopics | Bhambri, Suvaansh

Collaborating Authors

Bhambri, Suvaansh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Multi-Level Compositional Reasoning for Interactive Instruction Following

Bhambri, Suvaansh, Kim, Byeonghwi, Choi, Jonghyun

arXiv.org Artificial IntelligenceAug-18-2023

Robotic agents performing domestic chores by natural language directives are required to master the complex job of navigating environment and interacting with objects in the environments. The tasks given to the agents are often composite thus are challenging as completing them require to reason about multiple subtasks, e.g., bring a cup of coffee. To address the challenge, we propose to divide and conquer it by breaking the task into multiple subgoals and attend to them individually for better navigation and interaction. We call it Multi-level Compositional Reasoning Agent (MCR-Agent). Specifically, we learn a three-level action policy. At the highest level, we infer a sequence of human-interpretable subgoals to be executed based on language instructions by a high-level policy composition controller. At the middle level, we discriminatively control the agent's navigation by a master policy by alternating between a navigation policy and various independent interaction policies. Finally, at the lowest level, we infer manipulation actions with the corresponding object masks using the appropriate interaction policy. Our approach not only generates human interpretable subgoals but also achieves 2.03% absolute gain to comparable state of the arts in the efficiency metric (PLWSR in unseen set) without using rule-based planning or a semantic spatial memory.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2308.09387

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Subsidiary Prototype Alignment for Universal Domain Adaptation

Kundu, Jogendra Nath, Bhambri, Suvaansh, Kulkarni, Akshay, Sarkar, Hiran, Jampani, Varun, Babu, R. Venkatesh

arXiv.org Artificial IntelligenceOct-28-2022

Universal Domain Adaptation (UniDA) deals with the problem of knowledge transfer between two datasets with domain-shift as well as category-shift. The goal is to categorize unlabeled target samples, either into one of the "known" categories or into a single "unknown" category. A major problem in UniDA is negative transfer, i.e. misalignment of "known" and "unknown" classes. To this end, we first uncover an intriguing tradeoff between negative-transfer-risk and domain-invariance exhibited at different layers of a deep network. It turns out we can strike a balance between these two metrics at a mid-level layer. Towards designing an effective framework based on this insight, we draw motivation from Bag-of-visual-Words (BoW). Word-prototypes in a BoW-like representation of a mid-level layer would represent lower-level visual primitives that are likely to be unaffected by the category-shift in the high-level features. We develop modifications that encourage learning of word-prototypes followed by word-histogram based classification. Following this, subsidiary prototype-space alignment (SPA) can be seen as a closed-set alignment problem, thereby avoiding negative transfer. We realize this with a novel word-histogram-related pretext task to enable closed-set SPA, operating in conjunction with goal task UniDA. We demonstrate the efficacy of our approach on top of existing UniDA techniques, yielding state-of-the-art performance across three standard UniDA and Open-Set DA object recognition benchmarks.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2210.15909

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Concurrent Subsidiary Supervision for Unsupervised Source-Free Domain Adaptation

Kundu, Jogendra Nath, Bhambri, Suvaansh, Kulkarni, Akshay, Sarkar, Hiran, Jampani, Varun, Babu, R. Venkatesh

arXiv.org Artificial IntelligenceJul-26-2022

The prime challenge in unsupervised domain adaptation (DA) is to mitigate the domain shift between the source and target domains. Prior DA works show that pretext tasks could be used to mitigate this domain shift by learning domain invariant representations. However, in practice, we find that most existing pretext tasks are ineffective against other established techniques. Thus, we theoretically analyze how and when a subsidiary pretext task could be leveraged to assist the goal task of a given DA problem and develop objective subsidiary task suitability criteria. Based on this criteria, we devise a novel process of sticker intervention and cast sticker classification as a supervised subsidiary DA problem concurrent to the goal task unsupervised DA. Our approach not only improves goal task adaptation performance, but also facilitates privacy-oriented source-free DA i.e. without concurrent source-target access. Experiments on the standard Office-31, Office-Home, DomainNet, and VisDA benchmarks demonstrate our superiority for both single-source and multi-source source-free DA. Our approach also complements existing non-source-free works, achieving leading performance.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2207.13247

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Sensing and Signal Processing > Image Processing (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

MOCA: A Modular Object-Centric Approach for Interactive Instruction Following

Singh, Kunal Pratap, Bhambri, Suvaansh, Kim, Byeonghwi, Mottaghi, Roozbeh, Choi, Jonghyun

arXiv.org Artificial IntelligenceDec-6-2020

Performing simple household tasks based on language directives is very natural to humans, yet it remains an open challenge for an AI agent. Recently, an `interactive instruction following' task has been proposed to foster research in reasoning over long instruction sequences that requires object interactions in a simulated environment. It involves solving open problems in vision, language and navigation literature at each step. To address this multifaceted problem, we propose a modular architecture that decouples the task into visual perception and action policy, and name it as MOCA, a Modular Object-Centric Approach. We evaluate our method on the ALFRED benchmark and empirically validate that it outperforms prior arts by significant margins in all metrics with good generalization performance (high success rate in unseen environments). Our code is available at https://github.com/gistvision/moca.

agent, artificial intelligence, neural network, (19 more...)

arXiv.org Artificial Intelligence

2012.03208

Genre: Workflow (0.67)

Industry: Education > Educational Setting > Online (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback