Collaborating Authors

 Mondal, Shanka Subhra


Slot Abstractors: Toward Scalable Abstract Visual Reasoning

arXiv.org Artificial Intelligence

Abstract visual reasoning is a characteristically human ability, allowing the identification of relational patterns that are abstracted away from object features, and the systematic generalization of those patterns to unseen problems. Recent work has demonstrated strong systematic generalization in visual reasoning tasks involving multi-object inputs, through the integration of slot-based methods used for extracting object-centric representations coupled with strong inductive biases for relational abstraction. However, this approach was limited to problems containing a single rule, and was not scalable to visual reasoning problems containing a large number of objects. Other recent work proposed Abstractors, an extension of Transformers that incorporates strong relational inductive biases, thereby inheriting the Transformer's scalability and multi-head architecture, but it has yet to be demonstrated how this approach might be applied to multi-object visual inputs. Here we combine the strengths of the above approaches and propose Slot Abstractors, an approach to abstract visual reasoning that can be scaled to problems involving a large number of objects and multiple relations among them. The approach displays state-of-the-art performance across four abstract visual reasoning tasks, as well as an abstract reasoning task involving real-world images.
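The key relational operation described above can be sketched minimally. In the sketch below (a toy NumPy stand-in, not the paper's implementation: all weights are random and the names are illustrative), queries and keys are computed from object-centric slot embeddings, while the values are input-independent learned symbols, so the output encodes relations among objects abstracted away from their features.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def relational_cross_attention(slots, w_q, w_k, symbols):
    # Queries/keys are derived from object slots, so the attention
    # matrix captures object-object relations; the values are
    # input-independent symbols, abstracting relations from features.
    q = slots @ w_q
    k = slots @ w_k
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return attn @ symbols

n_slots, d = 4, 8
slots = rng.normal(size=(n_slots, d))    # object-centric slot embeddings
w_q = rng.normal(size=(d, d))
w_k = rng.normal(size=(d, d))
symbols = rng.normal(size=(n_slots, d))  # learned relational symbols
out = relational_cross_attention(slots, w_q, w_k, symbols)
print(out.shape)
```

Because the multi-head attention machinery is unchanged from a standard Transformer, this operation scales to many objects in the same way ordinary self-attention does.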


Learning to reason over visual objects

arXiv.org Artificial Intelligence

Despite the centrality of objects in visual reasoning, previous work has so far not explored the use of object-centric representations in abstract visual reasoning tasks such as RAVEN and PGM, or at best has employed an imprecise approximation to object representations based on spatial location. Recently, a number of methods have been proposed for the extraction of precise object-centric representations directly from pixel-level inputs, without the need for veridical segmentation data (Greff et al., 2019; Burgess et al., 2019; Locatello et al., 2020; Engelcke et al., 2021). While these methods have been shown to improve performance in some visual reasoning tasks, including question answering from video (Ding et al., 2021) and prediction of physical interactions from video (Wu et al., 2022), previous work has not addressed whether this approach is useful in the domain of abstract visual reasoning (i.e., visual analogy). To address this, we developed a model that combines an object-centric encoding method, slot attention (Locatello et al., 2020), with a generic transformer-based reasoning module (Vaswani et al., 2017). The combined system, termed the Slot Transformer Scoring Network (STSN; Figure 1), achieves state-of-the-art performance on both PGM and I-RAVEN (a more challenging variant of RAVEN), despite its general-purpose architecture and lack of task-specific augmentations. Furthermore, we developed a novel benchmark, the CLEVR-Matrices (Figure 2), using a similar RPM-like problem structure but with greater visual complexity, and found that STSN also achieves state-of-the-art performance on this task. These results suggest that object-centric encoding is an essential component for achieving strong abstract visual reasoning, and indeed may be even more important than some task-specific inductive biases.
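The slot attention encoder that STSN builds on can be illustrated with a stripped-down iteration. This is a rough NumPy sketch of the core idea only (the real method also uses learned projections, a GRU update, and layer normalization): slots compete for input features via a softmax taken over slots, then each slot is updated with a weighted mean of the features it claimed.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def slot_attention_step(slots, inputs):
    # Softmax over the slot axis makes slots compete for each input
    # feature; normalizing per slot then yields a weighted mean of
    # the features each slot has claimed.
    attn = softmax(slots @ inputs.T / np.sqrt(slots.shape[-1]), axis=0)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return attn @ inputs

n_inputs, n_slots, d = 16, 3, 8
inputs = rng.normal(size=(n_inputs, d))  # pixel-level feature vectors
slots = rng.normal(size=(n_slots, d))    # randomly initialized slots
for _ in range(3):                       # iterative refinement
    slots = slot_attention_step(slots, inputs)
print(slots.shape)
```

In the full system the resulting slots for all panels of a puzzle are passed to a transformer, which scores each candidate answer.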


A Prefrontal Cortex-inspired Architecture for Planning in Large Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) demonstrate impressive performance on a wide variety of tasks, but they often struggle with tasks that require multi-step reasoning or goal-directed planning. To address this, we take inspiration from the human brain, in which planning is accomplished via the recurrent interaction of specialized modules in the prefrontal cortex (PFC). These modules perform functions such as conflict monitoring, state prediction, state evaluation, task decomposition, and task coordination. We find that LLMs are sometimes capable of carrying out these functions in isolation, but struggle to autonomously coordinate them in the service of a goal. Therefore, we propose a black box architecture with multiple LLM-based (GPT-4) modules. The architecture improves planning through the interaction of specialized PFC-inspired modules that break down a larger problem into multiple brief automated calls to the LLM. We evaluate the combined architecture on two challenging planning tasks -- graph traversal and Tower of Hanoi -- finding that it yields significant improvements over standard LLM methods (e.g., zero-shot prompting or in-context learning). These results demonstrate the benefit of utilizing knowledge from cognitive neuroscience to improve planning in LLMs.
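The interaction of the PFC-inspired modules can be sketched as a control loop. In the actual architecture each module is a separate GPT-4 call; here each is a hand-coded stub on a toy graph-traversal task, so the module names and signatures are illustrative stand-ins rather than the paper's interface.

```python
# Toy directed graph for the traversal task.
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

def propose_actions(state):             # task-decomposition module (stub)
    return GRAPH[state]

def predict_next_state(state, action):  # state-prediction module (stub)
    return action if action in GRAPH.get(state, []) else state

def evaluate_state(state, goal):        # state-evaluation module (stub)
    return 1.0 if state == goal else 0.0

def monitor_conflict(trace):            # conflict-monitoring module (stub)
    return len(trace) != len(set(trace))  # flag revisited states

def plan(start, goal, max_steps=5):
    # Coordination loop: propose candidate actions, simulate each one,
    # evaluate the predicted states, and monitor for conflicts.
    state, trace = start, [start]
    for _ in range(max_steps):
        if state == goal:
            return trace
        best = max(propose_actions(state),
                   key=lambda a: evaluate_state(
                       predict_next_state(state, a), goal))
        state = predict_next_state(state, best)
        trace.append(state)
        if monitor_conflict(trace):
            return None  # conflict detected: the full system would replan
    return trace if state == goal else None

print(plan("A", "D"))
```

The point of the architecture is that each module handles one brief, well-scoped query, which the loop coordinates toward the goal, rather than asking a single LLM call to plan autonomously.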


Determinantal Point Process Attention Over Grid Codes Supports Out of Distribution Generalization

arXiv.org Artificial Intelligence

Deep neural networks have made tremendous gains in emulating human-like intelligence, and have been used increasingly as ways of understanding how the brain may solve the complex computational problems on which this relies. However, these still fall short of human abilities, and therefore fail to provide insight into how the brain supports the strong forms of generalization of which humans are capable. One such case is out-of-distribution (OOD) generalization -- successful performance on test examples that lie outside the distribution of the training set. Here, we identify properties of processing in the brain that may contribute to this ability. We describe a two-part algorithm that draws on specific features of neural computation to achieve OOD generalization, and provide a proof of concept by evaluating performance on two challenging cognitive tasks. First, we draw on the fact that the mammalian brain represents metric spaces using grid-like representations (e.g., in entorhinal cortex): abstract representations of relational structure, organized in recurring motifs that cover the representational space. Second, we propose an attentional mechanism that operates over these grid representations using a determinantal point process (DPP-A) -- a transformation that ensures maximum sparseness in the coverage of that space. We show that a loss function that combines standard task-optimized error with DPP-A can exploit the recurring motifs in grid codes, and can be integrated with common architectures to achieve strong OOD generalization performance on analogy and arithmetic tasks. This provides both an interpretation of how grid codes in the mammalian brain may contribute to generalization performance, and at the same time a potential means for improving such capabilities in artificial neural networks.
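The diversity-seeking behavior of a determinantal point process can be illustrated with greedy MAP inference. This is a generic sketch of DPP subset selection, not the paper's DPP-A attention mechanism: at each step it adds the item that most increases the log-determinant of the selected kernel submatrix, which favors maximally spread-out selections.

```python
import numpy as np

rng = np.random.default_rng(0)

def greedy_dpp_map(K, k):
    # Greedy MAP inference for a DPP: repeatedly add the item giving
    # the largest log-determinant of the selected kernel submatrix.
    # Since det grows with diversity, similar items are avoided.
    selected = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for i in range(K.shape[0]):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(K[np.ix_(idx, idx)])
            gain = logdet if sign > 0 else -np.inf
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
    return selected

# Toy stand-in for grid-code embeddings; the Gaussian kernel measures
# similarity, so the DPP picks points that cover the space sparsely.
X = rng.normal(size=(10, 2))
K = np.exp(-0.5 * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
subset = greedy_dpp_map(K, 3)
print(subset)
```

In the paper this determinant-based diversity pressure enters through the loss, attending to the recurring motifs of the grid code rather than selecting raw data points as here.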


DeepPlace: Learning to Place Applications in Multi-Tenant Clusters

arXiv.org Machine Learning

Large multi-tenant production clusters often have to handle a variety of jobs and applications with complex resource usage characteristics. Manually creating placement rules to decide which applications should co-locate is both non-trivial and suboptimal. In this paper, we present DeepPlace, a scheduler that learns to exploit the temporal resource usage patterns of applications using Deep Reinforcement Learning (Deep RL), reducing resource competition across jobs running on the same machine while at the same time optimizing for overall cluster utilization.
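The objective can be made concrete with a simple baseline. The sketch below is a greedy heuristic standing in for the learned Deep RL policy (it is not DeepPlace itself): each job is placed on the machine where its temporal usage profile raises the peak load the least, so jobs with anti-correlated usage patterns tend to co-locate.

```python
import numpy as np

rng = np.random.default_rng(0)

def place(jobs, n_machines):
    # Greedy placement: for each job, simulate adding its usage
    # time series to every machine and pick the machine whose
    # resulting peak load is lowest.
    load = np.zeros((n_machines, jobs.shape[1]))
    assignment = []
    for usage in jobs:
        peaks = (load + usage).max(axis=1)
        m = int(np.argmin(peaks))
        load[m] += usage
        assignment.append(m)
    return assignment, load

# Each row is one job's resource usage over 24 time slots.
jobs = rng.uniform(0, 1, size=(8, 24))
assignment, load = place(jobs, n_machines=3)
print(assignment)
```

A learned policy can improve on this by anticipating future arrivals and trading off competition against overall utilization, which is what the Deep RL formulation targets.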