AITopics | Choudhury, Sutanay

Collaborating Authors

Choudhury, Sutanay

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Thinking Fast and Laterally: Multi-Agentic Approach for Reasoning about Uncertain Emerging Events

Dernbach, Stefan, Michel, Alejandro, Agarwal, Khushbu, Brissette, Christopher, Gupta, Geetika, Choudhury, Sutanay

arXiv.org Artificial IntelligenceDec-10-2024

This paper introduces lateral thinking to implement System-2 reasoning capabilities in AI systems, focusing on anticipatory and causal reasoning under uncertainty. We present a framework for systematic generation and modeling of lateral thinking queries and evaluation datasets. We introduce Streaming Agentic Lateral Thinking (SALT), a multi-agent framework designed to process complex, low-specificity queries in streaming data environments. SALT implements lateral thinking-inspired System-2 reasoning through a dynamic communication structure between specialized agents. Our key insight is that lateral information flow across long-distance agent interactions, combined with fine-grained belief management, yields richer information contexts and enhanced reasoning. Preliminary quantitative and qualitative evaluations indicate SALT's potential to outperform single-agent systems in handling complex lateral reasoning tasks in a streaming environment.

artificial intelligence, information retrieval query processing, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.07977

Country:

Europe (0.46)
Asia (0.30)
North America > United States (0.28)

Genre: Research Report (0.82)

Industry:

Energy (1.00)
Banking & Finance (1.00)
Government > Commerce (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.46)

Add feedback

ChemReasoner: Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback

Sprueill, Henry W., Edwards, Carl, Agarwal, Khushbu, Olarte, Mariefel V., Sanyal, Udishnu, Johnston, Conrad, Liu, Hongbin, Ji, Heng, Choudhury, Sutanay

arXiv.org Artificial IntelligenceJun-7-2024

The discovery of new catalysts is essential for the design of new and more efficient chemical processes in order to transition to a sustainable future. We introduce an AI-guided computational screening framework unifying linguistic reasoning with quantum-chemistry based feedback from 3D atomistic representations. Our approach formulates catalyst discovery as an uncertain environment where an agent actively searches for highly effective catalysts via the iterative combination of large language model (LLM)-derived hypotheses and atomistic graph neural network (GNN)-derived feedback. Identified catalysts in intermediate search steps undergo structural evaluation based on spatial orientation, reaction pathways, and stability. Scoring functions based on adsorption energies and reaction energy barriers steer the exploration in the LLM's knowledge space toward energetically favorable, high-efficiency catalysts. We introduce planning methods that automatically guide the exploration without human input, providing competitive performance against expert-enumerated chemical descriptor-based implementations. By integrating language-guided reasoning with computational chemistry feedback, our work pioneers AI-accelerated, trustworthy catalyst discovery.

catalyst, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2402.1098

Country:

Asia > Middle East > UAE (0.14)
North America > United States > Massachusetts (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.82)

Industry:

Materials > Chemicals > Specialty Chemicals (1.00)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding

Dernbach, Stefan, Agarwal, Khushbu, Zuniga, Alejandro, Henry, Michael, Choudhury, Sutanay

arXiv.org Artificial IntelligenceFeb-9-2024

Integrating large language models (LLMs) with knowledge graphs derived from domain-specific data represents an important advancement towards more powerful and factual reasoning. As these models grow more capable, it is crucial to enable them to perform multi-step inferences over real-world knowledge graphs while minimizing hallucination. While large language models excel at conversation and text generation, their ability to reason over domain-specialized graphs of interconnected entities remains limited. For example, can we query a LLM to identify the optimal contact in a professional network for a specific goal, based on relationships and attributes in a private database? The answer is no--such capabilities lie beyond current methods. However, this question underscores a critical technical gap that must be addressed. Many high-value applications in areas such as science, security, and e-commerce rely on proprietary knowledge graphs encoding unique structures, relationships, and logical constraints. We introduce a fine-tuning framework for developing Graph-aligned LAnguage Models (GLaM) that transforms a knowledge graph into an alternate text representation with labeled question-answer pairs. We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning. Our methodology leverages the large-language model's generative capabilities to create the dataset and proposes an efficient alternate to retrieval-augmented generation styled methods.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2402.06764

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.48)
Health & Medicine > Pharmaceuticals & Biotechnology (0.47)
Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Evaluating Emerging AI/ML Accelerators: IPU, RDU, and NVIDIA/AMD GPUs

Peng, Hongwu, Ding, Caiwen, Geng, Tong, Choudhury, Sutanay, Barker, Kevin, Li, Ang

arXiv.org Artificial IntelligenceNov-7-2023

The relentless advancement of artificial intelligence (AI) and machine learning (ML) applications necessitates the development of specialized hardware accelerators capable of handling the increasing complexity and computational demands. Traditional computing architectures, based on the von Neumann model, are being outstripped by the requirements of contemporary AI/ML algorithms, leading to a surge in the creation of accelerators like the Graphcore Intelligence Processing Unit (IPU), Sambanova Reconfigurable Dataflow Unit (RDU), and enhanced GPU platforms. These hardware accelerators are characterized by their innovative data-flow architectures and other design optimizations that promise to deliver superior performance and energy efficiency for AI/ML tasks. This research provides a preliminary evaluation and comparison of these commercial AI/ML accelerators, delving into their hardware and software design features to discern their strengths and unique capabilities. By conducting a series of benchmark evaluations on common DNN operators and other AI/ML workloads, we aim to illuminate the advantages of data-flow architectures over conventional processor designs and offer insights into the performance trade-offs of each platform. The findings from our study will serve as a valuable reference for the design and performance expectations of research prototypes, thereby facilitating the development of next-generation hardware accelerators tailored for the ever-evolving landscape of AI/ML applications. Through this analysis, we aspire to contribute to the broader understanding of current accelerator technologies and to provide guidance for future innovations in the field.

ai ml accelerator, artificial intelligence, machine learning, (3 more...)

arXiv.org Artificial Intelligence

2311.04417

Genre: Research Report (0.69)

Industry: Information Technology > Hardware (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Faithful and Efficient Explanations for Neural Networks via Neural Tangent Kernel Surrogate Models

Engel, Andrew, Wang, Zhichao, Frank, Natalie S., Dumitriu, Ioana, Choudhury, Sutanay, Sarwate, Anand, Chiang, Tony

arXiv.org Artificial IntelligenceNov-1-2023

A recent trend in explainable AI research has focused on surrogate modeling, where neural networks are approximated as simpler ML algorithms such as kernel machines. A second trend has been to utilize kernel functions in various explain-by-example or data attribution tasks to investigate a diverse set of neural network behavior. In this work, we combine these two trends to analyze approximate empirical neural tangent kernels (eNTK) for data attribution. Approximation is critical for eNTK analysis due to the high computational cost to compute the eNTK. We define new approximate eNTK and perform novel analysis on how well the resulting kernel machine surrogate models correlate with the underlying neural network. We introduce two new random projection variants of approximate eNTK which allow users to tune the time and memory complexity of their calculation. We conclude that kernel machines using approximate neural tangent kernel as the kernel function are effective surrogate models, with the introduced trace NTK the most consistent performer.

artificial intelligence, machine learning, neural tangent kernel surrogate model, (2 more...)

arXiv.org Artificial Intelligence

2305.14585

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design

Sprueill, Henry W., Edwards, Carl, Olarte, Mariefel V., Sanyal, Udishnu, Ji, Heng, Choudhury, Sutanay

arXiv.org Artificial IntelligenceOct-22-2023

Discovering novel catalysts requires complex reasoning involving multiple chemical properties and resultant trade-offs, leading to a combinatorial growth in the search space. While large language models (LLM) have demonstrated novel capabilities for chemistry through complex instruction following capabilities and high quality reasoning, a goal-driven combinatorial search using LLMs has not been explored in detail. In this work, we present a Monte Carlo Tree Search-based approach that improves beyond state-of-the-art chain-of-thought prompting variants to augment scientific reasoning. We introduce two new reasoning datasets: 1) a curation of computational chemistry simulations, and 2) diverse questions written by catalysis researchers for reasoning about novel chemical conversion processes. We improve over the best baseline by 25.8\% and find that our approach can augment scientist's reasoning and discovery process with novel insights.

complex scientific reasoning, large language model, natural language, (4 more...)

arXiv.org Artificial Intelligence

2310.1442

Genre: Research Report (0.40)

Industry: Materials > Chemicals > Specialty Chemicals (0.60)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)

Add feedback

A Unification Framework for Euclidean and Hyperbolic Graph Neural Networks

Khatir, Mehrdad, Choudhary, Nurendra, Choudhury, Sutanay, Agarwal, Khushbu, Reddy, Chandan K.

arXiv.org Artificial IntelligenceJun-6-2023

Hyperbolic neural networks can effectively capture the inherent hierarchy of graph datasets, and consequently a powerful choice of GNNs. However, they entangle multiple incongruent (gyro-)vector spaces within a layer, which makes them limited in terms of generalization and scalability. In this work, we propose the Poincare disk model as our search space, and apply all approximations on the disk (as if the disk is a tangent space derived from the origin), thus getting rid of all inter-space transformations. Such an approach enables us to propose a hyperbolic normalization layer and to further simplify the entire hyperbolic model to a Euclidean model cascaded with our hyperbolic normalization layer. We applied our proposed nonlinear hyperbolic normalization to the current state-of-the-art homogeneous and multi-relational graph networks. We demonstrate that our model not only leverages the power of Euclidean networks such as interpretability and efficient execution of various model components, but also outperforms both Euclidean and hyperbolic counterparts on various benchmarks. Our code is made publicly available at https://github.com/oom-debugger/ijcai23.

artificial intelligence, machine learning, normalization, (19 more...)

arXiv.org Artificial Intelligence

2206.04285

Country:

North America > United States > Virginia (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Add feedback

BitGNN: Unleashing the Performance Potential of Binary Graph Neural Networks on GPUs

Chen, Jou-An, Sung, Hsin-Hsuan, Shen, Xipeng, Choudhury, Sutanay, Li, Ang

arXiv.org Artificial IntelligenceJun-3-2023

Recent studies have shown that Binary Graph Neural Networks (GNNs) are promising for saving computations of GNNs through binarized tensors. Prior work, however, mainly focused on algorithm designs or training techniques, leaving it open to how to materialize the performance potential on accelerator hardware fully. This work redesigns the binary GNN inference backend from the efficiency perspective. It fills the gap by proposing a series of abstractions and techniques to map binary GNNs and their computations best to fit the nature of bit manipulations on GPUs. Results on real-world graphs with GCNs, GraphSAGE, and GraphSAINT show that the proposed techniques outperform state-of-the-art binary GNN implementations by 8-22X with the same accuracy maintained. BitGNN code is publicly available.

artificial intelligence, machine learning, opération, (17 more...)

arXiv.org Artificial Intelligence

2305.02522

Country: North America > United States > North Carolina (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Cloud Services Enable Efficient AI-Guided Simulation Workflows across Heterogeneous Resources

Ward, Logan, Pauloski, J. Gregory, Hayot-Sasson, Valerie, Chard, Ryan, Babuji, Yadu, Sivaraman, Ganesh, Choudhury, Sutanay, Chard, Kyle, Thakur, Rajeev, Foster, Ian

arXiv.org Artificial IntelligenceMar-15-2023

Applications that fuse machine learning and simulation can benefit from the use of multiple computing resources, with, for example, simulation codes running on highly parallel supercomputers and AI training and inference tasks on specialized accelerators. Here, we present our experiences deploying two AI-guided simulation workflows across such heterogeneous systems. A unique aspect of our approach is our use of cloud-hosted management services to manage challenging aspects of cross-resource authentication and authorization, function-as-a-service (FaaS) function invocation, and data transfer. We show that these methods can achieve performance parity with systems that rely on direct connection between resources. We achieve parity by integrating the FaaS system and data transfer capabilities with a system that passes data by reference among managers and workers, and a user-configurable steering algorithm to hide data transfer latencies. We anticipate that this ease of use can enable routine use of heterogeneous resources in computational science.

artificial intelligence, cloud computing, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IPDPSW59300.2023.00018

2303.08803

Country: North America > United States > Illinois > Cook County (0.14)

Genre:

Workflow (0.75)
Research Report (0.64)

Industry:

Information Technology > Services (1.00)
Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Scientific Computing (1.00)
Information Technology > Cloud Computing (1.00)
Information Technology > Architecture > Distributed Systems (0.93)
(4 more...)

Add feedback

Extreme Acceleration of Graph Neural Network-based Prediction Models for Quantum Chemistry

Helal, Hatem, Firoz, Jesun, Bilbrey, Jenna, Krell, Mario Michael, Murray, Tom, Li, Ang, Xantheas, Sotiris, Choudhury, Sutanay

arXiv.org Artificial IntelligenceNov-24-2022

Molecular property calculations are the bedrock of chemical physics. High-fidelity \textit{ab initio} modeling techniques for computing the molecular properties can be prohibitively expensive, and motivate the development of machine-learning models that make the same predictions more efficiently. Training graph neural networks over large molecular databases introduces unique computational challenges such as the need to process millions of small graphs with variable size and support communication patterns that are distinct from learning over large graphs such as social networks. This paper demonstrates a novel hardware-software co-design approach to scale up the training of graph neural networks for molecular property prediction. We introduce an algorithm to coalesce the batches of molecular graphs into fixed size packs to eliminate redundant computation and memory associated with alternative padding techniques and improve throughput via minimizing communication. We demonstrate the effectiveness of our co-design approach by providing an implementation of a well-established molecular property prediction model on the Graphcore Intelligence Processing Units (IPU). We evaluate the training performance on multiple molecular graph databases with varying degrees of graph counts, sizes and sparsity. We demonstrate that such a co-design approach can reduce the training time of such molecular property prediction models from days to less than two hours, opening new possibilities for AI-driven scientific discovery.

artificial intelligence, graph, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2211.13853

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry:

Energy (0.46)
Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback