Qompose: A Technique to Select Optimal Algorithm-Specific Layout for Neutral Atom Quantum Architectures
Silver, Daniel, Patel, Tirthak, Tiwari, Devesh
As quantum computing architecture matures, it is important to investigate new technologies that lend unique advantages. In this work, we propose Qompose, a neutral atom quantum computing framework for efficiently composing quantum circuits on 2-D topologies of neutral atoms. Qompose selects an efficient topology for any given circuit in order to optimize for length of execution through efficient parallelism and for overall fidelity.
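To make the topology-selection idea concrete, here is a minimal, hypothetical sketch: enumerate simple 2-D grid arrangements for a circuit's qubits and score each with a toy distance-based cost, so layouts that let more gates act on nearby atoms win. The scoring model, row-major placement, and grid enumeration are illustrative assumptions, not Qompose's actual algorithm.

```python
# Hypothetical sketch of algorithm-specific topology selection in the
# spirit of Qompose; the cost model below is illustrative, not the
# paper's actual objective.
from itertools import product

def candidate_topologies(n_qubits, max_width=8):
    """Enumerate simple 2-D grid arrangements (rows x cols) of the atoms."""
    return [(r, c) for r, c in product(range(1, max_width + 1), repeat=2)
            if r * c >= n_qubits]

def estimated_depth(grid, two_qubit_gates):
    """Toy cost: gates between atoms far apart on the grid serialize more.
    two_qubit_gates is a list of (q_i, q_j) pairs from the circuit."""
    rows, cols = grid
    def pos(q):
        return divmod(q, cols)  # row-major placement, an assumption
    depth = 0
    for qi, qj in two_qubit_gates:
        (r1, c1), (r2, c2) = pos(qi), pos(qj)
        depth += abs(r1 - r2) + abs(c1 - c2)  # Manhattan-distance penalty
    return depth

def select_topology(n_qubits, two_qubit_gates):
    """Pick the grid that minimizes the toy depth estimate."""
    return min(candidate_topologies(n_qubits),
               key=lambda g: estimated_depth(g, two_qubit_gates))

# Example: a 5-qubit circuit with a chain of CNOTs.
print(select_topology(5, [(0, 1), (1, 2), (2, 3), (3, 4)]))
```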
OrganiQ: Mitigating Classical Resource Bottlenecks of Quantum Generative Adversarial Networks on NISQ-Era Machines
Silver, Daniel, Patel, Tirthak, Ranjan, Aditya, Cutler, William, Tiwari, Devesh
Driven by swift progress in hardware capabilities, quantum machine learning has emerged as a research area of interest. Recently, quantum image generation has produced promising results. However, prior quantum image generation techniques rely on classical neural networks, limiting their quantum potential and image quality. To overcome this, we introduce OrganiQ, the first quantum GAN capable of producing high-quality images without using classical neural networks.
LLM Inference Serving: Survey of Recent Advances and Opportunities
Li, Baolin, Jiang, Yankai, Gadepally, Vijay, Tiwari, Devesh
This survey offers a comprehensive overview of recent advancements in Large Language Model (LLM) serving systems, focusing on research since the year 2023. We specifically examine system-level enhancements that improve performance and efficiency without altering the core LLM decoding mechanisms. By selecting and reviewing high-quality papers from prestigious ML and system venues, we highlight key innovations and practical considerations for deploying and scaling LLMs in real-world production environments. This survey serves as a valuable resource for LLM practitioners seeking to stay abreast of the latest developments in this rapidly evolving field.
Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference
Li, Baolin, Jiang, Yankai, Gadepally, Vijay, Tiwari, Devesh
The rapid advancement of Generative Artificial Intelligence (GenAI) across diverse sectors raises significant environmental concerns, notably the carbon emissions from its cloud and high-performance computing (HPC) infrastructure. This paper presents Sprout, an innovative framework designed to address these concerns by reducing the carbon footprint of generative Large Language Model (LLM) inference services. Sprout leverages the concept of "generation directives" to guide the autoregressive generation process, thereby enhancing carbon efficiency. Our proposed method meticulously balances the need for ecological sustainability with the demand for high-quality generation outcomes. Employing a directive optimizer for the strategic assignment of generation directives to user prompts and an original offline quality evaluator, Sprout demonstrates a significant reduction in carbon emissions by over 40% in real-world evaluations using the Llama2 LLM and global electricity grid data. This research marks a critical step toward aligning AI technology with sustainable practices, highlighting the potential for mitigating environmental impacts in the rapidly expanding domain of generative artificial intelligence.
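A minimal sketch of the generation-directive idea follows. The directive strings, carbon-intensity thresholds, and selection rule are all hypothetical placeholders; Sprout's actual directive optimizer jointly weighs generation quality against emissions, which this toy rule does not.

```python
# Illustrative sketch of carbon-aware generation directives in the
# spirit of Sprout. Directive text and thresholds are assumptions.
DIRECTIVES = {
    0: "",                                 # no directive: full-length output
    1: "Answer concisely.",                # fewer generated tokens
    2: "Answer in at most one sentence.",  # most aggressive token savings
}

def choose_directive(grid_carbon_intensity, low=200, high=400):
    """Pick a directive level from grid carbon intensity (gCO2/kWh).
    Autoregressive decoding cost scales roughly with generated tokens,
    so shorter outputs cut emissions when the grid is dirty."""
    if grid_carbon_intensity < low:
        return 0
    if grid_carbon_intensity < high:
        return 1
    return 2

def build_prompt(user_prompt, grid_carbon_intensity):
    directive = DIRECTIVES[choose_directive(grid_carbon_intensity)]
    return (directive + "\n" + user_prompt) if directive else user_prompt

print(build_prompt("Explain what an LLM is.", grid_carbon_intensity=450))
```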
Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale
Zhao, Dan, Samsi, Siddharth, McDonald, Joseph, Li, Baolin, Bestor, David, Jones, Michael, Tiwari, Devesh, Gadepally, Vijay
As research and deployment of AI grows, the computational burden to support and sustain its progress inevitably grows as well. Training or fine-tuning state-of-the-art models in NLP, computer vision, and other domains virtually requires some form of AI hardware acceleration. Recent large language models require considerable resources to train and deploy, resulting in significant energy usage, potential carbon emissions, and massive demand for GPUs and other hardware accelerators. This surge carries large implications for energy sustainability at the HPC/datacenter level. In this paper, we study the aggregate effect of power-capping GPUs on GPU temperature and power draw at a research supercomputing center. We show that, with the right amount of power-capping, both temperature and power draw decrease significantly, reducing power consumption and potentially improving hardware life-span with minimal impact on job performance. While power-capping reduces power draw by design, the aggregate system-wide effect on overall energy consumption is less clear; for instance, if users notice job performance degradation from GPU power-caps, they may request additional GPU jobs to compensate, negating any energy savings or even worsening energy consumption. To our knowledge, our work is the first to conduct and make available a detailed analysis of the effects of GPU power-capping at the supercomputing scale. We hope our work will inspire HPC centers and datacenters to further explore, evaluate, and communicate the impact of power-capping AI hardware accelerators for more sustainable AI.
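For reference, the underlying mechanism the study builds on is the standard NVML power-limit interface. A minimal sketch using the pynvml bindings is below; the 250 W cap is an arbitrary example (not a value recommended by the paper), and setting limits typically requires administrative privileges.

```python
# Minimal sketch of applying a GPU power cap through NVML (pynvml).
# The 250 W value is an arbitrary example; setting a limit usually
# requires root privileges. pip install nvidia-ml-py
import pynvml

CAP_WATTS = 250  # example cap, not a recommendation from the paper

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        # NVML reports limits in milliwatts; clamp to the card's range.
        lo, hi = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)
        cap_mw = max(lo, min(hi, CAP_WATTS * 1000))
        pynvml.nvmlDeviceSetPowerManagementLimit(handle, cap_mw)
        print(f"GPU {i}: capped at {cap_mw / 1000:.0f} W")
finally:
    pynvml.nvmlShutdown()
```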
From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference
Samsi, Siddharth, Zhao, Dan, McDonald, Joseph, Li, Baolin, Michaleas, Adam, Jones, Michael, Bergeron, William, Kepner, Jeremy, Tiwari, Devesh, Gadepally, Vijay
Large language models (LLMs) have exploded in popularity due to their new generative capabilities that go far beyond prior state-of-the-art. These technologies are increasingly being leveraged in various domains such as law, finance, and medicine. However, these models carry significant computational challenges, especially the compute and energy costs required for inference. Inference energy costs already receive less attention than the energy costs of training LLMs, despite how often these large models are called on to conduct inference in reality (e.g., ChatGPT). As these state-of-the-art LLMs see increasing usage and deployment in various domains, a better understanding of their resource utilization is crucial for cost-savings, scaling performance, efficient hardware usage, and optimal inference strategies. In this paper, we describe experiments conducted to study the computational and energy utilization of inference with LLMs. We benchmark and conduct a preliminary analysis of the inference performance and inference energy costs of different sizes of LLaMA, a recent state-of-the-art LLM developed by Meta AI, on two generations of popular GPUs (NVIDIA V100 & A100) and two datasets (Alpaca and GSM8K) to reflect the diverse set of tasks/benchmarks for LLMs in research and practice. We present the results of multi-node, multi-GPU inference using model sharding across up to 32 GPUs. To our knowledge, our work is one of the first to study LLM inference performance from the perspective of computational and energy resources at this scale.
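One common way to measure per-inference GPU energy is NVML's cumulative energy counter (available on Volta-class GPUs and newer). The sketch below is a generic measurement harness under that assumption, not the paper's benchmarking code; the workload callable is a placeholder standing in for an LLM generate() call.

```python
# Sketch of per-inference energy measurement with NVML counters.
import pynvml

def measure_energy_joules(run_inference, gpu_index=0):
    """Run a callable and return (result, GPU energy in joules) using
    the device's cumulative energy counter (reported in millijoules)."""
    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        before_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
        result = run_inference()
        after_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
        return result, (after_mj - before_mj) / 1000.0
    finally:
        pynvml.nvmlShutdown()

# Placeholder workload standing in for a model.generate() call.
_, joules = measure_energy_joules(lambda: sum(range(10**7)))
print(f"energy used: {joules:.1f} J")
```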
QUILT: Effective Multi-Class Classification on Quantum Computers Using an Ensemble of Diverse Quantum Classifiers
Silver, Daniel, Patel, Tirthak, Tiwari, Devesh
Quantum computers can theoretically achieve significant acceleration over classical computers, but the near-future era of quantum computing is limited by the small number of qubits, which are also error-prone. Quilt is a framework for performing multi-class classification tasks, designed to work effectively on current error-prone quantum computers. Quilt is evaluated with real quantum machines as well as with projected noise levels as quantum machines become more noise-free. Quilt demonstrates up to 85% multi-class classification accuracy with the MNIST dataset on a five-qubit system.
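The core ensemble idea can be sketched classically: combine the (possibly noisy) predictions of several diverse classifiers so that uncorrelated errors cancel. The stand-in callables and plain majority vote below are illustrative assumptions; Quilt's actual members are quantum circuits and its combination scheme is more involved.

```python
# Illustrative ensemble-voting sketch in the spirit of Quilt.
from collections import Counter

def ensemble_predict(classifiers, x):
    """Majority vote over an ensemble; ties broken by first-seen label."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Three toy "classifiers"; one mislabels the input due to noise.
clfs = [lambda x: 3, lambda x: 3, lambda x: 7]
print(ensemble_predict(clfs, x=None))  # -> 3
```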
SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices
Li, Zhengang, Yuan, Geng, Yamauchi, Tomoharu, Zabihi, Masoud, Xie, Yanyue, Dong, Peiyan, Tang, Xulong, Yoshikawa, Nobuyuki, Tiwari, Devesh, Wang, Yanzhi, Chen, Olivia
Adiabatic Quantum-Flux-Parametron (AQFP) is a superconducting logic with extremely high energy efficiency. By employing the distinct polarity of current to denote logic '0' and '1', AQFP devices serve as excellent carriers for binary neural network (BNN) computations. Although recent research has made initial strides toward developing an AQFP-based BNN accelerator, several critical challenges remain, preventing the design from being a comprehensive solution. In this paper, we propose SupeRBNN, an AQFP-based randomized BNN acceleration framework that leverages software-hardware co-optimization to eventually make the AQFP devices a feasible solution for BNN acceleration. Specifically, we investigate the randomized behavior of the AQFP devices and analyze the impact of crossbar size on current attenuation, subsequently formulating the current amplitude into the values suitable for use in BNN computation. To tackle the accumulation problem and improve overall hardware performance, we propose a stochastic computing-based accumulation module and a clocking scheme adjustment-based circuit optimization method. We validate our SupeRBNN framework across various datasets and network architectures, comparing it with implementations based on different technologies, including CMOS, ReRAM, and superconducting RSFQ/ERSFQ. Experimental results demonstrate that our design achieves an energy efficiency approximately 7.8x10^4 times higher than that of the ReRAM-based BNN framework while maintaining a similar level of model accuracy. Furthermore, when compared with superconductor-based counterparts, our framework demonstrates at least two orders of magnitude higher energy efficiency.
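For intuition about the general technique the accumulation module builds on, here is a toy software model of stochastic computing: values in [0, 1] become random bitstreams whose probability of a 1 encodes the value, and summation reduces to counting ones. The stream length and encoding are illustrative assumptions, unrelated to SupeRBNN's circuit-level design.

```python
# Toy software model of stochastic-computing style accumulation.
import random

def to_bitstream(p, length=1024, rng=random.Random(0)):
    """Encode p in [0, 1] as a bitstream with P(bit = 1) = p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def accumulate(values, length=1024):
    """Approximate sum(values) by popcounting per-value bitstreams."""
    streams = [to_bitstream(v, length) for v in values]
    return sum(sum(s) for s in streams) / length

vals = [0.25, 0.5, 0.125]
print(accumulate(vals), "vs exact", sum(vals))
```

Longer bitstreams trade latency for accuracy, which is why accumulation is a central design point for hardware that adopts this representation.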
KAIROS: Building Cost-Efficient Machine Learning Inference Systems with Heterogeneous Cloud Resources
Li, Baolin, Samsi, Siddharth, Gadepally, Vijay, Tiwari, Devesh
Online inference is becoming a key service product for many businesses, deployed in cloud platforms to meet customer demands. Despite their revenue-generation capability, these services need to operate under tight Quality-of-Service (QoS) and cost budget constraints. This paper introduces KAIROS, a novel runtime framework that maximizes query throughput while meeting a QoS target and a cost budget. KAIROS designs and implements novel techniques to build a pool of heterogeneous compute hardware without online exploration overhead and to distribute inference queries optimally at runtime. Our evaluation using industry-grade deep learning (DL) models shows that KAIROS yields up to 2X the throughput of an optimal homogeneous solution and outperforms state-of-the-art schemes by up to 70%, even when the competing schemes are given the advantage of ignoring their exploration overhead.
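A hypothetical sketch of QoS-aware routing over a heterogeneous pool follows. The instance profiles and the greedy cheapest-per-query rule are made-up assumptions for illustration; KAIROS's runtime solves the distribution problem optimally rather than greedily.

```python
# Hypothetical sketch of QoS-aware query routing over a heterogeneous
# pool, in the spirit of KAIROS. All numbers below are made up.

# (name, p99 latency in ms, throughput in queries/s, $ per hour)
INSTANCES = [
    ("gpu.large", 18.0, 900.0, 3.06),
    ("gpu.small", 35.0, 400.0, 0.90),
    ("cpu.xlarge", 70.0, 150.0, 0.34),
]

def route(queries_per_sec, qos_ms):
    """Greedily fill QoS-feasible instances in cheapest-per-query order."""
    feasible = [i for i in INSTANCES if i[1] <= qos_ms]
    # Cost per served query = hourly cost / (throughput * 3600 s).
    feasible.sort(key=lambda i: i[3] / (i[2] * 3600))
    plan, remaining = [], queries_per_sec
    for name, _, tput, _ in feasible:
        share = min(remaining, tput)
        if share > 0:
            plan.append((name, share))
            remaining -= share
    return plan, remaining  # remaining > 0 means the pool is undersized

print(route(queries_per_sec=1000, qos_ms=40))
```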
RIBBON: Cost-Effective and QoS-Aware Deep Learning Model Inference using a Diverse Pool of Cloud Computing Instances
Li, Baolin, Roy, Rohan Basu, Patel, Tirthak, Gadepally, Vijay, Gettings, Karen, Tiwari, Devesh
Deep learning model inference is a key service in many businesses and scientific discovery processes. This paper introduces RIBBON, a novel deep learning inference serving system that meets two competing objectives: quality-of-service (QoS) targets and cost-effectiveness. The key idea behind RIBBON is to intelligently employ a diverse set of cloud computing instances (heterogeneous instances) to meet the QoS target and maximize cost savings. RIBBON devises a Bayesian Optimization-driven strategy that helps users build the optimal set of heterogeneous instances for their model inference service needs on cloud computing platforms, and it demonstrates its superiority over existing inference serving systems that use homogeneous instance pools. RIBBON saves up to 16% of the inference service cost for different learning models, including emerging deep learning recommender system models and drug-discovery-enabling models.
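To show the shape of the search problem, here is a minimal sketch: find the cheapest mix of instance types whose pool meets a QoS target under a toy feasibility model. RIBBON drives this search with Bayesian Optimization to avoid evaluating every configuration; the exhaustive enumeration, instance names, and cost/latency numbers below are stand-in assumptions.

```python
# Sketch of the heterogeneous pool-search objective in the spirit of
# RIBBON. All profiles are made up; the feasibility model is a toy.
import itertools

INSTANCE_TYPES = {  # name: ($/hr, p99 latency in ms at nominal load)
    "c5.xlarge": (0.17, 80.0),
    "g4dn.xlarge": (0.53, 25.0),
    "inf1.xlarge": (0.37, 40.0),
}

def pool_cost(pool):
    return sum(INSTANCE_TYPES[t][0] * n for t, n in pool.items())

def meets_qos(pool, qos_ms, load_qps):
    # Toy model: capacity ~ 1000/latency queries/s per instance, and
    # every instance type used must individually satisfy the QoS target.
    capacity = sum(n * 1000.0 / INSTANCE_TYPES[t][1] for t, n in pool.items())
    worst = max((INSTANCE_TYPES[t][1] for t, n in pool.items() if n),
                default=float("inf"))
    return capacity >= load_qps and worst <= qos_ms

def search(qos_ms, load_qps, max_per_type=4):
    """Exhaustive stand-in for RIBBON's Bayesian Optimization search."""
    best = None
    for counts in itertools.product(range(max_per_type + 1),
                                    repeat=len(INSTANCE_TYPES)):
        pool = dict(zip(INSTANCE_TYPES, counts))
        if meets_qos(pool, qos_ms, load_qps):
            if best is None or pool_cost(pool) < pool_cost(best):
                best = pool
    return best

print(search(qos_ms=50, load_qps=60))
```

Even in this toy setting, the cheapest feasible answer is a mixed pool rather than several copies of one type, which is the intuition behind favoring heterogeneous instances.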