AITopics | Chukka, Ramesh

Chukka, Ramesh

MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI

Tschand, Arya, Rajan, Arun Tejusve Raghunath, Idgunji, Sachin, Ghosh, Anirban, Holleman, Jeremy, Kiraly, Csaba, Ambalkar, Pawan, Borkar, Ritika, Chukka, Ramesh, Cockrell, Trevor, Curtis, Oliver, Fursin, Grigori, Hodak, Miro, Kassa, Hiwot, Lokhmotov, Anton, Miskovic, Dejan, Pan, Yuechao, Manmathan, Manu Prasad, Raymond, Liz, John, Tom St., Suresh, Arjun, Taubitz, Rowan, Zhan, Sean, Wasson, Scott, Kanter, David, Reddi, Vijay Janapa

arXiv.org Artificial Intelligence, Oct-15-2024

Rapid adoption of machine learning (ML) technologies has led to a surge in power consumption across diverse systems, from tiny IoT devices to massive datacenter clusters. Benchmarking the energy efficiency of these systems is crucial for optimization, but presents novel challenges due to the variety of hardware platforms, workload characteristics, and system-level interactions. This paper introduces MLPerf Power, a comprehensive benchmarking methodology for evaluating the energy efficiency of ML systems at power levels ranging from microwatts to megawatts. Developed by a consortium of industry professionals from more than 20 organizations, MLPerf Power establishes rules and best practices to ensure comparability across diverse architectures. We use representative workloads from the MLPerf benchmark suite to collect 1,841 reproducible measurements from 60 systems across the entire range of ML deployment scales. Our analysis reveals trade-offs between performance, complexity, and energy efficiency across this wide range of systems, providing actionable insights for designing optimized ML solutions from the smallest edge devices to the largest cloud infrastructures. This work emphasizes the importance of energy efficiency as a key metric in the evaluation and comparison of ML systems, laying the foundation for future research in this critical area. We discuss the implications for developing sustainable AI solutions and standardizing energy efficiency benchmarking for ML systems.
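
As a rough illustration of what an energy-efficiency figure of this kind involves, the sketch below integrates timestamped power readings into energy and divides the work done by that energy. It is not the MLPerf Power measurement procedure: the function names, the trapezoidal integration, and the sample numbers are all assumptions made for the example.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate power readings (watts) over time (seconds), trapezoidal rule.

    Hypothetical helper for illustration; not part of any MLPerf tooling.
    """
    if len(timestamps_s) != len(power_w) or len(timestamps_s) < 2:
        raise ValueError("need at least two matching timestamp/power readings")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def samples_per_joule(num_samples, timestamps_s, power_w):
    """Energy efficiency as work done (samples processed) per joule consumed."""
    return num_samples / energy_joules(timestamps_s, power_w)


# Made-up readings: a steady 100 W draw over 2 s while processing 1000 samples.
efficiency = samples_per_joule(1000, [0.0, 1.0, 2.0], [100.0, 100.0, 100.0])
print(efficiency)
```

Expressing efficiency as work per joule, rather than as raw power draw, is one way to compare a microwatt-scale device with a megawatt-scale cluster on the same axis.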

large language model, machine learning, natural language, (21 more...)

2410.12032

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Law (0.93)
Information Technology > Services (0.48)
Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)

Deep Learning Models on CPUs: A Methodology for Efficient Training

Fu, Quchen, Chukka, Ramesh, Achorn, Keith, Atta-fosu, Thomas, Canchi, Deepak R., Teng, Zhongwei, White, Jules, Schmidt, Douglas C.

arXiv.org Artificial Intelligence, Jun-18-2023

GPUs have been favored for training deep learning models due to their highly parallelized architecture. As a result, most studies on training optimization focus on GPUs. There is often a trade-off, however, between cost and efficiency when choosing the proper hardware for training. In particular, CPU servers can be beneficial if training on CPUs is made more efficient, as they incur fewer hardware update costs and make better use of existing infrastructure. This paper makes several contributions to research on training deep learning models using CPUs. First, it presents a method for optimizing the training of deep learning models on Intel CPUs and a toolkit called ProfileDNN, which we developed to improve performance profiling. Second, we describe a generic training optimization method that guides our workflow and explore several case studies where we identified performance issues and then optimized the Intel Extension for PyTorch, resulting in an overall 2x training performance increase for the RetinaNet-ResNext50 model. Third, we show how to leverage the visualization capabilities of ProfileDNN, which enabled us to pinpoint bottlenecks and create a custom focal loss kernel that was two times faster than the official reference PyTorch implementation.
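
For readers unfamiliar with the focal loss mentioned above, a minimal sketch of the standard per-element definition, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), follows. This is plain Python for illustration only, not the authors' custom kernel or the PyTorch reference implementation; the function name and the default values alpha=0.25 and gamma=2.0 are assumptions chosen for the example, and the code is not hardened against very large negative logits.

```python
import math


def sigmoid_focal_loss(logit, target, alpha=0.25, gamma=2.0):
    """Focal loss for a single binary prediction (target is 0 or 1).

    Illustrative scalar version; a real kernel would operate on whole tensors.
    """
    p = 1.0 / (1.0 + math.exp(-logit))                 # predicted probability of class 1
    p_t = p if target == 1 else 1.0 - p                # probability assigned to the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha    # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = sigmoid_focal_loss(4.0, 1)    # confident and correct
hard = sigmoid_focal_loss(-4.0, 1)   # confident and wrong
print(easy < hard)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted binary cross-entropy.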

artificial intelligence, machine learning, operation, (15 more...)

doi: 10.13052/jmltapissn.2022.003

2206.10034

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
