perlmutter
Plexus: Taming Billion-edge Graphs with 3D Parallel Full-graph GNN Training
Ranjan, Aditya K., Singh, Siddharth, Wei, Cunyang, Bhatele, Abhinav
Graph neural networks (GNNs) leverage the connectivity and structure of real-world graphs to learn intricate properties and relationships between nodes. Many real-world graphs exceed the memory capacity of a GPU due to their sheer size, and training GNNs on such graphs requires techniques such as mini-batch sampling to scale. The alternative approach of distributed full-graph training suffers from high communication overheads and load imbalance due to the irregular structure of graphs. We propose a three-dimensional (3D) parallel approach for full-graph training that tackles these issues and scales to billion-edge graphs. In addition, we introduce optimizations such as a double permutation scheme for load balancing, and a performance model to predict the optimal 3D configuration of our parallel implementation -- Plexus. We evaluate Plexus on six different graph datasets and show scaling results on up to 2048 GPUs of Perlmutter, and 1024 GPUs of Frontier. Plexus achieves unprecedented speedups of 2.3-12.5x over prior state of the art, and a reduction in time-to-solution by 5.2-8.7x on Perlmutter and 7.0-54.2x on Frontier.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- North America > United States > Missouri > St. Louis County > St. Louis (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- (10 more...)
- Overview (0.67)
- Research Report (0.51)
- Energy (1.00)
- Government > Regional Government > North America Government > United States Government (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
- Information Technology > Architecture > Distributed Systems (0.93)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
- Information Technology > Data Science > Data Mining (0.93)
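The per-layer computation that full-graph GNN training distributes is, in the common GCN formulation, H' = σ(Â H W) over the entire normalized adjacency Â. As a minimal illustration of that operation (not Plexus's actual 3D partitioning; `gcn_layer` and the toy data are invented for this sketch):

```python
import numpy as np

# Toy full-graph GCN layer: H' = relu(A_hat @ H @ W), where A_hat is the
# symmetrically normalized adjacency with self-loops. This only illustrates
# the per-layer operation that 3D-parallel schemes like Plexus distribute
# across GPUs; no partitioning is shown.
def gcn_layer(A, H, W):
    A = A + np.eye(A.shape[0])              # add self-loops
    d = A.sum(axis=1)                       # degree vector
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^{-1/2}
    A_hat = D_inv_sqrt @ A @ D_inv_sqrt     # normalized adjacency
    return np.maximum(A_hat @ H @ W, 0.0)   # aggregate, transform, ReLU

# 4-node path graph, 3 input features, 2 output features
A = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], dtype=float)
H = np.random.default_rng(0).normal(size=(4, 3))
W = np.random.default_rng(1).normal(size=(3, 2))
out = gcn_layer(A, H, W)
print(out.shape)  # (4, 2)
```

Because Â is as large as the graph itself, it is this sparse-times-dense product that exceeds single-GPU memory on billion-edge graphs and motivates partitioning it along three dimensions.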
Trump strikes a blow for AI – by firing the US copyright supremo
Sometimes it helps me to write by thinking about how a radio broadcaster or television presenter would deliver the information, so I'm your host, Blake Montgomery. Today in tech news: questions hover over the automation of labor in the worker-strapped US healthcare system; and drones proliferate in a new conflict: India v Pakistan, both armed with nuclear weapons. Meanwhile, in contrast to a thoughtful and robust conversation, the US is taking the opposite tack. Legend has it that Alexander the Great was presented with a knot in a rope tying a cart to a stake. So complex were its twistings that none of the hundreds of men who had tried had been able to untie it. Alexander silently drew his sword and sliced the knot in two.
- Asia > India (0.74)
- Asia > Pakistan (0.64)
- Europe > United Kingdom (0.48)
- (10 more...)
- Law (1.00)
- Government > Regional Government > North America Government > United States Government (0.72)
- Government > Military (0.68)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.49)
- Information Technology > e-Commerce > Financial Technology (0.51)
- Information Technology > Artificial Intelligence > Robots (0.31)
Trump admin fires top US copyright official days after terminating Librarian of Congress
An AI art lecturer said he believes the U.S. government would encounter difficulty if it attempted to establish a watermark system for AI-generated content. Trump fired Librarian of Congress Carla Hayden, who was the first woman and first African American to be Librarian of Congress, on Thursday. The termination was part of the administration's ongoing purge of government officials who are perceived to be opposed to Trump and his agenda. The White House did not immediately respond to Fox News Digital's requests for comment on the matter. Like Perlmutter, Hayden was notified of her firing in an email, according to The Associated Press.
InfoGain Wavelets: Furthering the Design of Diffusion Wavelets for Graph-Structured Data
Johnson, David R., Krishnaswamy, Smita, Perlmutter, Michael
Diffusion wavelets extract information from graph signals at different scales of resolution by utilizing graph diffusion operators raised to various powers, known as diffusion scales. Traditionally, the diffusion scales are chosen to be dyadic integers, $\mathbf{2^j}$. Here, we propose a novel, unsupervised method for selecting the diffusion scales based on ideas from information theory. We then show that our method can be incorporated into wavelet-based GNNs via graph classification experiments.
- North America > United States > Idaho > Ada County > Boise (0.04)
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.93)
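The dyadic baseline the abstract refers to builds wavelets as differences of powers of a diffusion operator, Ψ_j = P^{2^(j-1)} − P^{2^j}. A sketch of that classical construction (the paper's information-theoretic scale selection is not reproduced here, and `dyadic_wavelets` is an invented helper name):

```python
import numpy as np

# Dyadic diffusion wavelets Psi_j = P^{2^(j-1)} - P^{2^j}, built from a lazy
# random-walk operator P. The paper above replaces the dyadic scales 2^j with
# scales chosen by an information-theoretic criterion; this is only the
# classical dyadic baseline it starts from.
def dyadic_wavelets(A, J):
    n = A.shape[0]
    P = 0.5 * (np.eye(n) + A / A.sum(axis=1, keepdims=True))  # lazy walk
    powers = {0: np.eye(n)}
    for t in range(1, 2 ** J + 1):   # repeated squaring would be cheaper
        powers[t] = powers[t - 1] @ P
    return [powers[2 ** (j - 1)] - powers[2 ** j] for j in range(1, J + 1)]

A = np.array([[0,1,1,0],[1,0,1,0],[1,1,0,1],[0,0,1,0]], dtype=float)
wavelets = dyadic_wavelets(A, J=3)
print(len(wavelets))  # 3 scales: P^1-P^2, P^2-P^4, P^4-P^8
```

Since every power of the row-stochastic P has rows summing to one, each wavelet's rows sum to zero, which is what makes them band-pass filters on the graph.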
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
Singh, Siddharth, Singhania, Prajwal, Ranjan, Aditya, Kirchenbauer, John, Geiping, Jonas, Wen, Yuxin, Jain, Neel, Hans, Abhimanyu, Shu, Manli, Tomar, Aditya, Goldstein, Tom, Bhatele, Abhinav
Training and fine-tuning large language models (LLMs) with hundreds of billions to trillions of parameters requires tens of thousands of GPUs, and a highly scalable software stack. In this work, we present a novel four-dimensional hybrid parallel algorithm implemented in a highly scalable, portable, open-source framework called AxoNN. We describe several performance optimizations in AxoNN: improvements to matrix multiply kernel performance, overlapping non-blocking collectives with computation, and performance modeling to choose performance-optimal configurations. These have resulted in unprecedented scaling and peak flop/s (bf16) for training of GPT-style transformer models on Perlmutter (620.1 Petaflop/s), Frontier (1.381 Exaflop/s) and Alps (1.423 Exaflop/s). While the abilities of LLMs improve with the number of trainable parameters, so do privacy and copyright risks caused by memorization of training data, which can cause disclosure of sensitive or private information at inference time. We highlight this side effect of scale through experiments that explore "catastrophic memorization", where models are sufficiently large to memorize training data in a single pass, and present an approach to prevent it. As part of this study, we demonstrate fine-tuning of a 405-billion parameter LLM using AxoNN on Frontier.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- (5 more...)
- Information Technology (0.95)
- Energy (0.67)
- Government > Regional Government (0.67)
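The "performance modeling to choose configurations" step amounts to enumerating ways to factor a GPU count across parallel axes and scoring each with a cost model. A deliberately toy version (AxoNN's real model weighs message volumes, bandwidths, and compute overlap; `toy_comm_cost` below is a stand-in invented for illustration):

```python
# Toy illustration of configuration search with a performance model:
# enumerate factorizations of a GPU count into a 3-axis grid and pick the
# one a cost function scores best. The cost function here is purely
# illustrative, not AxoNN's actual model.
def factorizations(gpus, axes=3):
    if axes == 1:
        return [(gpus,)]
    result = []
    for d in range(1, gpus + 1):
        if gpus % d == 0:
            result += [(d,) + rest for rest in factorizations(gpus // d, axes - 1)]
    return result

def toy_comm_cost(cfg):
    # Pretend collective cost on each axis grows like (d - 1) / d; a real
    # model would be calibrated against measured bandwidths and overlap.
    return sum((d - 1) / d for d in cfg)

best = min(factorizations(8), key=toy_comm_cost)
print(best)  # (1, 1, 8) under this toy cost
```

Even this toy shows the shape of the problem: the search space is the set of ordered factorizations of the GPU count, and the quality of the chosen configuration is only as good as the cost model scoring it.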
Convergence of Manifold Filter-Combine Networks
Johnson, David R., Chew, Joyce, Viswanath, Siddharth, De Brouwer, Edward, Needell, Deanna, Krishnaswamy, Smita, Perlmutter, Michael
In order to better understand manifold neural networks (MNNs), we introduce Manifold Filter-Combine Networks (MFCNs). The filter-combine framework parallels the popular aggregate-combine paradigm for graph neural networks (GNNs) and naturally suggests many interesting families of MNNs which can be interpreted as the manifold analog of various popular GNNs. We then propose a method for implementing MFCNs on high-dimensional point clouds that relies on approximating the manifold by a sparse graph. We prove that our method is consistent in the sense that it converges to a continuum limit as the number of data points tends to infinity.
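The "approximate the manifold by a sparse graph" step can be made concrete with points sampled from a circle and a k-nearest-neighbor graph; the filter-combine layers themselves are omitted, and `knn_graph` is an invented helper for this sketch:

```python
import numpy as np

# Sketch of discretizing a manifold: sample points from a circle, connect
# each to its k nearest neighbors, and form the graph Laplacian that stands
# in for the manifold operator. The MFCN layers built on top are not shown.
def knn_graph(X, k):
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    A = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(D[i])[1:k + 1]:  # skip self (distance 0)
            A[i, j] = A[j, i] = 1.0          # symmetrize
    return A

theta = np.linspace(0, 2 * np.pi, 40, endpoint=False)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # points on a circle
A = knn_graph(X, k=2)
L = np.diag(A.sum(axis=1)) - A   # combinatorial graph Laplacian
print(A.sum())  # 80.0: a 40-node ring, every node of degree 2
```

The convergence result in the abstract says, informally, that filters built from L behave like the corresponding manifold filters as the number of sample points grows.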
Communication-minimizing Asynchronous Tensor Parallelism
Singh, Siddharth, Sating, Zack, Bhatele, Abhinav
As state-of-the-art neural networks scale to billions of parameters, designing parallel algorithms that can train these networks efficiently on multi-GPU clusters has become critical. This paper presents Tensor3D, a three-dimensional (3D) hybrid tensor and data parallel framework that strives to minimize the idle time incurred due to communication in the parallel training of large multi-billion parameter models. Our framework relies on three key ideas to minimize the idle time spent in communication. First, we show how a naive application of a tensor parallel strategy can lead to a significant amount of communication for satisfying the data dependencies of parallelized layers of a neural network. To this end, we propose an intelligent distribution of neural network parameters across GPUs that eliminates the communication required for satisfying these data dependencies.
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- (5 more...)
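The simplest instance of distributing parameters so that a layer's data dependencies need no communication is a 1D column shard of a linear layer: each device owns some columns of W and computes an independent slice of Y = X W. A sketch under that assumption (Tensor3D's actual 3D placement generalizes this across three axes; `column_parallel_matmul` is an invented name):

```python
import numpy as np

# Toy tensor-parallel parameter distribution: shard a weight matrix W
# column-wise across simulated "GPUs" so each computes an independent slice
# of Y = X @ W, then concatenate the slices. Only this 1D case is shown.
def column_parallel_matmul(X, W, n_gpus):
    shards = np.array_split(W, n_gpus, axis=1)    # each GPU owns some columns
    partials = [X @ Wi for Wi in shards]          # no inter-GPU communication
    return np.concatenate(partials, axis=1)       # gather the output slices

rng = np.random.default_rng(0)
X, W = rng.normal(size=(4, 6)), rng.normal(size=(6, 8))
Y = column_parallel_matmul(X, W, n_gpus=4)
print(np.allclose(Y, X @ W))  # True
```

The block structure of the product guarantees the concatenated result matches the unsharded matmul exactly, which is why a well-chosen parameter placement can eliminate the dependency-satisfying communication the abstract describes.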
Copyright Office Sets Sights on Artificial Intelligence in 2023
"This year, the big milestone was having the board open its doors and start accepting claims," Perlmutter said, adding that board decisions will start coming in the next year. Though it is "still early days" and it remains unclear what the standard volume of claims will be, Perlmutter said she is "extremely impressed" with how well the board is doing. It's received over 260 cases so far. She added that several of the cases have been dismissed. The office believes that means they've been settled, which would adhere to the alternative dispute resolution mechanism of the board, she said. "We set up this totally new tribunal in really record time. I think most other agencies who have seen what we've done can't understand how we managed that in under a year and a half, because it required a lot of work," she said.
Learnable Filters for Geometric Scattering Modules
Tong, Alexander, Wenkel, Frederik, Bhaskar, Dhananjay, Macdonald, Kincaid, Grady, Jackson, Perlmutter, Michael, Krishnaswamy, Smita, Wolf, Guy
We propose a new graph neural network (GNN) module, based on relaxations of recently proposed geometric scattering transforms, which consist of a cascade of graph wavelet filters. Our learnable geometric scattering (LEGS) module enables adaptive tuning of the wavelets to encourage band-pass features to emerge in learned representations. The incorporation of our LEGS-module in GNNs enables the learning of longer-range graph relations compared to many popular GNNs, which often rely on encoding graph structure via smoothness or similarity between neighbors. Further, its wavelet priors result in simplified architectures with significantly fewer learned parameters compared to competing GNNs. We demonstrate the predictive performance of LEGS-based networks on graph classification benchmarks, as well as the descriptive quality of their learned features in biochemical graph data exploration tasks. Our results show that LEGS-based networks match or outperform popular GNNs, as well as the original geometric scattering construction, on many datasets, in particular in biochemical domains, while retaining certain mathematical properties of handcrafted (non-learned) geometric scattering.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Michigan (0.04)
- (5 more...)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Education (0.93)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Communications (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
FourCastNet: Accelerating Global High-Resolution Weather Forecasting using Adaptive Fourier Neural Operators
Kurth, Thorsten, Subramanian, Shashank, Harrington, Peter, Pathak, Jaideep, Mardani, Morteza, Hall, David, Miele, Andrea, Kashinath, Karthik, Anandkumar, Animashree
Extreme weather amplified by climate change is causing increasingly devastating impacts across the globe. The current use of physics-based numerical weather prediction (NWP) limits accuracy due to high computational cost and strict time-to-solution limits. We report that a data-driven deep learning Earth system emulator, FourCastNet, can predict global weather and generate medium-range forecasts five orders-of-magnitude faster than NWP while approaching state-of-the-art accuracy. FourCastNet is optimized and scales efficiently on three supercomputing systems: Selene, Perlmutter, and JUWELS Booster, up to 3,808 NVIDIA A100 GPUs, attaining 140.8 petaFLOPS in mixed precision (11.9% of peak at that scale). The time-to-solution for training FourCastNet, measured on JUWELS Booster on 3,072 GPUs, is 67.4 minutes, resulting in an 80,000 times faster time-to-solution relative to state-of-the-art NWP, in inference. FourCastNet produces accurate instantaneous weather predictions for a week in advance, enables enormous ensembles that better capture weather extremes, and supports higher global forecast resolutions.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > California > Santa Clara County > Santa Clara (0.05)
- North America > United States > California > Alameda County > Berkeley (0.04)
- (2 more...)
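The adaptive Fourier neural operator backbone behind FourCastNet filters fields in frequency space rather than with spatial convolutions. A heavily simplified 1D sketch of that idea (AFNO itself mixes channels with a per-mode MLP and adaptive sparsity, none of which appears here; `fourier_layer` and the random weights are invented for illustration):

```python
import numpy as np

# Highly simplified Fourier-operator layer: transform a spatial field to
# frequency space, scale a truncated set of low modes by (here random,
# normally learned) complex weights, and transform back. This is the basic
# spectral-filtering idea, not FourCastNet's AFNO architecture.
def fourier_layer(u, weights, n_modes):
    u_hat = np.fft.rfft(u)                 # to frequency space
    u_hat[:n_modes] *= weights[:n_modes]   # filter the retained low modes
    u_hat[n_modes:] = 0.0                  # truncate high frequencies
    return np.fft.irfft(u_hat, n=u.shape[0])

rng = np.random.default_rng(0)
u = np.sin(np.linspace(0, 2 * np.pi, 64, endpoint=False))  # 1D "field"
w = rng.normal(size=33) + 1j * rng.normal(size=33)         # rfft of 64 -> 33 modes
v = fourier_layer(u, w, n_modes=8)
print(v.shape)  # (64,)
```

Working in frequency space makes the filter's receptive field global in one step, which is one reason such emulators can be so much cheaper per forecast than grid-stepping NWP solvers.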