
Collaborating Authors

 galvatron


From Smashing Pumpkins to Ferris Bueller: new Australian indie video game Mixtape is a blast of nostalgia

The Guardian

Across Mixtape's four-hour runtime, you 'skateboard, mash tongues together during a kiss, TP a house, ride a dinosaur and learn to fly'

When Johnny Galvatron was 14, his cousin gave him a copy of the Smashing Pumpkins' seminal 1995 album, Mellon Collie and the Infinite Sadness. For Galvatron, a rambunctious teenager in Geelong who defined himself by his musical taste, it was love at first spin. "I don't think there's a track like Tonight, Tonight from any other band," he reminisces. A song from the album plays at a critical moment in Mixtape, the second game from Galvatron's Melbourne-based studio, Beethoven and Dinosaur. Mixtape is set over a single day; tomorrow, Stacy will be leaving her best friends, Slater and Cassandra, and flying to New York as part of a reckless plan to shove a mixtape into the hands of a superstar music supervisor who will, she believes, be so convinced of Stacy's genius that she'll offer her a job.


Galvatron: An Automatic Distributed System for Efficient Foundation Model Training

Liu, Xinyi, Wang, Yujie, Zhu, Shenhan, Fu, Fangcheng, Liu, Qingshuo, Lin, Guangming, Cui, Bin

arXiv.org Artificial Intelligence

Galvatron is a distributed system for efficiently training large-scale Foundation Models. It overcomes the complexities of selecting optimal parallelism strategies by automatically identifying the most efficient hybrid strategy, incorporating data, tensor, pipeline, sharded data, and sequence parallelism, along with recomputation. The system's architecture includes a profiler for hardware and model analysis, a search engine for strategy optimization using decision trees and dynamic programming, and a runtime for executing these strategies efficiently. Benchmarking on various clusters demonstrates Galvatron's superior throughput compared to existing frameworks. This open-source system offers user-friendly interfaces and comprehensive documentation, making complex distributed training accessible and efficient.
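The search-engine component lends itself to a small illustration. Below is a minimal Python sketch, not Galvatron's actual API, of a per-layer dynamic program of the kind the abstract describes: each layer is assigned one hybrid-parallelism strategy so that total estimated time is minimized while memory stays under the device budget. The `Strategy` and `search_plan` names and all cost numbers are illustrative assumptions, and the additive cost model is a deliberate simplification (the real system also profiles communication and activation costs).

```python
# Hypothetical sketch of a DP search over per-layer parallelism strategies.
from dataclasses import dataclass

@dataclass(frozen=True)
class Strategy:
    name: str        # e.g. "dp8" or "tp4_dp2" (hypothetical labels)
    time_ms: float   # profiled per-layer time estimate
    mem_mb: int      # profiled per-layer memory estimate

def search_plan(num_layers, candidates, mem_budget_mb):
    # dp maps memory-used -> best total time for the layers placed so far;
    # picks[i] stores, per reachable memory level, the choice made at layer i.
    dp = {0: 0.0}
    picks = []
    for _ in range(num_layers):
        nxt, layer_picks = {}, {}
        for used, t in dp.items():
            for s in candidates:
                m = used + s.mem_mb
                if m > mem_budget_mb:
                    continue  # prune: this plan would not fit on the device
                if m not in nxt or t + s.time_ms < nxt[m]:
                    nxt[m] = t + s.time_ms
                    layer_picks[m] = (used, s)
        if not nxt:
            return None, float("inf")  # no feasible plan under this budget
        dp = nxt
        picks.append(layer_picks)
    best_mem = min(dp, key=dp.get)
    plan, m = [], best_mem
    for layer_picks in reversed(picks):  # walk back-pointers to recover the plan
        prev, s = layer_picks[m]
        plan.append(s.name)
        m = prev
    return list(reversed(plan)), dp[best_mem]

# Example with made-up profiler numbers:
cands = [Strategy("dp8", 12.0, 900),
         Strategy("tp4_dp2", 9.5, 1400),
         Strategy("pp2_dp4", 10.8, 700)]
plan, total_ms = search_plan(num_layers=4, candidates=cands, mem_budget_mb=4000)
```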


Improving Automatic Parallel Training via Balanced Memory Workload Optimization

Wang, Yujie, Jiang, Youhe, Miao, Xupeng, Fu, Fangcheng, Nie, Xiaonan, Cui, Bin

arXiv.org Artificial Intelligence

Transformer models have emerged as the leading approach for achieving state-of-the-art performance across various application domains, serving as the foundation for advanced large-scale deep learning (DL) models. However, efficiently training these models across multiple GPUs remains a complex challenge due to the abundance of parallelism options. Existing DL systems either require manual efforts to design distributed training plans or limit parallelism combinations to a constrained search space. In this paper, we present Galvatron-BMW, a novel system framework that integrates multiple prevalent parallelism dimensions and automatically identifies the most efficient hybrid parallelism strategy. To effectively navigate this vast search space, we employ a decision tree approach for decomposition and pruning based on intuitive insights. We further utilize a dynamic programming search algorithm to derive the optimal plan. Moreover, to improve resource utilization and enhance system efficiency, we propose a bi-objective optimization workflow that focuses on workload balance. Our evaluations on different Transformer models demonstrate the capabilities of Galvatron-BMW in automating distributed training under varying GPU memory constraints. Across all tested scenarios, Galvatron-BMW consistently achieves superior system throughput, surpassing previous approaches that rely on limited parallelism strategies.
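The bi-objective workload-balance idea can be sketched in a few lines. The snippet below is a hedged illustration, not Galvatron-BMW's implementation: it scores a candidate pipeline partition by its bottleneck stage time (which paces throughput) together with the memory spread across stages (workload balance), folding the two objectives into one weighted score purely for demonstration. All names, the `alpha` weight, and the exhaustive contiguous-split search are my assumptions.

```python
# Hedged sketch: balance-aware scoring of pipeline-stage partitions.
from itertools import combinations

def plan_score(stage_times_ms, stage_mem_mb, mem_budget_mb, alpha=0.05):
    if max(stage_mem_mb) > mem_budget_mb:
        return float("inf")               # infeasible: a device exceeds its memory
    bottleneck = max(stage_times_ms)      # objective 1: slowest stage paces the pipeline
    imbalance = max(stage_mem_mb) - min(stage_mem_mb)  # objective 2: memory balance
    return bottleneck + alpha * imbalance # toy scalarization of the two objectives

def best_partition(layer_times_ms, layer_mem_mb, num_stages, mem_budget_mb):
    # Try every contiguous split of the layer sequence into pipeline stages.
    n = len(layer_times_ms)
    best_cuts, best = None, float("inf")
    for cuts in combinations(range(1, n), num_stages - 1):
        bounds = (0, *cuts, n)
        times = [sum(layer_times_ms[a:b]) for a, b in zip(bounds, bounds[1:])]
        mems = [sum(layer_mem_mb[a:b]) for a, b in zip(bounds, bounds[1:])]
        score = plan_score(times, mems, mem_budget_mb)
        if score < best:
            best_cuts, best = cuts, score
    return best_cuts, best
```

In the paper the two objectives are handled more carefully than a fixed weighted sum, and the search runs jointly with the per-stage parallelism choices; this sketch only shows why balance enters the objective at all.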


Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism

Miao, Xupeng, Wang, Yujie, Jiang, Youhe, Shi, Chunan, Nie, Xiaonan, Zhang, Hailin, Cui, Bin

arXiv.org Artificial Intelligence

Transformer models have achieved state-of-the-art performance across various application domains and are gradually becoming the foundation of advanced large-scale deep learning (DL) models. However, training these models efficiently over multiple GPUs remains challenging due to the large number of parallelism choices. Existing DL systems either rely on manual efforts to make distributed training plans or apply parallelism combinations within a very limited search space. In this paper, we propose Galvatron, a new system framework that incorporates multiple popular parallelism dimensions and automatically finds the most efficient hybrid parallelism strategy. To better explore such a vast search space, we 1) use a decision tree to decompose and prune the space based on reasonable intuitions, and then 2) design a dynamic programming search algorithm to generate the optimal plan. Evaluations on four representative Transformer workloads show that Galvatron can automatically perform distributed training under different GPU memory budgets. In all evaluated scenarios, Galvatron consistently achieves superior system throughput compared to previous work relying on limited parallelism strategies.
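The decision-tree decomposition can also be made concrete with a toy enumerator. The sketch below branches on pipeline (pp), tensor (tp), and data (dp) parallel degrees for a given device count and prunes branches with simple intuitions of the kind the abstract alludes to; the specific rules (powers of two only; tensor parallelism kept within one node because it is communication-heavy) are illustrative assumptions rather than Galvatron's exact pruning rules.

```python
# Illustrative decision-tree enumeration of hybrid parallelism candidates.
def candidate_strategies(num_gpus: int, gpus_per_node: int = 8):
    plans = []
    pp = 1
    while pp <= num_gpus:                  # first branch: pipeline degree
        if num_gpus % pp == 0:
            remaining = num_gpus // pp
            tp = 1
            while tp <= remaining:         # second branch: tensor degree
                if remaining % tp == 0 and tp <= gpus_per_node:  # prune inter-node tp
                    plans.append({"pp": pp, "tp": tp, "dp": remaining // tp})
                tp *= 2                    # prune: powers of two only
        pp *= 2
    return plans

# candidate_strategies(16) yields 14 pruned (pp, tp, dp) combinations,
# a far smaller set than the unpruned cross-product the search would face.
```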


The Artful Escape review – Bowie meets Hitchhiker's in gratifying teenage space opera

The Guardian

Seventeen-year-old guitar prodigy Francis Vendetti lives with his mother in a small Colorado town that is still in thrall to its most famous export: Francis's late uncle, a platinum-selling folk singer. Francis feels inevitable pressure to continue the family trade, and, in preparation for his highly anticipated first public performance in town, writes a suite of Dylan-esque tracks about toil and loss. Except the act is an affectation: Francis is, at heart and by temperament, a prog-rock wailer who dreams of playing high-gain, euphoric guitar solos over the swell of a supportive orchestra. When he's visited by a sympathetic alien being who observes: "You wear folk like a cheap suit", Francis swaps his skinny Levi's for an LED-encrusted catsuit and sets off across the Milky Way to shred for an audience of intergalactic concertgoers. This is true space opera territory – Ziggy-era Bowie meets The Hitchhiker's Guide to the Galaxy – and far from typical video game subject matter.