AITopics | megatronlm

megatronlm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Astra: Efficient and Money-saving Automatic Parallel Strategies Search on Heterogeneous GPUs

Wang, Peiran; Li, Haibing; Haohan, Fu; Li, Shiyong; Wang, Yanpeng; Shen, Dou

arXiv.org Artificial Intelligence, Feb-19-2025

In this paper, we introduce Astra, an efficient and cost-saving framework for automatic parallel strategy search on heterogeneous GPUs. First, Astra searches for the efficiency-optimal parallel strategy across both the GPU configuration search space (GPU types and GPU counts) and the parallel parameter search space. Second, Astra supports heterogeneous GPUs by mathematically modeling the time consumption of heterogeneous training. Finally, Astra is the first framework to propose automatic parallel strategy search that optimizes for monetary cost. The experimental results demonstrate that Astra can achieve better throughput than expert-designed strategies. Astra's search time can also be limited to 1.27 seconds in a single-GPU setting and to less than 1.35 minutes in a heterogeneous-GPU setting on average, with an accuracy of over 95%.
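The abstract does not describe the search procedure itself, so the following is only a rough sketch of what a "parallel parameters search space" can look like. It assumes a Megatron-style split into tensor, pipeline, and data parallel degrees whose product equals the GPU count. The function names and the cost model are hypothetical placeholders for illustration and are not Astra's actual model.

```python
from itertools import product


def enumerate_strategies(num_gpus):
    """Yield every (tp, pp, dp) triple whose product equals num_gpus.

    tp, pp, dp stand for tensor, pipeline, and data parallel degrees.
    This layout is an assumption; the abstract does not list the parameters.
    """
    divisors = [d for d in range(1, num_gpus + 1) if num_gpus % d == 0]
    for tp, pp in product(divisors, repeat=2):
        if num_gpus % (tp * pp) == 0:
            yield tp, pp, num_gpus // (tp * pp)


def toy_step_time(tp, pp, dp):
    """Hypothetical cost model: ideal compute time plus made-up
    communication penalties per parallelism dimension."""
    compute = 1.0 / (tp * pp * dp)
    comm = 0.01 * (tp - 1) + 0.005 * (pp - 1) + 0.002 * (dp - 1)
    return compute + comm


def best_strategy(num_gpus):
    """Exhaustively pick the triple with the lowest toy step time."""
    return min(enumerate_strategies(num_gpus), key=lambda s: toy_step_time(*s))


if __name__ == "__main__":
    for strategy in enumerate_strategies(8):
        print(strategy, round(toy_step_time(*strategy), 4))
    print("best under the toy model:", best_strategy(8))
```

A real search would replace `toy_step_time` with a measured or analytically modeled training time, and a cost-aware variant could multiply that time by a per-GPU price before taking the minimum.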

astra, configuration, parallelism, (15 more...)

2502.1348

Country:

North America > United States > California > San Diego County > Carlsbad (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)

Energy consumption of AI poses environmental problems

#artificialintelligence, Aug-27-2021, 06:12:28 GMT

Take some of the most popular language models, for example. OpenAI trained its GPT-3 model on 45 terabytes of data. To train the final version of MegatronLM, a language model similar to but smaller than GPT-3, Nvidia ran 512 V100 GPUs over nine days. A single V100 GPU can consume between 250 and 300 watts. If we assume 250 watts, then 512 V100 GPUs consume 128,000 watts, or 128 kilowatts (kW).
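The arithmetic in this paragraph can be checked with a few lines of Python. The sketch below uses only the figures quoted in the text (512 GPUs, 250 watts each, nine days). The energy estimate additionally assumes the GPUs ran continuously at that power draw for all nine days, which the text does not state; the function names are illustrative.

```python
NUM_GPUS = 512        # V100 GPUs, from the text
WATTS_PER_GPU = 250   # lower bound from the text (upper bound is 300)
TRAINING_DAYS = 9     # from the text


def cluster_power_kw(num_gpus, watts_per_gpu):
    """Total power draw of the GPUs in kilowatts."""
    return num_gpus * watts_per_gpu / 1000


def training_energy_kwh(power_kw, days):
    """Energy in kilowatt-hours, assuming constant draw 24 hours a day."""
    return power_kw * days * 24


power_kw = cluster_power_kw(NUM_GPUS, WATTS_PER_GPU)
energy_kwh = training_energy_kwh(power_kw, TRAINING_DAYS)

print(f"Power draw: {power_kw} kW")
print(f"Energy over {TRAINING_DAYS} days: {energy_kwh} kWh")
```

This counts GPU power only; it leaves out CPUs, cooling, and other data center overhead, so it should be read as a lower bound under the stated assumptions.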

ai pose environmental problem, energy consumption, megatronlm, (3 more...)

Country: North America > United States (0.26)

Industry:

Energy (0.99)
Law > Environmental Law (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.30)

Tiny AI models could supercharge autocorrect and voice assistants on your phone

#artificialintelligence, Oct-9-2019, 07:23:02 GMT

Researchers have successfully shrunk a giant language model for use in commercial applications. In October of last year, for example, Google released a model called BERT that passed a long-held reading-comprehension benchmark in the field. The larger version of the model had 340 million parameters, and training it once consumed enough electricity to power a US household for 50 days. Four months later, OpenAI topped it with its model GPT-2. The model demonstrated an impressive knack for constructing convincing prose; it also used 1.5 billion parameters. Now, MegatronLM, the latest and largest model from Nvidia, has 8.3 billion parameters.

application, supercharge autocorrect and voice assistant, tiny ai model, (4 more...)

Country:

North America > United States > New York (0.06)
North America > United States > Massachusetts > Hampshire County > Amherst (0.06)
North America > United States > California > San Francisco County > San Francisco (0.06)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.76)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback