HLAT: High-quality Large Language Model Pre-trained on AWS Trainium
Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan
Getting large language models (LLMs) to perform well on downstream tasks requires pre-training over trillions of tokens. This typically demands a large number of powerful computational devices in addition to a stable distributed training framework to accelerate the training. The growing number of applications leveraging AI/ML has led to a scarcity of expensive conventional accelerators (such as GPUs), creating a need for alternative specialized accelerators that are scalable and cost-efficient. AWS Trainium is the second-generation machine learning accelerator purpose-built for training large deep learning models. Its corresponding instance, Amazon EC2 Trn1, is an alternative to GPU instances for LLM training. However, training LLMs with billions of parameters on Trn1 is challenging due to its relatively nascent software ecosystem. In this paper, we showcase HLAT: a 7-billion-parameter decoder-only LLM pre-trained on Trn1 instances over 1.8 trillion tokens. The performance of HLAT is benchmarked against popular open-source baseline models, including LLaMA and OpenLLaMA, which were trained on NVIDIA GPUs and Google TPUs, respectively. On various evaluation tasks, we show that HLAT achieves model quality on par with the baselines. We also share best practices for using the Neuron Distributed Training Library (NDTL), a customized distributed training library for AWS Trainium, to achieve efficient training. Our work demonstrates that AWS Trainium, powered by NDTL, can successfully pre-train state-of-the-art LLMs with high performance and cost-effectiveness.
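To make the training stack the abstract describes more concrete, here is a minimal sketch of tensor-parallel setup on Trainium using the open-source neuronx-distributed package, the public counterpart of what the paper calls NDTL. The tensor-parallel degree, layer dimensions, and launcher setup below are illustrative assumptions, not the authors' actual configuration.

    # Minimal sketch: tensor-parallel setup on Trainium with neuronx-distributed.
    # Degree and sizes are illustrative assumptions, not the paper's settings.
    import torch
    import torch.distributed as dist
    import torch_xla.core.xla_model as xm
    import torch_xla.distributed.xla_backend  # registers the "xla" backend
    from neuronx_distributed.parallel_layers import parallel_state, ColumnParallelLinear

    # Rank/world-size environment variables are expected to be set by the
    # launcher (e.g. torchrun); "xla" is the process-group backend for Trainium.
    dist.init_process_group("xla")

    # Shard each layer's weights across 8 NeuronCores (tensor parallelism);
    # the remaining ranks form the data-parallel dimension.
    parallel_state.initialize_model_parallel(tensor_model_parallel_size=8)

    device = xm.xla_device()

    # A column-parallel projection: the weight matrix is split along its
    # output dimension across the tensor-parallel group, Megatron-LM style.
    proj = ColumnParallelLinear(input_size=4096, output_size=11008, bias=False).to(device)

    x = torch.randn(1, 128, 4096, device=device)
    y = proj(x)
    xm.mark_step()  # flush the lazily traced graph so it compiles and executes

In practice one such script is launched with one process per NeuronCore; the parallel layers then handle the sharded matmuls and the collectives between shards transparently.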
Top AI Chip Announcements Of 2020
Last year, we compiled a list of top chips for accelerating ML tasks. We talked about the rising demand for AI-focused systems-on-chip, and 2020 was no different: the trend continued. While a few chipmakers capitalised on this trend, chip giants like Intel went through a tough period, even selling their NAND division to South Korean chipmaker SK Hynix. Apple, too, announced its move away from Intel processors, opening a new chapter with Apple Silicon.
AWS' custom chip family expands, launches Trainium for machine learning models
AWS is launching its own machine learning chip to train models for what CEO Andy Jassy says will be the "most cost effective training in the cloud." The custom machine learning processor, called AWS Trainium, follows what is becoming a common blueprint for AWS's silicon strategy. AWS is ultimately targeting enterprises that are just starting to train models and build out their AI strategies. Trainium will launch in 2021, following AWS instances based on Intel's Habana Gaudi processors.
Amazon debuts Trainium, a custom chip for machine learning training in the cloud
Amazon today debuted AWS Trainium, a chip custom-designed to deliver what the company describes as cost-effective machine learning model training in the cloud. It comes ahead of the availability of new Amazon Elastic Compute Cloud (EC2) instances built specifically for machine learning training and powered by Intel's new Habana Gaudi processors. "We know that we want to keep pushing the price performance on machine learning training, so we're going to have to invest in our own chips," AWS CEO Andy Jassy said during a keynote address at Amazon's re:Invent conference this morning. "You have an unmatched array of instances in AWS, coupled with innovation in chips." Amazon claims that Trainium will offer the most teraflops of any machine learning instance in the cloud, where one teraflop corresponds to processing 1 trillion floating-point operations per second.
The Cambrian AI Explosion Ramps Up
There's been a lot of news lately on the AI chip front, so I wanted to share a short synopsis of what has been happening for anyone who may be distracted by the holidays. Let's start with the big news. Amazon AWS (AMZN) made two significant AI announcements on December 1st at the annual AWS re:Invent conference. First, Andy Jassy, AWS head, announced that the cloud leader would offer Intel's Gaudi training chip in its elastic cloud. The AWS deployment is the first traction we have seen for Gaudi, which Intel gained through its $2B acquisition of Habana Labs last year. This is long-awaited good news for Intel.