Surging demand for generative AI has created a shortage of GPUs, with Nvidia’s best-performing chips reportedly sold out until 2024; the CEO of chipmaker TSMC suggests the shortage could extend into 2025. In response, tech giants have begun developing custom chips, and at its annual re:Invent conference Amazon unveiled the latest generation of its own silicon for model training and inferencing.
The first chip, AWS Trainium2, is designed to deliver up to 4x better training performance and 2x better energy efficiency than the original Trainium. It will be available in EC2 Trn2 instances and can scale to as many as 100,000 chips in AWS’ EC2 UltraCluster product, delivering a combined 65 exaflops of compute. Amazon claims that a 100,000-chip Trainium2 cluster can train a 300-billion-parameter AI language model in weeks rather than months.
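As a quick sanity check on those figures (illustrative arithmetic only, not a number Amazon published), spreading the claimed 65 exaflops across 100,000 chips implies roughly 650 teraflops per chip:

```python
# Back-of-the-envelope check of the UltraCluster claim:
# 65 exaflops aggregate across 100,000 Trainium2 chips.
EXAFLOP = 10**18

cluster_flops = 65 * EXAFLOP  # claimed aggregate compute
num_chips = 100_000           # claimed cluster size

per_chip_flops = cluster_flops / num_chips
print(f"{per_chip_flops / 1e12:.0f} teraflops per chip")  # → 650 teraflops per chip
```

Note that headline exaflops figures for AI clusters typically refer to low-precision throughput, so the per-chip number is not directly comparable to FP64 supercomputer benchmarks.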
The second chip announced is the Arm-based Graviton4, a general-purpose processor. Amazon claims Graviton4 provides up to 30% better compute performance, 50% more cores, and 75% more memory bandwidth than its predecessor, Graviton3. It will be available in memory-optimized Amazon EC2 R8g instances, with general availability planned in the coming months.