How do I know if an open source model is supported on Inferentia, Trainium, or Neuron?

2 minute read
Content level: Foundational

Quick first steps to find out if Inferentia or Trainium is an option for you.

The AWS Neuron SDK (used on AWS Inferentia and AWS Trainium instances) supports specific model architectures. You can find the list of models known to be supported in:

Neuron SDK documentation

Hugging Face Optimum Neuron documentation

However, many other models will also run because they share the same underlying architecture as one of the supported models.
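One quick way to check this yourself is to look up a model's underlying architecture with the Hugging Face transformers library and compare it against the architectures in the Neuron SDK documentation. The sketch below does exactly that; the model ID and the set of supported architectures are illustrative assumptions, so check the Neuron SDK documentation for the current list.

```python
# A minimal sketch: look up a model's underlying architecture so you can
# compare it against the architectures listed in the Neuron SDK docs.
# pip install transformers
from transformers import AutoConfig

# Hypothetical model ID -- substitute the model you want to check.
model_id = "mistralai/Mistral-7B-v0.1"

config = AutoConfig.from_pretrained(model_id)

# model_type identifies the underlying architecture (e.g. "llama", "mistral").
# The set below is illustrative, NOT the authoritative list -- see the
# Neuron SDK documentation for the architectures actually supported.
known_supported = {"llama", "mistral", "bert", "gpt2", "t5"}

print(f"{model_id} uses the '{config.model_type}' architecture")
if config.model_type in known_supported:
    print("This architecture appears in the (illustrative) supported list.")
else:
    print("Not in the illustrative list -- check the Neuron SDK docs.")
```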

Hugging Face does a great job of tracking the attributes of the models it hosts, and it knows which ones will work on Neuron. If a model should work, the model card gives you instructions for deploying it on Amazon SageMaker using Inferentia or Trainium. If it works on Inferentia or Trainium using SageMaker, it will also work using EC2, EKS, and ECS.
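Those model card instructions generally follow the SageMaker Python SDK pattern sketched below. This is a minimal sketch, not the exact code Hugging Face will show you: the model ID, environment variables, and instance type are assumptions you should replace with the values from the model card.

```python
# A minimal sketch of deploying a Hugging Face model to a SageMaker endpoint
# backed by Inferentia (inf2). Values here are illustrative assumptions --
# use the exact snippet from the model card's deploy dropdown.
# pip install sagemaker
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # IAM role for the SageMaker endpoint

# Environment for the Neuron-enabled container (illustrative values).
hub = {
    "HF_MODEL_ID": "mistralai/Mistral-7B-v0.1",  # hypothetical model ID
    "HF_NUM_CORES": "2",            # NeuronCores to shard across
    "HF_AUTO_CAST_TYPE": "fp16",
    "MAX_BATCH_SIZE": "4",
}

model = HuggingFaceModel(
    # "huggingface-neuronx" selects the Neuron build of the container.
    image_uri=get_huggingface_llm_image_uri("huggingface-neuronx"),
    env=hub,
    role=role,
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf2.xlarge",  # Inferentia2 instance
)

print(predictor.predict({"inputs": "Hello, Neuron!"}))
```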

You can see this under the deploy dropdown in the upper right corner of the model card. Click on Amazon SageMaker, then click on AWS Inferentia and Trainium (screenshots below). If you see deployment instructions, you should be good to go! If instead you see “The model is not yet cached on Hugging Face. If you are interested in it, please request support or try to compile the model yourself using Optimum Neuron.”, then Hugging Face doesn’t know for sure that it will work.

If you see the message that the model is not cached, if your model doesn’t have a deploy option, or if your model isn’t on Hugging Face at all, you may still be able to run it! You can click the “Request Cache” button on Hugging Face, start researching in the Neuron SDK, post questions here on re:Post, or reach out to your AWS account team!
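If you want to try compiling the model yourself, Optimum Neuron can export it to a Neuron-compatible format on an Inferentia or Trainium instance. A minimal sketch for a text-generation model follows; the model ID, output path, and compilation shapes are assumptions, and compilation only succeeds if the underlying architecture is supported.

```python
# A minimal sketch of compiling a text-generation model yourself with
# Optimum Neuron (run on an inf2/trn1 instance with the Neuron SDK installed).
# The model ID, shapes, and output path are illustrative assumptions.
# pip install optimum[neuronx]
from optimum.neuron import NeuronModelForCausalLM

compiler_args = {"num_cores": 2, "auto_cast_type": "fp16"}
input_shapes = {"batch_size": 1, "sequence_length": 2048}

# export=True triggers Neuron compilation; this works only when the
# model's architecture is supported by the Neuron SDK.
model = NeuronModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # hypothetical model ID
    export=True,
    **compiler_args,
    **input_shapes,
)
model.save_pretrained("./mistral-neuron")  # reusable compiled artifacts
```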

Screenshot: SageMaker options

Screenshot: AWS Inferentia and Trainium instructions