Followed example on sagemaker-examples.readthedocs.io - Pretrained BERT Model

I just started a new AWS account to test out SageMaker. I followed this example to the letter using SageMaker Studio's JupyterLab: https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-script-mode/pytorch_bert/deploy_bert_outputs.html

When trying to get a response, I received this error: Received server error (500) from primary and could not load the entire response body. See https://ap-southeast-2.console.aws.amazon.com/cloudwatch/home?region=ap-southeast-2#logEventViewer:group=/aws/sagemaker/Endpoints/bert-base-2024-06-24-09-15-07 in account 381492025627 for more information.
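
The log group named in that error can also be read programmatically instead of through the console link. The following is a minimal sketch, assuming boto3 is installed and credentials are configured. The helper name `fetch_endpoint_logs` and the `max_pages` cap are illustrative and not part of the example notebook. The log group path follows the pattern shown in the error message above.

```python
def fetch_endpoint_logs(logs_client, endpoint_name, filter_pattern="", max_pages=5):
    """Collect log messages from a SageMaker endpoint's CloudWatch log group.

    logs_client is expected to be a boto3 CloudWatch Logs client
    (boto3.client("logs")). max_pages is an arbitrary safety cap.
    """
    # Log group path as it appears in the error message for this endpoint.
    log_group = f"/aws/sagemaker/Endpoints/{endpoint_name}"
    kwargs = {"logGroupName": log_group}
    if filter_pattern:
        kwargs["filterPattern"] = filter_pattern

    messages = []
    for _ in range(max_pages):
        page = logs_client.filter_log_events(**kwargs)
        messages.extend(event["message"] for event in page.get("events", []))
        token = page.get("nextToken")
        if not token:
            break  # no further pages
        kwargs["nextToken"] = token
    return messages


# Usage (requires AWS credentials; not executed here):
#   import boto3
#   client = boto3.client("logs", region_name="ap-southeast-2")
#   for line in fetch_endpoint_logs(client, "bert-base-2024-06-24-09-15-07"):
#       print(line)
```

Passing the client in as an argument keeps the helper easy to try against a stub before pointing it at a real account.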

Here are some key logs:

2024-06-24T09:18:00.042Z 2024-06-24 09:17:59,965 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000

2024-06-24T09:18:00.042Z 2024-06-24 09:17:59,965 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000

2024-06-24T09:18:00.042Z 2024-06-24 09:17:59,967 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000

2024-06-24T09:18:00.043Z 2024-06-24 09:17:59,967 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000

2024-06-24T09:18:00.043Z 2024-06-24 09:18:00,025 [INFO ] main com.amazonaws.ml.mms.ModelServer - Inference API bind to: http://0.0.0.0:8080

2024-06-24T09:18:00.043Z 2024-06-24 09:18:00,027 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.

2024-06-24T09:18:00.043Z 2024-06-24 09:18:00,029 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.

2024-06-24T09:18:00.043Z 2024-06-24 09:18:00,030 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.

2024-06-24T09:18:00.043Z Model server started.

2024-06-24T09:18:00.043Z 2024-06-24 09:18:00,032 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,040 [WARN ] pool-2-thread-1 com.amazonaws.ml.mms.metrics.MetricCollector - worker pid is not available yet.

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,648 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=6e35bcfffe387de4-00000014-00000003-39759a947189b525-c1df4593

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,654 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=6e35bcfffe387de4-00000014-00000004-0250da947189b525-c1cdb6b7

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,656 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=6e35bcfffe387de4-00000014-00000000-5f229a947189b525-e2878da0

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,656 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 545

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,657 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 538

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,657 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 545

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,658 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-1

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,658 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-2

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,658 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-3

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,694 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=6e35bcfffe387de4-00000014-00000001-289a9a947189b525-7337c1c0

2024-06-24T09:18:00.807Z 2024-06-24 09:18:00,695 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 585

2024-06-24T09:18:01.562Z 2024-06-24 09:18:00,696 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-4

2024-06-24T09:22:55.972Z 2024-06-24 09:22:55,866 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 0

2024-06-24T09:22:56.473Z 2024-06-24 09:22:55,866 [INFO ] W-9000-model ACCESS_LOG - /169.254.178.2:40630 "POST /invocations HTTP/1.1" 500 3

2024-06-24T09:23:00.547Z 2024-06-24 09:22:56,238 [INFO ] pool-1-thread-6 ACCESS_LOG - /169.254.178.2:49882 "GET /ping HTTP/1.1" 200 0
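
In these logs the workers load the model and `GET /ping` returns 200, while `POST /invocations` returns 500, so the failure appears at request time rather than at startup. The following is a minimal sketch for picking out failing requests from ACCESS_LOG lines like the ones above. The regular expression is inferred only from the two access-log lines in this post, and the helper name `failed_requests` is illustrative.

```python
import re

# Pattern inferred from the ACCESS_LOG lines quoted in this post; the
# trailing number after the status code is captured but not interpreted.
ACCESS_RE = re.compile(
    r'ACCESS_LOG - /(?P<client>[\d.]+):(?P<port>\d+) '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<trailing>\d+)'
)


def failed_requests(log_text):
    """Return (method, path, status) for each access-log entry with status >= 400."""
    failures = []
    for match in ACCESS_RE.finditer(log_text):
        status = int(match.group("status"))
        if status >= 400:
            failures.append((match.group("method"), match.group("path"), status))
    return failures


sample = (
    '2024-06-24 09:22:55,866 [INFO ] W-9000-model ACCESS_LOG - '
    '/169.254.178.2:40630 "POST /invocations HTTP/1.1" 500 3\n'
    '2024-06-24 09:22:56,238 [INFO ] pool-1-thread-6 ACCESS_LOG - '
    '/169.254.178.2:49882 "GET /ping HTTP/1.1" 200 0\n'
)
print(failed_requests(sample))
```

Because nothing in the quoted log excerpt shows a Python traceback, the lines written around 09:22:55 in the same log stream are the place to look for the underlying exception.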

Topics

Machine Learning & AI
