
AWS re:Invent 2021 - Serverless Inference on SageMaker! FOR REAL!

At long last, Amazon SageMaker supports serverless endpoints. In this video, I demo this newly launched capability, named Serverless Inference.

Starting from a pre-trained DistilBERT model on the Hugging Face model hub, I fine-tune it for sentiment analysis on the IMDB movie review dataset. Then, I deploy the model to a serverless endpoint, and I run multi-threaded benchmarks with short and long token sequences. Finally, I plot latency numbers and compute latency quantiles.
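The benchmarking step described above can be sketched roughly as follows. This is only a minimal illustration, not the code from the notebook: `fake_predict` is a hypothetical stand-in for a real call to the serverless endpoint, and the request counts, thread counts, and chosen quantiles (p50, p90, p99) are assumptions picked for the example.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_predict(text):
    """Hypothetical stand-in for an endpoint call; it only sleeps briefly."""
    # Assumption: longer inputs take slightly longer, to mimic longer token sequences.
    time.sleep(0.001 * (1 + len(text.split()) / 100))
    return {"label": "POSITIVE"}


def timed_call(predict, text):
    """Call predict(text) once and return the elapsed time in milliseconds."""
    start = time.perf_counter()
    predict(text)
    return (time.perf_counter() - start) * 1000.0


def benchmark(predict, text, num_requests=100, num_threads=4):
    """Send num_requests requests from num_threads threads; return all latencies."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(lambda _: timed_call(predict, text), range(num_requests)))


def latency_quantiles(latencies_ms):
    """Return the p50, p90 and p99 latencies from a list of measurements."""
    # n=100 yields 99 cut points; index i - 1 holds the i-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p90": cuts[89], "p99": cuts[98]}


if __name__ == "__main__":
    short_text = "Great movie."
    long_text = "This movie was a long and winding story. " * 50
    for name, text in [("short", short_text), ("long", long_text)]:
        latencies = benchmark(fake_predict, text, num_requests=50, num_threads=4)
        print(name, latency_quantiles(latencies))
```

To benchmark a real endpoint, `fake_predict` would be replaced by a function that invokes the deployed model, and the thread count should stay within the endpoint's configured maximum concurrency.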

*** Erratum: max concurrency factor is 50, not 40.

⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos ⭐️⭐️⭐️

Notebook: https://gitlab.com/juliensimon/huggingface-demos/-/tree/main/serverless-inference

Documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html

New to Transformers? Check out the Hugging Face course at https://huggingface.co/course

Video "AWS re:Invent 2021 - Serverless Inference on SageMaker! FOR REAL!" from the Julien Simon channel