AWS re:Invent 2021 - Serverless Inference on SageMaker! FOR REAL!
At long last, Amazon SageMaker supports serverless endpoints. In this video, I demo this newly launched capability, named Serverless Inference.
Starting from a pre-trained DistilBERT model on the Hugging Face model hub, I fine-tune it for sentiment analysis on the IMDB movie review dataset. Then, I deploy the model to a serverless endpoint, and I run multi-threaded benchmarks with short and long token sequences. Finally, I plot latency numbers and compute latency quantiles.
*** Erratum: the max concurrency factor is 50, not 40.
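The actual benchmark code lives in the notebook linked in this description. As a rough illustration only, here is a minimal Python sketch of the general idea: send requests from several threads, record per-request latency, and compute latency quantiles. Everything in it is an assumption made for illustration. `fake_predict` is a hypothetical stand-in that sleeps instead of calling a real endpoint, and the thread counts, request counts, and helper names are not taken from the video.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_predict(text):
    """Hypothetical stand-in for an endpoint call (NOT a real SageMaker call).

    It sleeps a little longer for longer inputs, to mimic short vs. long
    token sequences.
    """
    time.sleep(0.001 + 0.00001 * len(text))
    return {"label": "POSITIVE"}


def timed_call(predict, payload):
    """Call predict(payload) once and return the elapsed time in milliseconds."""
    start = time.perf_counter()
    predict(payload)
    return (time.perf_counter() - start) * 1000.0


def benchmark(predict, payload, num_requests=100, num_threads=4):
    """Send num_requests requests from a pool of num_threads threads.

    Returns one latency (in milliseconds) per request.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(
            pool.map(lambda _: timed_call(predict, payload), range(num_requests))
        )


def latency_quantiles(latencies_ms):
    """Return the P50, P90 and P99 latencies from a list of measurements."""
    # n=100 yields 99 cut points; cut point i (1-based) is the i-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p90": cuts[89], "p99": cuts[98]}


if __name__ == "__main__":
    short_text = "Great movie!"
    long_text = "This movie was a long and winding journey. " * 20
    for name, payload in [("short", short_text), ("long", long_text)]:
        latencies = benchmark(fake_predict, payload, num_requests=50, num_threads=4)
        print(name, latency_quantiles(latencies))
```

To benchmark a real endpoint, `fake_predict` would be replaced by a function that invokes the deployed endpoint, and the number of threads would stay within the endpoint's max concurrency setting.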
⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos ⭐️⭐️⭐️
Notebook: https://gitlab.com/juliensimon/huggingface-demos/-/tree/main/serverless-inference
Documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html
New to Transformers? Check out the Hugging Face course at https://huggingface.co/course
Video "AWS re:Invent 2021 - Serverless Inference on SageMaker! FOR REAL!" from the Julien Simon channel