Deep Dive on PyTorch Quantization - Chris Gottbrath
Learn more: https://pytorch.org/docs/stable/quantization.html
It’s important to make efficient use of both server-side and on-device compute resources when developing machine learning applications. To support more efficient deployment on servers and edge devices, PyTorch added support for model quantization using the familiar eager-mode Python API.
Quantization uses 8-bit integer (int8) instructions to reduce model size and run inference faster (lower latency). It can be the difference between a model meeting its quality-of-service goals or not, or even between fitting into the resources available on a mobile device or not. Even when resources aren’t quite so constrained, it may enable you to deploy a larger and more accurate model. Quantization is available in PyTorch starting in version 1.3, and with the release of PyTorch 1.4 we published quantized models for ResNet, ResNeXt, MobileNetV2, GoogLeNet, InceptionV3 and ShuffleNetV2 in the PyTorch torchvision 0.5 library.
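To make the int8 idea concrete, here is a minimal pure-Python sketch of affine (scale and zero-point) quantization, the general scheme behind mapping float values onto 8-bit integers. It is an illustration under stated assumptions, not PyTorch's implementation: the helper names `choose_qparams`, `quantize` and `dequantize` are hypothetical, and the sample weights are made up.

```python
# Illustrative sketch only: hypothetical helpers, not PyTorch APIs.

def choose_qparams(values, qmin=-128, qmax=127):
    """Pick a scale and zero point that map the value range onto [qmin, qmax]."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to int8 codes: round(v / scale) + zero_point, clamped to range."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Recover approximate floats from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


# Made-up example weights (an assumption for illustration).
weights = [-1.0, -0.25, 0.0, 0.5, 1.5]
scale, zero_point = choose_qparams(weights)
codes = quantize(weights, scale, zero_point)
restored = dequantize(codes, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("scale:", scale, "zero_point:", zero_point)
print("int8 codes:", codes)
print("max round-trip error:", max_error)
```

Each value is stored in one byte instead of the four bytes of a float32, which is where the size reduction comes from. The price is a small rounding error per value, on the order of half the scale.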