tinyML Talks: A Practical Guide to Neural Network Quantization
Marios Fournarakis
Deep Learning Researcher
Qualcomm AI Research, Amsterdam
Neural network quantization is an effective way of reducing the power requirements and latency of neural network inference while maintaining high accuracy. The success of quantization has led to a large volume of literature and competing methods in recent years, and Qualcomm has been at the forefront of this research. This talk aims to cut through the noise and introduce a practical guide for quantizing neural networks inspired by our research and expertise at Qualcomm. We will begin with an introduction to quantization and fixed-point accelerators for neural network inference. We will then consider implementation pipelines that quantize popular neural networks to near floating-point accuracy on common benchmarks. Finally, you will leave this talk with a set of diagnostic and debugging tools to address common neural network quantization issues.
You can find more information about the theory and algorithms we will discuss in this talk in our White Paper on Neural Network Quantization at the following arXiv link: https://arxiv.org/abs/2106.08295
Video: tinyML Talks: A Practical Guide to Neural Network Quantization, from The tinyML Foundation channel