QElight - Quality Education
Quantization in Depth - Syllabus
Introduction
Overview of advanced quantization techniques
Course objectives and key concepts
Overview
Introduction to linear quantization
Benefits and challenges of model quantization
Quantize and De-quantize a Tensor
Basic operations to quantize and de-quantize tensors
Code examples for tensor manipulation
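As a rough illustration of this lesson's topic, linear quantization maps a float x to an integer q = clamp(round(x / scale + zero_point)) and recovers an approximation as scale * (q - zero_point). The sketch below uses plain Python lists rather than PyTorch tensors so it runs without dependencies; the function names, the signed 8-bit range, and the sample values are assumptions made for illustration, not course material.

```python
def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers in [qmin, qmax] (signed 8-bit by default)."""
    quantized = []
    for x in values:
        q = round(x / scale + zero_point)          # round to the nearest integer
        quantized.append(max(qmin, min(qmax, q)))  # clamp to the target range
    return quantized


def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [scale * (q - zero_point) for q in quantized]


# Hypothetical inputs: the scale is a power of two so the arithmetic is exact.
values = [-1.0, -0.25, 0.0, 0.5, 1.0, 3.0]
scale, zero_point = 1 / 64, 0

q = quantize(values, scale, zero_point)
restored = dequantize(q, scale, zero_point)
```

The last value, 3.0, falls outside the representable range for this scale, so it is clamped to 127 and comes back as a smaller number. That clipping error is one of the costs of quantization.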
Get the Scale and Zero Point
Understanding scale and zero-point calculations
Practical examples and code implementation
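One common way to choose the scale and zero point, sketched here under the assumption of asymmetric signed 8-bit quantization, is to map the smallest observed value to the lowest integer and the largest to the highest: scale = (r_max - r_min) / (q_max - q_min) and zero_point = round(q_min - r_min / scale). The helper name and the fallback for a constant input are illustrative choices, not taken from the course.

```python
def get_scale_and_zero_point(values, qmin=-128, qmax=127):
    """Derive asymmetric quantization parameters from the observed value range."""
    r_min, r_max = min(values), max(values)
    if r_max == r_min:
        # Degenerate range: arbitrary fallback chosen for this sketch.
        return 1.0, 0
    scale = (r_max - r_min) / (qmax - qmin)
    zero_point = round(qmin - r_min / scale)
    # Keep the zero point representable in the integer range.
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


# Hypothetical sample values.
scale, zero_point = get_scale_and_zero_point([-2.0, 0.0, 6.0])
```

With these parameters the minimum, -2.0, lands on -128 and the maximum, 6.0, lands on 127, so the full integer range is used.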
Symmetric vs Asymmetric Mode
Comparing symmetric and asymmetric quantization
Use cases for each mode
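For contrast with the asymmetric mode, a minimal sketch of symmetric quantization follows. It assumes the usual convention that the zero point is fixed at 0 and the scale is the largest absolute value divided by the largest positive integer; the function names and the restricted range [-127, 127] are assumptions for illustration.

```python
def get_symmetric_scale(values, qmax=127):
    """Symmetric mode: the zero point is fixed at 0, only a scale is needed."""
    max_abs = max(abs(x) for x in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize_symmetric(values, qmax=127):
    """Quantize to the symmetric integer range [-qmax, qmax]."""
    scale = get_symmetric_scale(values, qmax)
    quantized = [max(-qmax, min(qmax, round(x / scale))) for x in values]
    return quantized, scale


# Hypothetical sample values with a lopsided range.
q_sym, scale_sym = quantize_symmetric([-2.0, 0.0, 6.0])
```

Because the range is centered on zero, the negative side here only reaches -42, leaving part of the integer range unused. That is the trade-off against the simpler arithmetic of having no zero point.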
Finer Granularity for More Precision
Exploring granularity options for higher precision
Code examples for choosing the quantization granularity
Per Channel Quantization
Implementing per-channel quantization
Benefits and trade-offs
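The idea behind per-channel quantization can be sketched as giving each row of a weight matrix its own scale instead of sharing one scale across the whole matrix. The example below assumes symmetric quantization along the row dimension and uses nested Python lists in place of a tensor; all names and values are hypothetical.

```python
def symmetric_scale(values, qmax=127):
    """Scale that maps the largest absolute value in `values` to qmax."""
    max_abs = max(abs(x) for x in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize_per_channel(weights, qmax=127):
    """Quantize each row (channel) of a 2-D weight list with its own scale."""
    scales = [symmetric_scale(row, qmax) for row in weights]
    quantized = [
        [max(-qmax, min(qmax, round(x / s))) for x in row]
        for row, s in zip(weights, scales)
    ]
    return quantized, scales


# Two rows with very different magnitudes (hypothetical values).
weights = [[0.25, -1.0, 0.75], [20.0, -5.0, 15.0]]
q_rows, row_scales = quantize_per_channel(weights)
```

With a single shared scale, the first row would collapse into a handful of integer values because the second row dominates the range. Per-row scales avoid that at the cost of storing one extra float per channel.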
Per Group Quantization
Exploring per-group quantization techniques
Use cases and limitations
Quantizing Weights & Activations for Inference
Quantization techniques for inference optimization
Practical examples of weight and activation quantization
Custom Build an 8-Bit Quantizer
Building a custom 8-bit quantizer in PyTorch
Code examples for model layer replacement
Replace PyTorch Layers with Quantized Layers
Techniques for replacing layers in PyTorch
Implementing quantized layers in existing models
Quantize Any Open Source PyTorch Model
Guidelines for quantizing open source models
Using quantization for model efficiency
Load Your Quantized Weights from Hugging Face Hub
Loading and using quantized weights from the Hugging Face Hub
Code examples for model deployment
Weights Packing
Introduction to weight packing techniques
Benefits of packing for memory efficiency
Packing 2-Bit Weights
Packing four 2-bit weights into one 8-bit integer
Code examples for weight packing
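Packing four 2-bit weights into one 8-bit integer can be sketched with shifts and bitwise OR. The version below assumes the first weight occupies the lowest two bits of the byte; that ordering, like the function name, is a choice made for this example and may differ from other implementations.

```python
def pack_2bit(weights):
    """Pack 2-bit values (0..3) into bytes, four per byte, lowest bits first."""
    if len(weights) % 4 != 0:
        raise ValueError("number of weights must be a multiple of 4")
    packed = []
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            if not 0 <= w <= 3:
                raise ValueError("each weight must fit in 2 bits")
            byte |= w << (2 * j)  # place weight j at bit offset 2 * j
        packed.append(byte)
    return packed


# Bit layout: 2 -> bits 7-6, 3 -> bits 5-4, 0 -> bits 3-2, 1 -> bits 1-0.
packed = pack_2bit([1, 0, 3, 2])
```

Here the four weights 1, 0, 3, 2 become the single byte 0b10110001, which is 177, so the storage needed is a quarter of one byte per weight.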
Unpacking 2-Bit Weights
Unpacking techniques for 2-bit weight representations
Code examples for efficient model deployment
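Unpacking reverses the packing step: each byte is shifted right and masked to recover its four 2-bit weights. This sketch assumes the same lowest-bits-first layout as a packer that stores the first weight in bits 1-0; the function name and the input byte are illustrative.

```python
def unpack_2bit(packed):
    """Recover four 2-bit values from each byte, lowest bits first."""
    weights = []
    for byte in packed:
        for j in range(4):
            weights.append((byte >> (2 * j)) & 0b11)  # isolate bits 2j and 2j+1
    return weights


# 177 is 0b10110001, a hypothetical packed byte.
unpacked = unpack_2bit([177])
```

The byte 177 yields the weights 1, 0, 3, 2, matching the order in which they would have been packed under this layout.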
Beyond Linear Quantization
Exploring advanced quantization techniques
Next steps for studying model compression
Conclusion
Summary of quantization techniques and applications
Future directions for model optimization