
Quantization — PyTorch 2.11 documentation
Oct 9, 2019 · pt2e quantization has been migrated to torchao (pytorch/ao); see pytorch/ao#2259 for more details. We plan to delete torch.ao.quantization in 2.10 if there are no blockers, or in the earliest …
Neural Network Quantization in PyTorch | Practical ML
Apr 16, 2025 · What is Quantization? Quantization is a model optimization technique that reduces the numerical precision used to represent weights and activations in deep learning models. Its primary …
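The snippet above describes quantization only at a high level. As an illustrative sketch (plain Python, not the torch.ao API; the helper names and the [-1, 1] range are assumptions for the example), affine int8 quantization maps floats to integers through a scale and a zero point:

```python
def quantize_affine(values, scale, zero_point, qmin=-128, qmax=127):
    """Map float values to int8 codes: q = clamp(round(x / scale) + zero_point)."""
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in values]

def dequantize_affine(q_values, scale, zero_point):
    """Recover approximate floats: x ~= (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in q_values]

# Derive scale from an observed float range of [-1.0, 1.0] spread over 255 steps;
# a symmetric range gives zero_point = 0.
scale = (1.0 - (-1.0)) / 255
zero_point = 0

weights = [0.5, -0.25, 0.9]
q = quantize_affine(weights, scale, zero_point)
approx = dequantize_affine(q, scale, zero_point)
```

The round trip loses at most about half a quantization step per value, which is the precision/size trade-off the snippet refers to.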
Quantization API Reference — PyTorch 2.11 documentation
Jul 25, 2020 · torch.ao.quantization.backend_config # This module contains BackendConfig, a config object that defines how quantization is supported in a backend. Currently only used by FX Graph …
Welcome to PyTorch Tutorials — PyTorch Tutorials 2.11.0+cu130 …
Learn how to use torch.nn.utils.prune to sparsify your neural networks, and how to extend it to implement your own custom pruning technique.
TensorRT-LLM/examples/quantization/README.md at main - GitHub
TensorRT-LLM Quantization Toolkit Installation Guide. Introduction: this document introduces the steps to install the TensorRT-LLM quantization toolkit, the Python APIs to quantize the models, and the …
Quantization in PyTorch - Zhihu (知乎专栏)
PyTorch 1.1 began adding the torch.qint8 dtype and the torch.quantize_linear conversion function, providing limited, experimental support for quantization. Official quantization support arrived in PyTorch 1.3: beyond quantizable Tensors, PyTorch began supporting, for CNNs, …
GitHub - NVIDIA/Model-Optimizer: A unified library of SOTA model ...
Mar 23, 2026 · NVIDIA Model Optimizer (referred to as Model Optimizer, or ModelOpt) is a library comprising state-of-the-art model optimization techniques including quantization, distillation, pruning, …
GitHub - amd/Quark
AMD Quark provides examples of Language Model and Image Classification model quantization, which can be found under examples/torch/ and examples/onnx/. These examples are documented here:
Automatic Mixed Precision package - torch.amp — PyTorch 2.11 …
Jun 12, 2025 · Ordinarily, “automatic mixed precision training” with datatype of torch.float16 uses torch.autocast and torch.amp.GradScaler together, as shown in the Automatic Mixed Precision …
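The snippet mentions pairing torch.autocast with torch.amp.GradScaler. A minimal numeric sketch of why the scaler is needed, using NumPy float16 to stand in for half-precision gradients (this is an illustration of the underflow problem, not the torch.amp API):

```python
import numpy as np

# float16's smallest subnormal is 2**-24 (~5.96e-8); anything much
# smaller flushes to zero when cast down, so the gradient is lost.
tiny_grad = np.float32(1e-8)
lost = np.float16(tiny_grad)          # underflows to 0.0

# GradScaler-style trick: multiply the loss (and hence gradients) by a
# large scale before the half-precision step, then unscale in float32.
loss_scale = np.float32(2.0 ** 16)
scaled = np.float16(tiny_grad * loss_scale)   # ~6.55e-4, representable
recovered = np.float32(scaled) / loss_scale   # back near 1e-8
```

In the real API, GradScaler also skips the optimizer step and shrinks the scale when scaled gradients overflow to inf/NaN.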
Quantization - Hugging Face
Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer (int8) instead …
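As the snippet notes, the main payoff of int8 over float32 is cost: a quick back-of-the-envelope sketch in plain Python (the 7B parameter count is an illustrative assumption, and real checkpoints carry extra bytes for scales and zero points):

```python
def model_bytes(num_params, bits_per_param):
    """Storage for the weights alone, ignoring quantization metadata."""
    return num_params * bits_per_param // 8

params = 7_000_000_000          # e.g. a 7B-parameter language model
fp32_bytes = model_bytes(params, 32)   # 28 GB of weights
int8_bytes = model_bytes(params, 8)    # 7 GB of weights
reduction = fp32_bytes // int8_bytes   # 4x smaller
```

The same 4x factor applies to memory bandwidth during inference, which is why int8 often speeds up memory-bound workloads even before considering faster integer arithmetic.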