About 122,000 results
Open links in new tab
  1. Quantization (signal processing) - Wikipedia

    In mathematics and digital signal processing, quantization is the process of mapping input values from a large set (often a continuous set) to output values in a (countable) smaller set, often with a finite …

  2. What Is Quantization? | How It Works & Applications

    Quantization is the process of mapping continuous infinite values to a smaller set of discrete finite values. In the context of simulation and embedded computing, it is about approximating real-world …

  3. What is Quantization - GeeksforGeeks

    Nov 6, 2025 · Quantization is a model optimization technique that reduces the precision of numerical values such as weights and activations in models to make them faster and more efficient.

  4. Model Quantization: Concepts, Methods, and Why It Matters

    Nov 24, 2025 · Quantization has emerged as a crucial technique to address this challenge, enabling resource-intensive models to run on constrained hardware. The NVIDIA TensorRT and Model …

  5. What is quantization? - IBM

    Quantization is the process of reducing the precision of a digital signal, typically from a higher-precision format to a lower-precision format. This technique is widely used in various fields, including signal …

  6. "What is Quantization? Running AI Models Faster and Cheaper"

    Quantization has become the secret weapon for deploying large language models at scale, transforming AI from an expensive cloud-only technology into something that runs efficiently on edge devices and …

  7. What is quantization in machine learning? - Cloudflare

    What is quantization in machine learning? Quantization is a technique for lightening the load of executing machine learning and artificial intelligence (AI) models. It aims to reduce the memory …

  8. A deep dive into Quantization: Key to Open Source LLM Deployments

    3 days ago · LLM quantization is typically inference-time weight quantization: converting weights stored in FP16/FP32 into lower-precision formats such as INT8 or INT4. This reduces VRAM usage and can …

  9. What Is Quantization? Optimizing Data Compression - Coursera

    Oct 16, 2025 · Quantization converts high-precision data into lower-precision data by compressing it to reduce data loss. By optimizing quantization, you can reduce your model's computational burden …

  10. What is Quantization and Why It Matters for AI Inference?

    Jul 20, 2025 · Among many optimization techniques to improve AI inference performance, quantization has become an essential method when deploying modern AI models into real-world services.