GPTQ
quantization technique
Post-training quantization method for LLMs
Pros and Cons
Advantages
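- High compression (typically 3 or 4 bits per weight) with low quality loss
- Fast inference when optimized low-bit GPU kernels are used
- Widely supported across LLM inference libraries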
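
To make the compression claim concrete, here is a rough back-of-the-envelope sketch of weight memory at 16 bits versus 4 bits per weight. The 7-billion-parameter model size is a hypothetical example, and the estimate ignores the per-group scales and zero points that real quantized formats store, so actual savings will be somewhat smaller.

```python
def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30

# Hypothetical model size, chosen only for illustration.
params = 7_000_000_000

fp16 = weight_memory_gib(params, 16)  # unquantized half-precision weights
int4 = weight_memory_gib(params, 4)   # 4-bit quantized weights, no metadata

print(f"FP16: {fp16:.2f} GiB, 4-bit: {int4:.2f} GiB, ratio: {fp16 / int4:.1f}x")
```

Activations and the KV cache are not included, so this is a lower bound on the memory actually needed at inference time.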
Disadvantages
- Slow quantization process compared with simple round-to-nearest quantization
- Requires a calibration dataset
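
The role of calibration data can be sketched as follows: sample inputs are passed through a layer so that the error introduced by quantization can be measured on the layer's output. The sketch below is not GPTQ itself. It uses plain round-to-nearest quantization on one weight row and only measures the output error on random stand-in "calibration" inputs; GPTQ goes further and adjusts the remaining weights to compensate for that error. The function names and the random data are assumptions made for illustration.

```python
import random

def quantize_rtn(weights, bits=4):
    """Round-to-nearest uniform quantization with one scale and offset per row."""
    lo, hi = min(weights), max(weights)
    levels = 2**bits - 1
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for a constant row
    return [round((w - lo) / scale) * scale + lo for w in weights]

def output_error(weights, quantized, calibration_inputs):
    """Mean squared difference between original and quantized layer outputs."""
    total = 0.0
    for x in calibration_inputs:
        y = sum(w * xi for w, xi in zip(weights, x))
        y_q = sum(w * xi for w, xi in zip(quantized, x))
        total += (y - y_q) ** 2
    return total / len(calibration_inputs)

random.seed(0)
# Stand-in data: one weight row and 128 random "calibration" input vectors.
row = [random.gauss(0, 1) for _ in range(64)]
calib = [[random.gauss(0, 1) for _ in range(64)] for _ in range(128)]

err4 = output_error(row, quantize_rtn(row, bits=4), calib)
err8 = output_error(row, quantize_rtn(row, bits=8), calib)
print(f"4-bit output error: {err4:.6f}, 8-bit output error: {err8:.6f}")
```

In practice the calibration inputs would be activations from real text samples rather than random numbers, since the measured error depends on the input distribution.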
Use Cases
- LLM deployment
- Inference on GPUs with limited memory
- Compressed models