vLLM
Inference tool
High-performance LLM inference and serving engine
Supported languages: Python (library API); any language via the OpenAI-compatible HTTP API
Pros and Cons
Pros
- Very high serving throughput
- PagedAttention (paged KV-cache management that reduces memory fragmentation)
- Continuous batching of incoming requests
- OpenAI-compatible API server
Cons
- Inference only (no training or fine-tuning)
- GPU required for practical performance
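Because the server speaks the OpenAI chat-completions protocol, clients need no vLLM-specific SDK. A minimal sketch of building a request payload for a locally running server (the URL, model name, and prompt are placeholder assumptions, not values from this document):

```python
import json

# Assumed local endpoint exposed by the vLLM OpenAI-compatible server.
VLLM_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-compatible chat-completions payload.

    vLLM accepts the same JSON body as the OpenAI API, so this dict can be
    POSTed to VLLM_URL with any HTTP client.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


payload = build_chat_request("my-model", "Summarize PagedAttention in one line.")
print(json.dumps(payload, indent=2))
```

The same payload works against the official OpenAI endpoint, which is what makes drop-in migration to a self-hosted vLLM server possible.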
Use Cases
- LLM serving
- Inference at scale
- Backing model APIs
- Production deployments
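As a sketch of the production-serving use case above (assuming vLLM is installed and a CUDA GPU is available; the model name and port are placeholders):

```shell
# Launch an OpenAI-compatible server on port 8000.
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000

# Query it over the standard chat-completions route.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.1-8B-Instruct",
       "messages": [{"role": "user", "content": "Hello"}]}'
```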