
vLLM

inference tool

High-performance LLM inference engine


Pros and Cons

Pros

  • + High serving throughput
  • + PagedAttention memory management
  • + Continuous batching of incoming requests
  • + OpenAI-compatible API server
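The PagedAttention idea behind the throughput claims above can be sketched in plain Python: the KV cache is split into fixed-size blocks, and each sequence keeps a block table mapping its logical positions to physical blocks, so memory is allocated on demand instead of reserved contiguously. This is a simplified, hypothetical illustration, not vLLM's actual implementation; all names (`BlockAllocator`, `Sequence`, `BLOCK_SIZE`) are invented for the sketch.

```python
# Simplified sketch of paged KV-cache management (NOT vLLM's real code).
BLOCK_SIZE = 4  # tokens per KV-cache block; the real block size differs

class BlockAllocator:
    """Pool of physical KV-cache blocks (hypothetical helper)."""
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))  # free physical block ids

    def allocate(self):
        return self.free.pop()

    def release(self, block_id):
        self.free.append(block_id)

class Sequence:
    """One request's growing KV cache, addressed via a block table."""
    def __init__(self, allocator):
        self.allocator = allocator
        self.block_table = []  # logical block index -> physical block id
        self.num_tokens = 0

    def append_token(self):
        # Grab a new physical block only when the current one is full,
        # so memory grows in block-sized steps instead of being
        # pre-reserved for the maximum sequence length.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.block_table.append(self.allocator.allocate())
        self.num_tokens += 1

    def free(self):
        # Return all blocks to the pool when the request finishes.
        for block_id in self.block_table:
            self.allocator.release(block_id)
        self.block_table.clear()

allocator = BlockAllocator(num_blocks=8)
seq_a, seq_b = Sequence(allocator), Sequence(allocator)
for _ in range(6):  # sequence A: 6 tokens -> 2 blocks
    seq_a.append_token()
for _ in range(3):  # sequence B: 3 tokens -> 1 block
    seq_b.append_token()
print(len(seq_a.block_table), len(seq_b.block_table), len(allocator.free))
```

Because blocks are released as soon as a sequence finishes, other in-flight requests can reuse them immediately, which is also what makes continuous batching practical.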

Cons

  • - Inference only (no training or fine-tuning)
  • - Requires a GPU

Use Cases

  • LLM serving
  • Inference at scale
  • Model APIs
  • Production
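For the serving and model-API use cases above, vLLM exposes an OpenAI-compatible HTTP server. A minimal launch-and-query sketch, assuming vLLM is installed and a GPU is available; the model name and port here are example placeholders, not a recommendation:

```shell
# Start the OpenAI-compatible server (model name is an example placeholder).
vllm serve Qwen/Qwen2.5-0.5B-Instruct --port 8000 &

# Query it with the standard OpenAI chat-completions request shape.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen2.5-0.5B-Instruct",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```

Because the endpoint follows the OpenAI API shape, existing OpenAI client libraries can be pointed at it by changing only the base URL.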
