CI Benchmark

This is the performance benchmark for the vLLM GitHub repository. It is intended for developers.

Hardware

Please select a hardware type. All plots below show results only for the selected hardware (more hardware types are coming).

Smoothing

Each point on the curve is the average over the most recent commits. This setting controls how many commits are averaged (default: 10).
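
As a rough illustration of this kind of smoothing, the sketch below computes a trailing moving average over per-commit measurements. It assumes the window ends at the current commit and that early points are averaged over however many commits are available so far. The dashboard's actual handling of these details is not stated here, and the function name `smooth_last_commits` is hypothetical.

```python
from typing import List, Sequence


def smooth_last_commits(values: Sequence[float], window: int = 10) -> List[float]:
    """Trailing moving average over per-commit measurements.

    Point i is the mean of up to `window` values ending at index i.
    Assumption: the first few points use a shorter window instead of
    being dropped; the real dashboard may behave differently.
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    smoothed: List[float] = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        # Drop the value that just fell out of the trailing window.
        if i >= window:
            running_sum -= values[i - window]
        count = min(i + 1, window)
        smoothed.append(running_sum / count)
    return smoothed
```

A larger window gives a flatter curve that hides per-commit noise but reacts more slowly to a real regression. A window of 1 returns the raw measurements.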


TL;DR





Latency tests

Description

This test suite measures vLLM's end-to-end latency under a controlled setup.

Plot


Throughput tests

Description

This test suite measures vLLM's throughput.

Plot


Serving Benchmark (on ShareGPT)

Description

This test suite measures vLLM's serving metrics on the ShareGPT dataset.

Latency plot

Throughput plot

Full Data