experimental/cuda-ubi9/: vllm versions
Latest version on stage is: 0.4.2
A high-throughput and memory-efficient inference and serving engine for LLMs
| Index | Version | Documentation |
|---|---|---|
| experimental/cuda-ubi9 | 0.4.2 | |