PaddlePaddle/FastDeploy
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
Monitoring
Repository Intelligence
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
A high-throughput and memory-efficient inference and serving engine for LLMs
SGLang is a high-performance serving framework for large language models and multimodal models.