Ask HN: What Inference Server do you use to host TTS Models?


All the examples I have found are highly unoptimized. For example, Modal Labs uses FastAPI (https://modal.com/docs/examples/chatterbox_tts), and BentoML also uses a FastAPI-style service (https://www.bentoml.com/blog/deploying-a-text-to-speech-application-with-bentoml).

Even Chatterbox TTS has a very naive example - https://github.com/resemble-ai/chatterbox

The Triton Inference Server docs don’t have a TTS example.

I am 100% certain that a highly optimized variant can be written with Triton Server, utilizing model concurrency and batching.
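For context, both features mentioned above (model concurrency and dynamic batching) are enabled declaratively in Triton's per-model config.pbtxt rather than in serving code. A minimal sketch of what that might look like for a TTS model behind the Python backend - the model name, tensor names, and shapes here are hypothetical placeholders, not taken from any of the linked examples:

```protobuf
# config.pbtxt - hypothetical TTS model config (names/shapes are illustrative)
name: "tts_model"
backend: "python"
max_batch_size: 8

# Dynamic batching: Triton coalesces individual requests into batches,
# waiting up to 5 ms to fill one before dispatching.
dynamic_batching {
  max_queue_delay_microseconds: 5000
}

# Model concurrency: run two instances of the model on the GPU so
# requests can execute in parallel.
instance_group [
  {
    count: 2
    kind: KIND_GPU
  }
]

input [
  {
    name: "TEXT"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "AUDIO"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
```

The open question is less the config and more whether a given TTS model's generation loop batches well, which is why a worked example would help.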

If someone has implemented a TTS service with Triton Server, or can recommend a better inference server to deploy with, please help me out here. I don’t want to reinvent the wheel.
