Kyutai offers an amazing set of streaming Speech-To-Text and Text-To-Speech models that excel in English and French. They are permissively licensed under CC-BY-4.0 and perform very well.
Venice already has Text-To-Speech, which is quite good, but it is missing streaming Speech-To-Text. It would be amazing to have a privacy-focused Speech-To-Text model.
Links:
Backlog
Feature Requests
Voice
8 months ago

Nicolas Embleton