You meant speech-to-text, not text-to-speech. They only added the latter recently, but afaik we don’t know the model behind it.