When I wrote this I thought OAI was sort of fudging the audio output and was using SSML as an intermediate step.
After seeing details in the system card, such as the model copying the user’s voice, it’s clearly not fudging.
That makes me even more sure the above is going to end up prophetically correct.