Thanks! Yes, I am aware. Please, I would encourage you to listen to the difference. I am running this through ElevenLabs, which is currently (IMO) at the forefront of humanlike voices; it produces lifelike tone and cadence based on cues. I also go through and assign every unique quoted person a unique voice, to ensure clear differentiation when listening, alongside extracting text from images and providing a description when appropriate.
I really do implore you, please, have a listen to a section and reply back how you find it in comparison.
I do think the ElevenLabs voice is a bunch better for a lot of the text. My understanding is that stuff like LaTeX is very hard to do with it, and that various other things like image captions and other fine-tuning that T3Audio has done make the floor of its quality a bunch worse than the current audio. But I think I agree that for posts like this, it does seem straightforwardly better than the T3Audio narration.
Yeah, just plugging any old unadjusted text straight into ElevenLabs would get you funky results. I really like Zvi’s posts though. I think they are high quality, combining both great original content and a fantastic distillation of quoted views from across the sphere. I do a fair amount of work on the text of each post to ensure a good podcast episode gets made out of each one.