r_claypool comments on The Sequences in MP3 Format

r_claypool Jul 11, 2011, 7:04 AM
0 points
Oh. I understand now. I’ve tried a few text-to-speech engines and AT&T Natural Voices sound the best to my ears. I will find the terms of use and pricing for that.
- wedrifid Jul 11, 2011, 6:48 PM
  1 point
  Parent
  Please do. If they are sufficiently cheap I will see about getting someone here to allow me to implement an automatic audio version of either just the early Eliezer posts (including sequences) or as a feature of all posts. This would be massively valuable for many of us.
  
  In fact, if price is prohibitive I wonder if it would be worth implementing a free (less natural sounding) text-to-speech converter.
  - r_claypool Aug 1, 2011, 6:07 AM
    2 points
    Parent
    Other questions to resolve:
    
    Where should the files be hosted? (Does LW have the bandwidth?)
    Is LW exempt from MP3 licensing? (I hope so)
    Where should the download links be placed? (A wiki page is fine, but it will be less discoverable.)
    Which posts should be completed first?
    - wedrifid Aug 1, 2011, 9:19 AM
      0 points
      Parent
      
      Where should the files be hosted? (Does LW have the bandwidth?)
      
      Probably; it isn’t a huge amount. If not, I have half a dozen servers floating around the place. They cost a pittance.
      
      Is LW exempt from MP3 licensing? (I hope so)
      
      Probably, but I know less than you do.
      
      Where should the download links be placed? (A wiki page is fine, but it will be less discoverable.)
      
      A wiki page sounds good for now. If people find it especially useful we can work from there. (I may create an RSS feed or podcast at some stage if I feel inspired.)
      
      Which posts should be completed first?
      
      Whatever you happen to care about.
  - r_claypool Aug 1, 2011, 6:00 AM
    0 points
    Parent
    I have price quotes for Acapela, Cepstral, Wizzard (AT&T Voices), Neospeech, and Nuance RealSpeak. The range is from $1,000 to $15,000 USD.
    
    Open source options are eSpeak (robotic), Festival (robotic), FreeTTS (robotic), Pico and others.
    
    Pico is part of Android and it sounds more natural than other open source options I tried. Pico is licensed under Apache 2.0. Here’s a demo.
    
    The commercial voices are definitely better; Loquendo is a good example.
    
    So now I can start converting via Pico or try to get funding for a more natural voice. Thoughts?
    - wedrifid Aug 1, 2011, 9:20 AM
      0 points
      Parent
      
      So now I can start converting via Pico or try to get funding for a more natural voice. Thoughts?
      
      Start with Pico, I guess. Then we can possibly upgrade in the future.
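
For anyone who wants to try the Pico route discussed above, here is a minimal sketch of what a batch conversion could look like. It is not something the thread settled on. It assumes the `pico2wave` command-line tool (a wrapper around Pico shipped by some Linux distributions) and the `lame` MP3 encoder are both installed. The function names, the 2000-character chunk size, and the output file naming are all hypothetical choices made for illustration.

```python
import subprocess
from pathlib import Path


def split_into_chunks(text, max_chars=2000):
    """Group paragraphs into chunks of roughly max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    The 2000-character default is an arbitrary assumption, not a Pico limit.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def pico_command(text, wav_path, lang="en-US"):
    """Build the pico2wave invocation that writes `text` to a WAV file."""
    return ["pico2wave", "-l", lang, "-w", str(wav_path), text]


def lame_command(wav_path, mp3_path):
    """Build the lame invocation that encodes a WAV file to MP3."""
    return ["lame", "--quiet", str(wav_path), str(mp3_path)]


def convert_post(text, out_dir, stem):
    """Convert one post to a numbered series of MP3 files; return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    mp3_paths = []
    for i, chunk in enumerate(split_into_chunks(text), start=1):
        wav = out_dir / f"{stem}-{i:03d}.wav"
        mp3 = out_dir / f"{stem}-{i:03d}.mp3"
        subprocess.run(pico_command(chunk, wav), check=True)
        subprocess.run(lame_command(wav, mp3), check=True)
        wav.unlink()  # keep only the MP3
        mp3_paths.append(mp3)
    return mp3_paths
```

Calling `convert_post(post_text, "audio", "some-post")` would then leave files such as `audio/some-post-001.mp3` behind. Splitting on paragraph boundaries keeps each synthesis call small and avoids cutting a sentence in half; swapping in a commercial voice later would only mean replacing `pico_command`.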