Thanks for the feedback! I haven’t really estimated how long it would take to have a transcript with speech-to-text + minor corrections, but that’s definitely on the roadmap.
Re audio: cost of recording is probably like one hour (x2 if you have one guest). I think that if I were to write down the whole transcript without talking, it would easily take me 4-10x the time it takes me to say it. I’m not sure how much worse the quality is, though, but the way I see it, conversation is essentially collaborative writing where you get immediate feedback about the flaws in your reasoning. And even if I agree that a 1h podcast could be summarized in a few paragraphs, the use case is different (e.g. people cooking, running, etc.), so it needs to be somewhat redundant because people are not paying full attention.
Re not being interested in forecasting timelines: my current goal is to have people with different expertise share their insights on their particular field and how that could nuance our global understanding of technological progress. For instance, I had a 3h discussion with someone who did robotics competitions, and I have one planned with a neuroscience student turned ML engineer. I’m not that interested in “forecasting timelines” as an end goal; I’m more interested in digging into why people have those inside views about the future (assuming they unconsciously updated on things), so we can either discard wrong initial reasons for believing something, or gain insight into the actual evidence behind those beliefs.
Anyway, I understand that there’s a space for rigorous AI Alignment research discussions, which is currently covered by AXRP, and the 80k podcasts also cover a lot of it. But it seems relatively low-cost to just record the conversations I would have anyway during conferences, so people can decide for themselves which arguments are good and which are bad.
I haven’t really estimated how long it would take to have a transcript with speech-to-text + minor corrections, but that’s definitely on the roadmap.
FWIW: you can pay $1.25 per recorded minute to get rev.com to produce a somewhat inaccurate transcript in 2 days. It then takes me about 2x the length of the recorded audio to fix errors in that transcription. It’s kind of a pain in the ass, but worth it if a big chunk of your audience doesn’t listen to audio.
[EDIT: changed the thing I was responding to]
Note that the amount of manual fixing depends on the accent of the speaker. The 2x estimate comes from me interviewing someone with a foreign accent; I think for native speakers with standard accents it gets closer to 1x.
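To make the numbers above concrete, here is a rough back-of-the-envelope sketch. The function name and its parameters are made up for illustration; the $1.25 per minute price and the 1x-2x fixing factor are just the figures quoted in this thread, so treat the output as an estimate rather than a quote from rev.com.

```python
def transcript_estimate(audio_minutes, price_per_minute=1.25, fix_factor=2.0):
    """Rough cost and editing time for a paid transcript.

    price_per_minute: dollars per recorded minute (figure quoted in this thread).
    fix_factor: manual correction time as a multiple of the audio length
                (about 2x for a strong accent, closer to 1x otherwise).
    """
    cost_dollars = audio_minutes * price_per_minute
    fix_minutes = audio_minutes * fix_factor
    return cost_dollars, fix_minutes


# A one-hour episode with a guest who has a foreign accent.
cost, fix_minutes = transcript_estimate(60)
print(f"${cost:.2f} and about {fix_minutes / 60:.1f}h of fixing")
```

So, under those assumptions, a one-hour episode comes to roughly $75 plus a couple of hours of corrections.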
Thanks for all of those tips. I’ll definitely try rev!
CastingWords at least used to be accurate (the one time I used them, I don’t recall the transcript having had any flaws).
Ace, I’ll try that too, thanks!
I think it’s possible to just upload the video to YouTube, download its automatically generated subtitles with youtube-dl, and then convert those subtitles into plain text (using, e.g., https://github.com/NightMachinary/.shells/blob/master/scripts/python/vtt2txt2.py ).
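For the last step, a minimal sketch of what such a conversion might look like is below. This is not the linked script, just my own guess at the simplest approach, and it assumes the subtitles were saved as a `.vtt` file (with something like `youtube-dl --write-auto-sub --skip-download <url>`). The function name `vtt_to_text` is hypothetical. It drops the header and timing lines, strips inline tags, and skips consecutive repeated lines, since auto-generated captions tend to repeat each line across cues.

```python
import re

# Cue timing line, e.g. "00:00:01.000 --> 00:00:04.000 align:start"
TIMESTAMP = re.compile(r"\d{2}:\d{2}(?::\d{2})?\.\d{3}\s+-->\s+")
# Inline markup such as <c>, </c> or <00:00:01.500>
TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Turn WebVTT subtitle text into a single plain-text string."""
    lines = []
    for raw in vtt.splitlines():
        line = raw.strip()
        # Skip blank lines and the file header block.
        if not line or line.startswith(("WEBVTT", "Kind:", "Language:")):
            continue
        # Skip timing lines.
        if TIMESTAMP.match(line):
            continue
        line = TAG.sub("", line).strip()
        # Keep the line unless it is empty or repeats the previous one.
        if line and (not lines or lines[-1] != line):
            lines.append(line)
    return " ".join(lines)


sample = """WEBVTT
Kind: captions
Language: en

00:00:00.000 --> 00:00:02.000 align:start position:0%
hello<00:00:00.500><c> world</c>

00:00:02.000 --> 00:00:04.000
hello world
this is a test
"""
print(vtt_to_text(sample))
```

The result still has no punctuation or speaker labels, so it would need a manual pass like the paid transcripts do.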