Could you say more about the difficulties you foresee? I’m guessing that Bloggingheads would have a separate audio stream from each microphone, which should make it somewhat easier. But even without that, figuring out which speaker is which doesn’t seem beyond the realm of what audio processing can do.
I may have been overly pessimistic. I didn’t think about the separate feeds, and you’re right that they would make things easier.
There might be questions about who has the “right” to be speaking at a given moment, since that would define what constitutes an interruption.
Need it be more complex than: person A begins to speak while person B is still speaking? It might get a few false positives, but it should be a useful metric overall.
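To make that simple rule concrete, here is a rough Python sketch. It assumes the two audio feeds have already been turned into per-frame lists of booleans (True when that speaker is talking); how that voice-activity detection would be done is left out, and the function name `count_interruptions` and the input format are made up for illustration.

```python
def count_interruptions(a_active, b_active):
    """Count the times speaker A starts talking while speaker B is still talking.

    a_active, b_active: per-frame booleans (hypothetical input format),
    True when that speaker is producing speech in the frame.
    """
    count = 0
    for i in range(1, min(len(a_active), len(b_active))):
        # A goes from silent to speaking at frame i
        a_starts = a_active[i] and not a_active[i - 1]
        # B was speaking in the previous frame and has not stopped
        b_still_talking = b_active[i] and b_active[i - 1]
        if a_starts and b_still_talking:
            count += 1
    return count


# Toy example: A cuts in on B twice, B never cuts in on A.
a = [False, False, True, True, False, False, True, True]
b = [True, True, True, False, False, True, True, True]
print(count_interruptions(a, b))  # A interrupting B
print(count_interruptions(b, a))  # B interrupting A
```

As noted, this would count brief overlaps such as "mm-hm" or laughter as interruptions, so a real version would probably want a minimum overlap length before counting one.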
I think people just use standard video-editing software to combine the videos and their audio streams before uploading them.