I’m wondering if you could get enough precision to essentially visualize a matrix of buttons, look at the position of the imagined button in order to “select” it, blink or make a certain mouth movement to “click” it, etc.
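A minimal sketch of what that selection scheme might look like, assuming the device can provide a rough normalized gaze estimate and a discrete blink event (all names and inputs here are hypothetical, not a real BCI API):

```python
# Hypothetical sketch: map an estimated gaze position onto a grid of
# imagined "buttons", and treat a blink event as a click on the hovered cell.

def gaze_to_cell(x, y, rows, cols):
    """Map a normalized gaze position (x, y in 0..1) to a (row, col) cell."""
    row = min(int(y * rows), rows - 1)
    col = min(int(x * cols), cols - 1)
    return row, col

def on_blink(x, y, rows, cols, actions):
    """On a blink 'click', fire whatever action is bound to the hovered cell."""
    cell = gaze_to_cell(x, y, rows, cols)
    return actions.get(cell, "no-op")

# Example bindings for a 2x3 grid of mode switches (made-up action names).
actions = {(0, 0): "mode:loop", (1, 2): "mode:filter"}
```

The coarser the grid, the less gaze precision you’d need, which fits the mode-switching use better than real-time triggering.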
Maybe! This would definitely be nice if it worked. Probably better for switching the system between modes than triggering sounds in real time, though?
This makes me expect that some portion of your audience would have a worse time listening to you if the music you’re playing were mixed with commands the listeners are meant to ignore.
When using the mic in this mode I wouldn’t be sending it out to the hall. It wouldn’t be audible offstage.
see what speech-to-text can do with the result
I do think that’s worth doing, though only if I get far enough along to have speech-to-text running at all. Right now I think I’m probably just trying to use hardware that isn’t up to the task.