Can the process not be automated? Like, sheet music specifies notes, right? And notes are frequencies. And frequencies can be determined by examining a recording by means of appropriate hardware/software (very easily, in the case of digital recordings, I should think). Right? So, is there not some software or something that can do this?
One thing that makes this hard to automate is human imprecision in generating a recording, especially with rhythm: notes encode frequencies but also timings and durations, and humans performing a song will never get those exactly right (nor should they; good performance tends to involve being a little free with rhythms in ways that shouldn’t be directly reflected in the sheet music), so any automatic transcriber will produce silly-looking, slightly-off rhythms that still need judgment to adjust.
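To make the "silly-looking rhythms" point concrete, here is a rough sketch in Python. The onset times are made up for illustration (four quarter notes at 120 bpm, played slightly loosely), and `quantized_durations` is a hypothetical helper, not part of any real transcription tool. It snaps onsets to a grid and reports note lengths in beats:

```python
from fractions import Fraction

def quantized_durations(onsets_sec, bpm, steps_per_beat):
    """Snap onset times to a grid and return note lengths in beats."""
    # Convert seconds to beats at the assumed (fixed) tempo.
    beats = [t * bpm / 60 for t in onsets_sec]
    # Round each onset to the nearest grid step.
    snapped = [Fraction(round(b * steps_per_beat), steps_per_beat) for b in beats]
    # A note's length is the gap to the next onset.
    return [b - a for a, b in zip(snapped, snapped[1:])]

# Invented performance data: ideal onsets would be 0.0, 0.5, 1.0, 1.5, 2.0.
played = [0.00, 0.52, 0.97, 1.55, 2.01]

fine = quantized_durations(played, bpm=120, steps_per_beat=16)
coarse = quantized_durations(played, bpm=120, steps_per_beat=2)

print([str(d) for d in fine])    # ['17/16', '7/8', '19/16', '7/8']
print([str(d) for d in coarse])  # ['1', '1', '1', '1']
```

A fine grid faithfully records the wobble and gives unreadable note values; a coarse grid recovers the four quarter notes, but picking the tempo and the grid is exactly the judgment call.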
This seems solvable by using multiple recordings and averaging, yes?
Also, if the transcription to sheet-music form is accurate w.r.t. the recording, and the recording is acceptable w.r.t. the intended notes, then the transcription ought to be close enough to the intended notes. Or am I misunderstanding?
re point 1 - maybe? unsure
[edit: one issue is that some irregularities will in fact be correlated across takes and STILL shouldn’t be written down—like, sometimes a song will slow down gradually over the course of a couple measures, and the way to deal with that is to write the notes as though no slowdown is happening and then write “rit.” (means “slow down”) over the staff, NOT to write gradually longer notes; this might be tunable post facto but I think that itself would take human (or really good AI) judgment that’s not necessarily much easier than just transcribing it manually to start]
re point 2 - the thing is you’d get a really irregular-looking, hard-to-read score that nobody could sight-read. (actually this is already somewhat true for a lot of folk-style songs that sound intuitive but look really confusing when written down)
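A small sketch of why averaging doesn't help with the "rit." case: if every take slows down in the same place, the average slows down too. The three takes below are invented numbers (onset times in seconds), and `mean_intervals` is just an illustrative helper:

```python
def mean_intervals(takes):
    """Average the gaps between onsets, position by position, across takes."""
    per_take = [[b - a for a, b in zip(t, t[1:])] for t in takes]
    return [sum(col) / len(col) for col in zip(*per_take)]

# Three made-up takes of the same five notes, each with a gradual slowdown.
takes = [
    [0.00, 0.50, 1.03, 1.62, 2.30],
    [0.00, 0.51, 1.05, 1.65, 2.31],
    [0.00, 0.49, 1.02, 1.60, 2.27],
]

avg = mean_intervals(takes)
print([round(x, 2) for x in avg])  # each gap is longer than the one before
```

Averaging cancels the random wobble between takes but keeps the shared slowdown, so a literal transcription would still write progressively longer notes instead of equal notes under a "rit." marking.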
You’d think, but I wasn’t able to find such a thing despite looking pretty hard a few years ago; there might be a more recent AI approach to this though. A useful search term might be “audio to midi conversion”. (Stem separation, for which Spleeter works well, might be a necessary preprocessing step.)
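For what it's worth, the "notes are frequencies" step on its own is the easy part. A minimal sketch, assuming equal temperament with A4 tuned to 440 Hz and assuming the frequency has already been extracted from the audio somehow (the function name `freq_to_note` is made up for this example):

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(freq_hz, a4_hz=440.0):
    """Map a frequency to the nearest MIDI note number and note name."""
    # MIDI note 69 is A4; each semitone is a factor of 2**(1/12) in frequency.
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1  # MIDI note 60 is C4 in this convention
    return midi, f"{NOTE_NAMES[midi % 12]}{octave}"

print(freq_to_note(440.0))   # (69, 'A4')
print(freq_to_note(261.63))  # (60, 'C4')
print(freq_to_note(445.0))   # (69, 'A4'), slightly sharp but still nearest to A4
```

This only handles one clean pitch at a time. Getting reliable frequencies out of a real recording with several instruments, and then deciding on rhythm, key, and spelling (A# vs. Bb), is where the difficulty lives.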