Why do they separate out the auditory world and the environment?
No particularly strong reason—the main thing is that, when building these models, you also have to build a model of the environment that the system is interacting with. And the codebase for helping people build generic environments is mostly focused on handling key-presses and mouse-movements and visually looking at screens, while there’s a separate codebase for handling auditory stimuli and responses, since that’s a pretty different sort of behaviour.
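As a rough illustration of that split (a minimal sketch with hypothetical names, not the framework's actual API): the generic environment interface deals with screens, key-presses, and mouse movements, while auditory events are transient things with onsets and durations, so they tend to live behind a separate interface.

```python
# Hypothetical sketch: two separate environment interfaces, one for the
# visual/motor world and one for auditory stimuli. Names are illustrative only.
from dataclasses import dataclass, field


@dataclass
class VisualEnvironment:
    """Generic environment: screen contents plus keyboard/mouse actions."""
    screen_objects: list = field(default_factory=list)

    def display(self, obj):
        # Make an object visible for the model's visual system to attend to.
        self.screen_objects.append(obj)

    def key_pressed(self, key):
        # Called when the model's motor system presses a key.
        print(f"model pressed: {key}")

    def mouse_moved(self, x, y):
        # Called when the model's motor system moves the mouse.
        print(f"mouse at ({x}, {y})")


@dataclass
class AuditoryEnvironment:
    """Separate interface: sounds arrive and fade on their own timeline."""
    sounds: list = field(default_factory=list)

    def present_sound(self, sound, onset, duration):
        # Auditory events are transient, so they carry timing information
        # rather than a spatial layout on a screen.
        self.sounds.append((sound, onset, onset + duration))

    def audible_at(self, t):
        # Only sounds whose window covers time t can currently be heard.
        return [s for s, start, end in self.sounds if start <= t <= end]


# A task can wire up both, but the code that reads a screen and presses keys
# never needs to know how sounds are represented, and vice versa.
visual = VisualEnvironment()
audio = AuditoryEnvironment()
visual.display("stimulus word")
audio.present_sound("beep", onset=0.5, duration=0.2)
print(audio.audible_at(0.6))
```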