I think the hypothesis of high orca intelligence is interesting and plausible. I am all in favor of low-cost, low-risk, long-shot experiments to learn more about other forms of intelligence, whether they end up instrumentally useful to humans or not. Personally, I'd be curious just to know whether this kind of attempt can even catch the interest of orcas, and if so, to what degree. Even if you stuck to simple two- and three-word declarative sentences, it would be interesting to see whether they intuit the categories of noun and verb, and whether they can learn the referent of a word from a picture or video without acoustic, olfactory, or stereo visual data.
You mention languages that don't have articles, but keep in mind that this is not rare. Most East Asian, Slavic, and Bantu languages don't have them; Latin and Sanskrit don't/didn't. Mandarin (the only one of these I know even a little) does use demonstrative adjectives like "this one" and "that one," though, and makes extensive use of measure words (e.g. 三条鱼 sān tiáo yú, "three [long-thin-thing] fish") for some of the same purposes as English uses articles. If articles aren't necessary among humans, they shouldn't be necessary for other mind types in general.
Another factor to consider is that some languages, like Mandarin, have no word modifications for case/number/gender/tense/person markers; these are expressed with extra words where necessary and not at all otherwise. That's probably a feature you want when trying to teach with few examples.
When looking at where to do this, consider that in many/most places, orcas are protected species with rules about approaching them and interfering with their natural behavior/environment.
Edit to add: I think if any project like this succeeds, what you'll be teaching is not so much a language as a pidgin. Whether it can become a creole, and whether we can learn to incorporate orca vocabulary and grammar to find a medium of communication both species can reproduce and interpret, are much bigger questions and longer-term projects.
Aside: Do we have enough recordings of orca calls to train an LLM on them?
Thanks for your thoughts!
I don’t know what you’d consider enough recordings, and I don’t know how much decent data we have.
I think the biggest datasets of orca vocalizations are the Orchive and the Orcasound archive. I think each is multiple terabytes of audio recordings, but most of it (80-99.9%?) is probably crap, where there might just be a brief, very faint mammal vocalization in the distance.
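To give a concrete sense of what triaging such an archive might look like, here is a minimal sketch, assuming plain audio files and using band-limited energy as a crude detector. The band edges and the threshold are guesses on my part, not calibrated values, and a real pipeline would want a trained classifier:

```python
# Crude triage of a hydrophone archive: flag clips whose energy in a
# rough "orca call" band stands out above the background noise floor.
# Band edges and threshold are guesses, not calibrated values.
import numpy as np
import librosa

CALL_BAND = (500.0, 10000.0)  # Hz; rough range for orca calls

def has_candidate_vocalization(path, thresh_db=10.0):
    y, sr = librosa.load(path, sr=None, mono=True)
    S = np.abs(librosa.stft(y, n_fft=2048, hop_length=512)) ** 2
    freqs = librosa.fft_frequencies(sr=sr, n_fft=2048)
    band = (freqs >= CALL_BAND[0]) & (freqs <= CALL_BAND[1])
    band_energy = S[band].sum(axis=0)   # in-band energy per frame
    floor = np.median(band_energy)      # rough noise-floor estimate
    peak = band_energy.max()
    # Keep the clip if some frame rises thresh_db above the floor.
    return 10 * np.log10(peak / (floor + 1e-12)) > thresh_db
```

Even something this dumb might cut the archive down by an order of magnitude before a human or a better model has to look at anything.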
We also don’t have a way to see which orca said what.
Also, orcas from different regions have different languages, and orcas from different pods have different dialects.
I currently think the decoding path would be slower. And yes, the decoding part would involve AI, but I feel like people often just try to use AI somehow without a clear plan. Perhaps not you, though.
What approach did you imagine?
In case you're interested in a small amount of high-quality data (still without annotations): https://orcasound.net/data/product/biophony/SRKW/bouts/
I have nowhere near the technical capability to have anything like a clear plan, and your response is basically what I expected. I was just curious. It seems like it could be another cheap "who knows, let's see what happens" thing to try, with little to lose if it doesn't help anyone with anything. Still: can we distinguish individuals in unlabeled recordings? Can we learn about meaning and grammar (or its equivalent) based in part on differences between languages and dialects?
At root, my thought process amounted to: we have a technology that learns complex structures, including languages, from data, without the benefit of the structural predispositions of human brains. If we could get a good enough corpus, it could presumably also learn things other than human languages, and find approximate mappings to human languages. I assumed we wouldn't have such data in this case. That's as far as I got before I posted.
Currently we basically don't have any datasets where it's labelled which orca says what. When I listen to recordings, I cannot distinguish voices, though it's possible that people who have listened a lot more can. I think unsupervised voice clustering alone would probably not work very accurately. I'd guess it's possible to get who-said-what data by using an array of hydrophones to infer the location of each sound, but we'd need very accurate position inference, because different orcas are often just 1-10 m from each other. For that we might need to get or infer decent estimates of how water temperature varies by depth, and in general there have not yet been attempts to get high precision out of this method. (It's definitely harder in water than in air.)
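For concreteness, the standard machinery here is time-difference-of-arrival (TDOA) localization: cross-correlate hydrophone pairs to estimate delays, then solve for the source position whose predicted delays fit best. A minimal sketch, assuming a single constant sound speed, which is precisely the simplification a temperature-by-depth profile would have to replace to get anywhere near 1-10 m accuracy:

```python
# TDOA source localization with a hydrophone array: estimate each
# phone's delay relative to a reference by cross-correlation, then
# least-squares fit the source position. Needs >= 4 hydrophones for
# a 3-D fix; assumes constant sound speed (no refraction).
import numpy as np
from scipy.signal import correlate
from scipy.optimize import least_squares

C = 1500.0  # m/s, nominal speed of sound in seawater

def delay(a, b, sr):
    # Positive result means signal a arrives after signal b.
    lag = np.argmax(correlate(a, b, mode="full")) - (len(b) - 1)
    return lag / sr

def locate(signals, hydrophones, sr):
    """signals: list of 1-D arrays; hydrophones: (N, 3) positions in m."""
    delays = np.array([delay(s, signals[0], sr) for s in signals[1:]])

    def residuals(x):
        d = np.linalg.norm(hydrophones - x, axis=1)  # source-to-phone distances
        return (d[1:] - d[0]) / C - delays           # predicted minus measured

    return least_squares(residuals, hydrophones.mean(axis=0)).x
```

With sound moving at ~1500 m/s, 1 m of position error corresponds to well under a millisecond of timing error, so the clock sync, the array geometry, and the sound-speed profile all have to be very good.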
Yeah, I initially had rough thoughts in this direction too, but I think the create-and-teach-a-language route is probably a lot faster.
I think the Earth Species Project is trying to use AI to decode animal communication, though they don't focus on orcas in particular but on many species, including e.g. beluga whales. I haven't looked into it much, but it seems possible I could do something like this in a smarter and more promising way, though it would probably still take a long time.
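If I did go down the decoding route, the mechanical shape I'd expect is the one textless-NLP pipelines use for human speech: discretize the audio into "pseudo-tokens," then train an ordinary sequence model on the token streams. A minimal sketch, with k-means over mel-spectrogram frames standing in for the learned (wav2vec-style) features a serious attempt would use; every parameter here is illustrative, not tuned:

```python
# Sketch: turn recordings into discrete token sequences so a standard
# sequence model (e.g. a small transformer) could be trained on them.
# K-means over mel frames is a crude stand-in for learned features.
import numpy as np
import librosa
from sklearn.cluster import KMeans

def audio_to_tokens(paths, n_tokens=256, sr=16000):
    frames = []
    for p in paths:
        y, _ = librosa.load(p, sr=sr, mono=True)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
        frames.append(librosa.power_to_db(mel).T)  # (time, 64)
    km = KMeans(n_clusters=n_tokens, n_init=10).fit(np.concatenate(frames))
    # Each recording becomes a sequence of cluster IDs; collapsing
    # runs of repeated tokens (run-length encoding) usually helps.
    return [km.predict(f) for f in frames], km
```

Whether the resulting token statistics would contain anything language-like is exactly the open question.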