Maybe. I think there’s a level on which we ultimately demand that an AI’s perception of values be filtered through a human lens. If you zoom out too far from the human perspective, things start getting really weird. For instance, if you try to reason for the betterment of all life in a truly species-agnostic way, you start getting highly plausible arguments for leaving bacterial or fungal infections untreated, since the human host is only one organism while the pathogens number in the millions of individuals. (Yes, this is slippery-slope shaped, but special-casing animal welfare seems as arbitrary as special-casing human welfare.)
Anyways, the AI’s idea of what humans are is based heavily on snapshots of the recent internet, and that’s bursting with examples of humans desiring animal welfare. So if a model trained on that understanding of humanity’s goals attempts to reason about whether it’s good to help animals, it had better conclude that humans will probably benefit from animal welfare improvements, or something has gone horribly wrong. Do you think it’s realistically plausible for humanity to develop into a species we’d still recognize as human, but in which no individual prefers happy cute animals over sad ones? I don’t.
I fear it would be a stupid mistake to pass on the opportunity to inquire: Where does one find the form?