A clarification about the sense in which I claim "biological and artificial neural networks are based upon the same fundamental principles":
I would not be surprised if the reasons why neural networks “work” are also exploited by the brain.
In particular, the reason I think neuroscience is a promising route to value alignment is that we can expect the values-encoding parts of the brain to be compatible with those same reasons, so implementing them in an artificial network shouldn't require many additional fundamental advances. Contrast this with, say, corrigibility, which would first have to be worked out for ideal utility maximizers and then mapped onto neural networks, a step that seems potentially as hard as writing an AGI from scratch.
Conversely, if human values turn out to be incompatible with artificial neural networks, I become much more pessimistic about every alternative approach to value alignment for neural networks as well.