I’ve been thinking recently that AI alignment might be better thought of as a subfield of cognitive science than of either AI (since AI focuses on artificial agents, not human values) or philosophy (since philosophy is too open-ended); cognitive science is a finite endeavor (due to the limited size of the human brain) and is compatible with executable philosophy.
It seems to me that an approach that would “work” for AI alignment (in the sense of solving or reframing it) would be to understand the human mind well enough to determine whether it has “values” / “beliefs” / etc. If it does, then an aligned AI can be programmed to align with those values/beliefs; if it doesn’t, then the AI alignment problem must be reframed so as to be meaningful and meaningfully solvable. This isn’t guaranteed to work “in time,” but it has the virtue of working eventually, which is nice.
(Btw, although I got my degree in computer science / AI, I worked in Noah Goodman’s lab at Stanford on cognitive science and probabilistic programming; see probmods.org for an intro to this lab’s approach.)
Great point! And thanks for the references :)
I’ll change your background to Computational Cognitive Science in the table! (unless you object or think a different field is even more appropriate)