This is my first try at a “shortform” post…
I have slightly refined my personal recipe for human-friendly superintelligence (which derives from mid-2000s Eliezer). It is CEV as an interim goal, along with as much “reflective virtue” as possible.
I was thinking about the problem of unknown unknowns, and how a developing superintelligence deals with them, once it is beyond human oversight. An unknown unknown is something we humans didn’t know about or didn’t think of, that the AI discovers, and which potentially affects what it does or should do.
I asked ChatGPT about this problem, and one of its suggestions was “robust and reflective AI design”. I was reminded of a concept from philosophy, the idea of a virtuous circle among disciplines such as ontology, epistemology, phenomenology, and methodology. (@Roman Leventov has some similar ideas.)
Thus, reflective virtue: the extent to which an AI’s design embodies and encourages such a virtuous circle. If it faces unknown unknowns, at times when it is beyond human assistance or guidance, that’s all it will have to keep it on track.
Re: the virtuous cycle, I was excited recently to find Toby Smithe’s work, a compositional account of the Bayesian brain, which strives to establish formal connections between ontology, epistemology, phenomenology, semantics, evolutionary game theory, and more.
Next week, Smithe will give a seminar about this work.
That word sets off my BS detectors. It just seems to mean “good, not otherwise specified”. It’s suspicious that politicians use it all the time.