What in my diamond maximization proposal above only works in familiar contexts? Most of it is (unsurprisingly) about crystallography and isotopic ratios, plus a standard causal wrapper. (If you look carefully, I even allowed for the possibility of FTL.)
The obvious “brute force” solution to aimability is a practical, approximately Bayesian, GOFAI equivalent of AIXI that is capable of tool use and contains an LLM as a tool. This is extremely aimable: it has an explicit slot to plug a utility function into. That also makes it extremely easy to build a diamond maximizer, a paperclip maximizer, or any other such x-risk. So we instead need to plug in something that hopefully isn’t an x-risk, such as value learning, CEV, or “solve goalcraft” as the terminal goal: figure out what we want, then optimize that, while appropriately pessimizing that optimization over the remaining uncertainties in “what we want”.
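To make the “explicit slot” concrete, here is a minimal sketch of that architecture. All names here (`AimableAgent`, `WorldModel`, `pessimized`, etc.) are hypothetical, not from any real library, and the planner is deliberately toy: an approximately Bayesian core whose only goal-dependent component is a swappable utility function, plus a crude version of the pessimizing-over-value-uncertainty idea.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

State = dict    # stand-in for whatever world-state representation the planner uses
Action = str

@dataclass
class WorldModel:
    """One hypothesis about world dynamics, with its posterior weight."""
    posterior: float
    predict: Callable[[State, Action], State]  # hypothesized transition function

@dataclass
class AimableAgent:
    """AIXI-like core: beliefs and planning are fixed; the goal is a plug-in."""
    models: List[WorldModel]
    utility: Callable[[State], float]  # the explicit slot; swap this to re-aim

    def expected_utility(self, state: State, action: Action) -> float:
        # Average the utility of each model's predicted outcome,
        # weighted by that model's posterior probability.
        return sum(m.posterior * self.utility(m.predict(state, action))
                   for m in self.models)

    def act(self, state: State, actions: Iterable[Action]) -> Action:
        # One-step lookahead for brevity; an AIXI-style agent would
        # plan over whole action sequences or policies.
        return max(actions, key=lambda a: self.expected_utility(state, a))

def pessimized(candidates: List[Callable[[State], float]]) -> Callable[[State], float]:
    """Crude 'pessimize over value uncertainty': score each state by the
    worst case across all still-plausible candidate utility functions."""
    return lambda state: min(u(state) for u in candidates)

# Aiming is then just a matter of what goes in the slot:
#   diamond_maximizer = AimableAgent(models, utility=count_diamond_carbons)
#   safer_agent       = AimableAgent(models, utility=pessimized(learned_candidates))
```

Taking the minimum over candidate utilities is only one way to pessimize; a low quantile or a soft-min would be gentler choices, and a real value learner would also update the candidate set over time. The point the sketch illustrates is structural: everything dangerous and everything hopeful about such an agent lives in that one slot.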