You are given a string s corresponding to the Instructions for constructing an AGI that has been correctly aligned with the goal of converting as much of the universe into diamonds as possible.
What is the conditional Kolmogorov complexity, given s, of the string s’ that produces an AGI aligned with “human values” or any other suitable alignment target?
To convert an abstract string into a physical object, the “Instructions” are read by a finite-state automaton (FSA), with the state of the FSA at each step dictating the behavior of a robotic arm (with appropriate mobility and precision) that has access to a large collection of physical materials.
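For reference, one standard way to write the quantity being asked about is sketched below. The symbols U (a fixed universal machine), p (a program), and T (the set of all strings that would count as a suitably aligned s’) are assumptions introduced here for illustration; they do not appear in the question itself. Since more than one string could qualify as s’, the second line takes the cheapest one.

```latex
% Conditional Kolmogorov complexity of s' given s, relative to a universal machine U:
K_U(s' \mid s) \;=\; \min \{\, |p| \;:\; U(p, s) = s' \,\}

% If any string in a target set T is acceptable, take the cheapest one:
K_U(T \mid s) \;=\; \min_{s' \in T} K_U(s' \mid s)
```

By the invariance theorem, changing U shifts these values by at most an additive constant, so the question is only meaningful up to that constant.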
that depends a lot on what exactly the specific instructions are. there are a variety of approaches which would result in a variety of retargetabilities. it also depends on what you’re handwaving by “correctly aligned”. is it perfectly robust? what percentage of universes will fail to be completely converted? how far would it get? what kinds of failures happen in the failure universes? how compressed is it?
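to make the "retargetability" point a bit more concrete, here's a minimal toy sketch. it is not a real measurement: Kolmogorov complexity is uncomputable, so this uses zlib's compressed length as a crude stand-in, and estimates the conditional cost as "extra bytes needed to compress s’ after s". the byte strings, the 50-byte "patch", and the function names are all made up for illustration.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """Length in bytes of zlib's output at the highest compression level."""
    return len(zlib.compress(data, 9))


def conditional_estimate(s: bytes, s_prime: bytes) -> int:
    """Crude proxy for K(s' | s): extra bytes needed to compress s' after s.

    This is a heuristic, not a bound on the true conditional complexity.
    """
    return compressed_len(s + s_prime) - compressed_len(s)


rng = random.Random(0)  # fixed seed so the toy data is reproducible

# hypothetical stand-in for the diamond-maximizer instructions s
s = bytes(rng.randrange(256) for _ in range(2000))

# "retargetable" case: s' is s with one small region swapped out
patch = bytes(rng.randrange(256) for _ in range(50))
s_prime_retargeted = s[:1000] + patch + s[1050:]

# "not retargetable" case: s' shares nothing with s
s_prime_unrelated = bytes(rng.randrange(256) for _ in range(2000))

print("retargeted:", conditional_estimate(s, s_prime_retargeted))
print("unrelated: ", conditional_estimate(s, s_prime_unrelated))
```

if the goal lives in a small swappable region of s, the estimate for s’ is tiny compared to building s’ from scratch; if the goal is smeared across the whole string, s buys you very little.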
anyway, something something hypothetical version 3 of QACI (which has not hit a v1)