Well, if I can ask for anything I want, my first question would be the same one I’ve been asking language models variants of for a while now, this time with no dumbing down...
Please describe, in Lean 4, a mathematical formalism for arbitrary (continuous?) causal graphs, especially as inspired by the paper “Reasoning about Causality in Games”, and a general experimental procedure that will reliably reduce uncertainty about the following facts (toy Lean sketches of what I mean follow the list below):
given that we can configure the state of one part of the universe (encoded as a causal graph we can intervene on to some degree), how do we make a mechanism which, given no further intervention after its construction, can, when activated (ideally within the span of only a few minutes, though that part is flexible), nondestructively and harmlessly scan, measure, and detect some tractable combination of:
a (dense/continuous?) causal graph representation of the chunk of matter; i.e., the reactive mechanisms or non-equilibrium states in that chunk of matter, to whatever resolution is permitted by the sensors in the mechanism
moral patients within that chunk of matter (choose a definition and give a chain of reasoning which justifies it, then critique that chain of reasoning vigorously)
agency (in the sense given in Discovering Agents) within that chunk of matter (note that running Discovering Agents exactly is (1) intractable for large systems and (2) impossible for physical states unless their physical configuration can be read into a computer, so I’d like you to improve on that definition by giving a fully specified algorithm in Lean 4 that the mechanism can use to detect the agency of the system; a toy version of the adaptation test is sketched after this list)
local wants (as in, things that the system within the chunk of matter would have agency towards if it were not impaired; also sketched after this list)
this should be given a local definition, not one that appeals to an infinite, unbounded reflection
global wants (as in, things that the system within the chunk of matter would have agency towards if it were fully rational, according to its own growth process)
according to my current beliefs it is likely not possible to extrapolate this exactly, and CEV (coherent extrapolated volition) will always be uncertain, but do it to whatever degree is permitted by the information the mechanism can extract from the chunk of matter
give a series of definitions of the mathematical properties of local wants, global wants, and moral patiency, in terms of the physical causal-graph framework used, and then provide proof scripts establishing the correctness of the mechanism: its ability to form, within itself, a representation of these attributes of the system under test.
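To gesture at the level of formality I’m after, here are some toy Lean 4 sketches, starting with the graph formalism itself, specialized to the finite/discrete case (the continuous generalization is the open part of the request). Every name here (`CausalGraph`, `intervene`, the field names) is my own placeholder, not anything from the paper:

```lean
-- Toy finite causal graph: each node has a list of parents and a
-- mechanism that computes its value from an assignment to all nodes
-- (only the parents are meant to matter). Acyclicity is assumed
-- rather than enforced.
structure CausalGraph (Node Val : Type) where
  parents : Node → List Node
  mech    : Node → (Node → Val) → Val

-- A hard intervention do(v := x): replace v's mechanism with the
-- constant x, leaving every other mechanism untouched.
def intervene {Node Val : Type} [DecidableEq Node]
    (g : CausalGraph Node Val) (v : Node) (x : Val) : CausalGraph Node Val :=
  { g with mech := fun u env => if u = v then x else g.mech u env }
```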
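A correspondingly toy version of the agency test, in the spirit of the mechanised-graph criterion from Discovering Agents (a part of the graph looks agent-like if its settled behaviour adapts when other mechanisms are changed). Here `solve` is a hypothetical evaluator that runs a graph to a settled value; I’m not claiming this matches the paper’s definition, and improving on it is the actual request:

```lean
-- Builds on the CausalGraph sketch above. `solve g d` is assumed to
-- run the graph and report the value that node d settles on.
def adaptsTo {Node Val : Type} [BEq Val]
    (solve : CausalGraph Node Val → Node → Val)
    (g g' : CausalGraph Node Val) (decision : Node) : Bool :=
  -- The decision node looks agentic relative to this perturbation iff
  -- its settled value changes when the other mechanisms change.
  solve g decision != solve g' decision

-- The shape of correctness obligation I'd want the proof scripts to
-- discharge for the real mechanism (trivial for this toy); `sorry`
-- marks it as the open proof.
theorem adaptsTo_sound {Node Val : Type} [BEq Val] [LawfulBEq Val]
    (solve : CausalGraph Node Val → Node → Val)
    (g g' : CausalGraph Node Val) (d : Node) :
    adaptsTo solve g g' d = true ↔ solve g d ≠ solve g' d := by
  sorry
```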
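And a bounded stand-in for “local wants”: a target the decision node keeps settling into across a finite, explicitly given list of mechanism perturbations, so the definition is local and never appeals to unbounded reflection. The finite perturbation list is my simplification:

```lean
-- Builds on the sketches above. The check is bounded by construction:
-- it only ever runs `solve` once per listed perturbation.
def locallyWants {Node Val : Type}
    (solve : CausalGraph Node Val → Node → Val)
    (perturbations : List (CausalGraph Node Val))
    (d : Node) (target : Val → Bool) : Bool :=
  perturbations.all (fun g' => target (solve g' d))
```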
I will test the description of how to construct this mechanism by constructing test versions of it in the Game of Life, Lenia, and Particle Lenia, and, after I have become satisfied by the proofs and tests of them, real life. Think out loud as much as needed to accomplish this, and tell me any questions you need answered before you can start (about what I intend here, what I will use this for, etc.). Begin!
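(For concreteness, this is roughly the Life encoding I’d start the first tests from; the board-as-function representation is my own choice, and the causal graph over it has one node per cell, with that cell’s eight neighbours as parents and this rule as its mechanism.)

```lean
-- Conway's Game of Life on an unbounded board, the simplest test
-- substrate named above.
abbrev Board := Int → Int → Bool

def liveNeighbors (b : Board) (x y : Int) : Nat :=
  let offs : List (Int × Int) :=
    [(-1,-1), (-1,0), (-1,1), (0,-1), (0,1), (1,-1), (1,0), (1,1)]
  (offs.filter (fun (dx, dy) => b (x + dx) (y + dy))).length

-- A live cell survives with 2 or 3 live neighbours; a dead cell is
-- born with exactly 3.
def step (b : Board) : Board := fun x y =>
  let n := liveNeighbors b x y
  if b x y then n == 2 || n == 3 else n == 3
```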
I might also clarify that I’d intend to use this to identify what both I and the AI want, so that we can both get it in the face of AIs arbitrarily stronger than either of us, and that this isn’t the only AI I’d be asking. AIs certainly seem to be more cooperative when I say that, which would make sense for current-gen AIs, which understand the cooperative spirit from their training data and don’t have a huge amount of objective-specific intentionality.
[edit: pinned to profile]