The thing I’m describing, which happens to resemble some points in your posts, is about using self-supervised learning (SSL) as a setting for formulating decision making. Decision theory often relies on a notion of counterfactuals. These are a weird simulacrum of reality, where laws of physics or even the logical constraints on their definition can fail if examined too closely, yet they still need to be reasoned about to some extent in order to make sensible decisions in reality. SSL trains a model to fill in the gaps in episodes, to reconstruct them out of fragments. This ends up giving plausible results even when the model is prompted with fragments that, on closer examination, are not descriptions of reality, or fail to make sense in stronger ways. So the episodes generated by SSL models seem like a good fit for the counterfactuals of decision theory.
You are talking in realist language, which is an interesting exercise. An SSL model trained on real observations can be thought of as a map of the world, and maps that are good enough to rely on in navigating the territory can take up the role of ontologies, with the things they should claim (if they did everything right) becoming a form of presentation of the world. In the same way, we can talk about the Bayesian probability of a rare or one-off event as an objective expression of some prior, even though the probability is not physically there. So if we similarly develop a sufficiently good notion of an SSL map of the world, we might talk about the things it should conclude as objective facts.
The way I’m relating SSL episodes to decision making is by putting agents into larger implied episodes (in principle as big as whole worlds), without requiring the much smaller fragments of episodes that SSL models actually train on to explicitly contain those agents. The training episodes only need to contain stories of how the agents act, and the outcomes. But the character of the agents that shape an episode from the outside is important to how the episode turns out, and to what outcomes the more complete (but still small) episode settles into. So agents should be partially described by their intents (goals) and influence (ability to perform particular actions within the episode), and these descriptions should be treated as parts of the episode. Backchaining from outcomes to actions according to intent is decision making; backchaining from outcomes to intent according to influence is preference elicitation. A lot of this might benefit from automatically generated episodes, as in chess (MCTS).
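To make the two directions of backchaining concrete, here is a minimal toy sketch. Everything in it is my own assumption for illustration: the episode tuples, the intent/action/outcome labels, and the function names are hypothetical, and a plain counting table stands in for a trained SSL model. Filling in a gap is just conditioning on the known fragments of an episode.

```python
from collections import Counter

# Hypothetical toy episodes: (intent, action, outcome).
# A real SSL model would learn from far richer fragments; this table is
# only a stand-in so that "filling in the gaps" can be done by counting.
EPISODES = [
    ("wants_red", "paint_red", "red_chair"),
    ("wants_red", "paint_red", "red_chair"),
    ("wants_red", "paint_blue", "blue_chair"),
    ("wants_blue", "paint_blue", "blue_chair"),
    ("wants_blue", "paint_blue", "blue_chair"),
    ("wants_blue", "paint_red", "red_chair"),
]
FIELDS = ("intent", "action", "outcome")


def fill_in(field, allowed):
    """Distribution over one missing field, given allowed values for others.

    `allowed` maps a field name to the set of values it may take.
    """
    idx = FIELDS.index(field)
    counts = Counter(
        ep[idx]
        for ep in EPISODES
        if all(ep[FIELDS.index(k)] in vals for k, vals in allowed.items())
    )
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}


def decide(intent, outcome):
    """Decision making: backchain from an outcome to an action, given intent."""
    dist = fill_in("action", {"intent": {intent}, "outcome": {outcome}})
    return max(dist, key=dist.get) if dist else None


def elicit_intent(outcome, influence):
    """Preference elicitation: backchain from an outcome to intent,
    given influence (the set of actions the agent could have performed)."""
    return fill_in("intent", {"outcome": {outcome}, "action": set(influence)})


print(decide("wants_red", "red_chair"))
print(elicit_intent("red_chair", {"paint_red"}))
```

The same conditioning routine serves both directions; only which parts of the episode are treated as known and which as the gap changes.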
The agents implied by a small episode can have similarly small intents, caring about simpler properties of outcomes like color or tallness or chairness. Since this is about counterfactuals, the implied world outside the episodes can be weird. But such an intent might also be an aspect of human decision making (as a concept), and can be shared by multiple implied humans around an episode, acausally coordinating them (by virtue of the decisions of the intent, channeled through the humans, taking place in the same counterfactual; as opposed to a different counterfactual where the intent coordinates the humans in making different choices, or in following a different policy). So this way we should be able to ask what the world would look like if a given concept meant something a bit different, and suggested different conclusions in thinking that involves it. Or, backchaining from how the world actually looks, we can ask what a concept means, if it is to coordinate the minds of humanity to settle the world into the shape it has.
So you can represent a human/AI as multiple incomplete desires fighting for parts of the world? I agree that this is related to the post.
Interesting idea about counterfactual versions of concepts.
Could you help to clarify how probability should work for objects with shared properties/identities?