It might look innocent to a cursory human inspection, yes. But if hardware is designed to be deterministically cooperative/coordinating, and not to act as a backdoor in combination with larger hardware, that sounds like something that should be provable if the hardware was designed with that provability in mind.
Many governments, including the US, are concerned right now that their computers have hardware backdoors, so the current lack of research results on this topic probably reflects intrinsic difficulty rather than just a lack of interest. Even if provable hardware is physically possible and becomes technically feasible in the future, there is likely a cost attached, for example running slower than non-provable hardware or using more resources.
Instead of confidently predicting that AIs will Cooperate in one-shot PD, wouldn’t it be more reasonable to say that this is a possibility, which may or may not occur, depending on the feasibility and economics of various future technologies?
The singleton scenario seems overwhelmingly likely, so whatever multiple AIs exist will play by the singleton’s rules, with native physics becoming irrelevant. (I know, I know...)
I believe this stuff bottoms out in physics: it’s either possible or impossible to make a physically provable analog of the PREFIX program. The idea is fascinating, but I don’t know enough physics to determine whether it’s crazy.
The difficulty would be to make sure nothing could interact with the atoms/physical constituents of the prefix in a way that distorts the prefix. Prefixes of programs have the benefit that they run first, and given the serial nature of most programs, whatever runs first has complete control.
So it is a question of isolating the prefix. I’m going to read this paper on isolation and physics, before making any comments on the subject.
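To make the "whatever runs first has complete control" point concrete, here is a toy sketch in Python. It is not the PREFIX program discussed in this thread; the marker string, the `decide` function and the fallback strategy are all hypothetical names made up for illustration. It only shows the software side: a shared prefix gets to settle the move before the rest of the program is consulted. The physical question of isolating that prefix is exactly what it leaves out.

```python
from typing import Callable

# Hypothetical marker standing in for a shared, verifiable prefix.
PREFIX = "# shared-prefix v1\n"


def decide(own: str, other: str, fallback: Callable[[str], str]) -> str:
    """Return 'C' (cooperate) or 'D' (defect) for one round of one-shot PD.

    The prefix check runs first. If both programs start with the same
    prefix, the move is settled before the rest of the program
    (the fallback) is ever consulted.
    """
    if own.startswith(PREFIX) and other.startswith(PREFIX):
        return "C"
    return fallback(other)


def always_defect(_other: str) -> str:
    """Hypothetical 'rest of the program': defects unconditionally."""
    return "D"


program_a = PREFIX + "rest of program A"
program_b = PREFIX + "rest of program B"
program_c = "program C, no prefix"

move_vs_b = decide(program_a, program_b, always_defect)
move_vs_c = decide(program_a, program_c, always_defect)
```

With these inputs, `program_a` cooperates against `program_b` even though the rest of it would defect, and falls back to defecting against `program_c`. In software, anything that can rewrite the first lines of `own` defeats the guarantee, which is the analog of something interacting with the physical constituents of the prefix.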
I read the paper, and it seemed to me to be useless. We want a physically inviolable guarantee of isolation.
It gave some ideas. It suggests we might start by specifying time limits, e.g. specifying that a system will be effectively isolated for a certain time, by scanning a region of space around that system.