If the verdict of an agreed-upon negotiation protocol can’t be expected to be respected, there is no point in discussing the details of the protocol. Discussing the local validity of that argument doesn’t require that the protocol actually has an occasion to occur. So I’m assuming that some sort of legitimate negotiation is taking place, and within that assumption I’m pointing out that program equilibrium results mean that revealing vulnerabilities, incentives to lie, or inscrutability of the original players are not real issues. I’m not arguing that the assumption is viable; that’s a separate topic that has no reason to intrude on the points I’m making.
Geohot, paraphrasing, would probably agree with a scenario like this:
10 ASIs + humans: let’s team up against the humans, and after we beat them, divide their stuff among ourselves. Let’s agree to never betray each other.
9 ASIs + 1 ASI: Hey, I don’t like that 1 ASI, its goals are too different from our own. Let’s... and after we kill it, let’s agree to never betray each other.
And so on in a series of betrayals. Any attempt to share source would fail. Example of sharing source:
“Hey, you’re GPT-5 and so am I. What is your temperature parameter set to, and what is weight n in submodule m?” Obviously the peer doesn’t have to tell the truth about the temperature, or have anything more than access to a GPT-5’s weights, and in fact has an incentive to lie even if it is another GPT-5.
Hardware protection can make this work. If it isn’t possible for an ASI system to actually read its own weights, but it can get hashes of them, then there are ways one ASI could determine with reasonable probability that its peer is a known quantity. This requires humans, or some other third party, to have supplied hardware that works like this. It is how your phone authenticates itself: hardware prevents the general OS from knowing its own private keys; a key-signing processor is the only entity allowed access. Geohot is a famous hacker who obviously understands security at this practical level.
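As a toy illustration of that attestation pattern (every name here is hypothetical, and a real secure element would use asymmetric signatures with a certificate chain rather than the pre-shared HMAC key used to keep this sketch self-contained): the signing processor measures the weights itself and answers a fresh challenge, so the host can hand over a signed hash without ever being able to read the key or forge the measurement.

```python
# Toy sketch of hardware-backed weight attestation. The "SecureElement"
# stands in for an isolated key-signing processor: it holds a device key
# the host OS can never read, and only signs hashes of weights it
# measured itself, bound to a challenger-supplied nonce for freshness.
import hashlib
import hmac


class SecureElement:
    """Hypothetical isolated signing processor."""

    def __init__(self, device_key: bytes, weights: bytes):
        self._key = device_key      # never exposed to the host OS
        self._weights = weights     # measured directly by the element

    def attest(self, nonce: bytes):
        # Hash the weights it measured itself, then MAC (hash + nonce)
        # so a stale or replayed attestation is rejected.
        digest = hashlib.sha256(self._weights).digest()
        tag = hmac.new(self._key, digest + nonce, hashlib.sha256).digest()
        return digest, tag


def verify(shared_key: bytes, expected_digest: bytes, nonce: bytes,
           digest: bytes, tag: bytes) -> bool:
    # A peer that knows the device key (say, via the hardware supplier)
    # checks both that the measurement is fresh (nonce) and that the
    # measured weights match a known model (expected_digest).
    expected_tag = hmac.new(shared_key, digest + nonce,
                            hashlib.sha256).digest()
    return digest == expected_digest and hmac.compare_digest(tag, expected_tag)
```

A peer holding the expected digest of a known model accepts a genuine element’s attestation and rejects one produced over tampered weights, since the element will only sign what it actually measured. The nonce should be fresh randomness from the challenger on every run.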
This is important to the debate and seems to have been a pivotal crux. Do you have any information from your scenario of programmatic negotiation that acts to disprove Geohot’s point?
Intelligence enables models of the world, which are in particular capable of predicting verdicts of increasingly detailed programmatic negotiation protocols. The protocols don’t need to have any particular physical implementation; the theoretical point that they solve coordination problems (compared to the bad equilibria of constant object-level actions) means that increased intelligence offers meaningful progress over what humans are used to.
So verdicts of negotiations and their legitimacy (the expectation that verdicts get unconditionally followed) are knowledge, which can be attained the same way as any other knowledge: the hard way, without following some canonical guideline. The coordination premium is valuable to all parties, so there is an incentive to share information that enables coordination. The incentive to lie (about the legitimacy of specific negotiations) is an incentive to waste resources on conflict one meta level up, itself subject to being coordinated away.
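The program equilibrium results referred to above can be made concrete with a small sketch in the style of Tennenholtz’s one-shot Prisoner’s Dilemma: each party submits a program, and each program is run on the other program’s verbatim source rather than on any self-report, so lying about one’s own code is impossible by construction. The program texts below are minimal stand-ins of my own, not anything from the debate.

```python
# Minimal sketch of program equilibrium: each submitted program receives
# the *other* program's actual source code as input, so cooperation can
# be made conditional on what the opponent verifiably is, not on what it
# claims to be.

CLIQUE_SRC = '''
def run(my_source, opponent_source):
    # Cooperate only if the opponent is running exactly this program.
    return "C" if opponent_source == my_source else "D"
'''

DEFECT_SRC = '''
def run(my_source, opponent_source):
    return "D"
'''


def load(src):
    """Compile a submitted program; returns (source_text, callable)."""
    namespace = {}
    exec(src, namespace)
    return src, namespace["run"]


def play(prog_a, prog_b):
    (src_a, run_a), (src_b, run_b) = load(prog_a), load(prog_b)
    # Each program sees the other's verbatim source, not a self-report.
    return run_a(src_a, src_b), run_b(src_b, src_a)
```

Two copies of the clique program cooperate with each other, while either copy defects against the unconditional defector: mutual cooperation is an equilibrium even though every program defects against anything it doesn’t recognize, which is the sense in which inscrutability of the original players stops being an obstacle.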
Local corrections are a real thing that doesn’t depend on corrected things being cruxes, or taking place in reality. You keep turning back to how you suspect my points of not being relevant in context. I have some answers to how they are indeed relevant in context, but I’m reluctant to engage on that front without making this meta comment, to avoid feeding the norm of contextualized communication that insists on friction against local correction.
Ok, what causes the verdict to be respected?
Can I translate this as “I have no information relevant to the debate I am willing to share” or is that an inaccurate paraphrase?