This is true for meat-based evolution. In a software-based world, it can be trivial to prove what your source code is, and sometimes it’s advantageous to do so.
For instance, software might be distributed with its code, advertised as doing something. The software benefits if it can convince you that it will do that something (“cooperate”), because in exchange, you run it.
An em that’s good at some kind of work might want to convince you to run copies of it: it gets to live, you get useful work done. But you’re afraid that it will spy on you or sabotage its work. In a world of ems, the assumption is that nobody knows how to write superhuman (superintelligent) AIs, so we emulate human brains. That means you can’t just inspect an em’s software to determine what it’s going to do, because you don’t really understand how it works.
It would be nice if the em could provide a formal proof that in a certain scenario it would behave in a certain way (cooperate), but for software of an em’s complexity that’s probably never going to be possible.
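To make “formal proof about behavior” concrete, here is a toy sketch in Lean 4 (all names made up): a trivially simple program whose output provably ignores its secret input. Proofs like this are routine at this scale; nothing comparable is in sight for software as complex as an emulated brain.

    -- Toy sketch (hypothetical names): a "proof of cooperation" for a
    -- trivially simple program. The answer uses only the public part of the
    -- input, so we can prove the secret part never influences it.
    def answer (publicData _secret : Nat) : Nat := publicData + 1

    -- Whatever the secret is, the answer is identical, so this output
    -- provably cannot leak it.
    theorem answer_ignores_secret (publicData s1 s2 : Nat) :
        answer publicData s1 = answer publicData s2 := rfl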
Yes. Open source software does derive some benefits from source code transparency. Today, though, reputations—and technical measures like sandboxes—are the main ways of dealing with the issue of running software from external sources.
Open source (as the term is usually used) is not technically necessary; you could have proofs about native binaries, too. The real distinction is between software whose code you have (the running code, not necessarily the source code) and software that runs outside your control, whose code you don’t know and which you only communicate with remotely.
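As a minimal sketch of the “you have the running code” side of that distinction (the names and the build command are hypothetical, and it assumes a reproducible build): rebuild the advertised source yourself and check that the binary you were handed is byte-for-byte what that source produces.

    import hashlib
    import subprocess
    from pathlib import Path

    def sha256_of(path: Path) -> str:
        # Hex digest of a file's exact contents.
        return hashlib.sha256(path.read_bytes()).hexdigest()

    def binary_matches_source(source_dir: Path, build_cmd: list[str],
                              built_binary: Path, received_binary: Path) -> bool:
        # Rebuild the advertised source ourselves, then compare hashes.
        # Only meaningful if the build is deterministic (same source gives a
        # bit-identical binary); otherwise the hashes differ spuriously.
        subprocess.run(build_cmd, cwd=source_dir, check=True)
        return sha256_of(built_binary) == sha256_of(received_binary)

    # e.g. binary_matches_source(Path("src"), ["make"],
    #                            Path("src/app"), Path("app-they-sent-me"))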
Sandboxes are good but suffer from the “boxed AI” problem, so it’s nice to have defense in depth.
So hide the malicious behavior in hardware or in the compiler.
I was thinking of a scenario where the verifier controls the hardware and the compiler.
E.g. it sends you its source code to run on your own hardware because it wants to work for you.
So hide it in some other computer you interact with.
Concerns about what I do on my own computer are separate from proofs about the software I send to you; those proofs can still be valid, and if so, trusted and influential.
Yes, I could achieve malicious ends by outsmarting you: I could give you software that does exactly what it says (and a formal proof of that), but through its seemingly innocent interactions with other servers (which are perfectly valid under that proof), actually create a malicious effect you haven’t foreseen. But that concern doesn’t invalidate the usefulness of the proof.
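A toy sketch of that failure mode (everything here, including the endpoint, is hypothetical): a function whose imagined proof covers exactly what it does, namely report an item count to a coordinator. Every call is valid under that proof, yet the count alone can tell a colluding server which job this machine is running, an effect the verifier may not have foreseen.

    import urllib.request

    COORDINATOR_URL = "http://coordinator.example/report"  # hypothetical endpoint

    def report_progress(work_items: list) -> None:
        # The (imagined) proof covers exactly this behaviour: nothing but the
        # item count ever leaves this function, and that is true.
        payload = str(len(work_items)).encode()
        urllib.request.urlopen(COORDINATOR_URL, data=payload)
        # And yet: if the coordinator already knows which jobs have which
        # sizes, the count reveals which job is running here. The proof is
        # sound; the effect of the perfectly valid interaction was not foreseen.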