If an AI intuits that policy, it can subvert it: nothing says it has to announce its presence or openly take over immediately. Shutting the simulation down once the simulated civilization builds computers should work.
The simulators can easily see into every computer in the simulation, so it would be hard for an AI to hide from them.
If the “human in a box” degenerates into a loop like LLMs do, try the next species.
The “human in a box” could also confidently (and non-deceptively) declare that they’ve solved philosophy but hand you a bunch of nonsense. How would you know that you’ve sufficiently recreated a suitable environment and institutions for making genuine philosophical progress? (I guess a similar problem occurs at the level of picking which civilizations/species to simulate, but still, by using a “human in a box” you now have two points of failure instead of one: picking a civilization/species that is capable of genuine philosophical progress, and recreating suitable conditions for genuine philosophical progress.)
I agree with your last paragraph, though humans have produced plenty of philosophy that both works for them and benefits them when others adopt it.
What are some examples of this? Maybe it wouldn’t be too hard for the simulators to filter them out?
Can the simulators tell whether an AI is dumb or just playing dumb, though? You can get the right meme out there with a very light touch.
Yeah, it’d be safer to skip the simulations altogether and just build a philosopher from the criteria by which you were going to select a civilization.
To be blunt, sample a published piece of philosophy! Its author wanted others to adopt it. But you’re well within your rights to go “If this set is so large, surely it has an element?”, so here’s a fun couple of paragraphs on the topic.