Gurkenglas comments on Are we so good to simulate?

Gurkenglas 4 Mar 2024 19:44 UTC
2 points
Sorry, our timeline is dangerous because we’re on track to create AI that can eat unsophisticated simulators for breakfast, such as by helpfully handing them a “solution to philosophy”.

Yes, instantiate a philosopher. Not having solved philosophy is a good reason to use fewer moving parts you don’t understand. Just because you can use arbitrary compute doesn’t mean you should.
- Wei Dai 4 Mar 2024 20:05 UTC
  2 points
  
  Sorry, our timeline is dangerous because we’re on track to create AI that can eat unsophisticated simulators for breakfast, such as by helpfully handing them a “solution to philosophy”.
  
  If I was running such a simulation, I’d stop it before AI is created. Basically look for civilizations that end up doing “long reflections” in an easily interpretable way, e.g., with biological brains using natural language (to make sure they’re trying to solve philosophy for themselves and not trying to trick potential simulators).
  
  Yes, instantiate a philosopher. Not having solved philosophy is a good reason to use fewer moving parts you don’t understand. Just because you can use arbitrary compute doesn’t mean you should.
  
  But the ability to make philosophical progress may be a property of civilizations, not necessarily of individual philosophers or even small groups of them, since any given philosopher is motivated by and collaborates with many others around them. Also if you put a philosopher in an alien (to them) environment, wouldn’t that greatly increase the risk of them handing you a deceptive “solution to philosophy”?
  - Gurkenglas 4 Mar 2024 21:32 UTC
    2 points
    How do you tell when to stop the simulation? Apparently not at the almost human-level AI we have now.
    Do you have an example piece of philosophical progress made by a civilization?
    I admit that the human could turn against you, but if a human can eat you, you certainly shouldn’t be watching a planet full of humans.
    - Wei Dai 4 Mar 2024 22:39 UTC
      2 points
      
      How do you tell when to stop the simulation? Apparently not at the almost human-level AI we have now.
      
      I guess you stop it when there’s very little chance left that it would go on to solve philosophy or metaphilosophy in a clearly non-deceptive way.
      
      Do you have an example piece of philosophical progress made by a civilization?
      
      In my view every piece of human philosophical progress so far was “made by a civilization” because whoever did it probably couldn’t or wouldn’t have done it if they were isolated from civilization.
      
      It seems possible that if you knew enough about how humans work (and maybe about how philosophy works), you could do it with less than a full civilization, by instantiating some large number of people and setting up some institutions that allow them to collaborate and motivate each other effectively (and not go crazy, or get stuck due to lack of sufficiently diverse ideas, or other failure modes). But it’s also quite possible that for the simulators it would be easier to just simulate the whole civilization and let the existing institutions work.
      
      I admit that the human could turn against you, but if a human can eat you, you certainly shouldn’t be watching a planet full of humans.
      
      My point is that a human or group of humans placed into an alien (or obviously simulated) environment will know that they’re instantiated to do work for someone else and can take advantage of that knowledge (to try to deceive the aliens/simulators), whereas a planet full of humans in our (apparent) native environment will want to solve philosophy for ourselves, which probably overrides any thoughts of deceiving simulators even if we suspect that we might be simulated. So that makes the latter perhaps a bit safer.
      - Gurkenglas 5 Mar 2024 0:52 UTC
        2 points
        If an AI intuits that policy, it can subvert it—nothing says that it has to announce its presence, or openly take over immediately. Shutting it down when they build computers should work.
        If the “human in a box” degenerates into a loop like LLMs do, try the next species.
        I agree on your last paragraph, though humans have produced loads of philosophy that both works for them and benefits them for others to adopt.
        - Wei Dai 5 Mar 2024 17:15 UTC
          2 points
          
          If an AI intuits that policy, it can subvert it—nothing says that it has to announce its presence, or openly take over immediately. Shutting it down when they build computers should work.
          
          The simulators can easily see into every computer in the simulation, so it would be hard for an AI to hide from them.
          
          If the “human in a box” degenerates into a loop like LLMs do, try the next species.
          
          The “human in a box” could also confidently (and non-deceptively) declare that they’ve solved philosophy but hand you a bunch of nonsense. How would you know that you’ve sufficiently recreated a suitable environment/institutions for making genuine philosophical progress? (I guess a similar problem occurs at the level of picking which civilizations/species to simulate, but still by using “human in a box” you now have two points of failure instead of one: picking a civilization/species that is capable of genuine philosophical progress, and recreating suitable conditions for genuine philosophical progress.)
          
          I agree on your last paragraph, though humans have produced loads of philosophy that both works for them and benefits them for others to adopt.
          
          What are some examples of this? Maybe it wouldn’t be too hard for the simulators to filter them out?
          - Gurkenglas 5 Mar 2024 23:28 UTC
            2 points
            Can the simulators tell whether an AI is dumb or just playing dumb, though? You can get the right meme out there with a very light touch.
            Yeah, it’d be safer to skip the simulations altogether and just build a philosopher from the criteria by which you were going to select a civilization.
            To be blunt, sample a published piece of philosophy! Its author wanted others to adopt it. But you’re well within your rights to go “If this set is so large, surely it has an element?”, so here’s a fun couple paragraphs on the topic.