(I’ve typed out about five different responses thus far, but:) I guess Carl trusts in Eliezer’s prudence more than I do, or is willing to risk Eliezer getting enough momentum to do brain-in-a-box-in-a-basement if it also means that SingInst gains more credibility with which to influence government Manhattan-project-style whole brain emulation endeavors, or with which to attract/hire brilliant strategic thinkers.

Carl and I disagree about psi; this might cause him to be more confident than I am that the gods aren’t going to mess with us (or aren’t already messing with us). Psi really confuses me, and I’m having a lot of trouble seeing its implications.

“Supporting” would mean different things for me and Carl; for me it means helping revise papers occasionally, for him it means a full-time job. It might be something to do with marginals. I think that the biggest difference is that for Carl “supporting” involves shaping SingInst policy and making it more strategic, whereas I don’t have that much leverage.

I have a very strong bias towards being as meta as possible and staying as meta as possible for as long as possible, probably to a greater extent than Carl; I think that doing things is almost always a bad idea, whereas talking about things is in itself generally okay. Unfortunately, when SingInst talks about things, that tends to cause people to do things, like how the CEV document has led to a whole bunch of people thinking about FAI in terms of CEV for no particularly good reason.

Anyway, it’s a good question, and I don’t have a good answer. Why don’t you think SingInst is worth supporting when Carl does?
Why don’t you think SingInst is worth supporting when Carl does?
I have provided SingInst with various forms of support in the past, but I’ve done so privately and like to think of it as “helping people I know and like” instead of “supporting SingInst”. I guess for these reasons:
I’m afraid that adopting the role/identity of a SingInst supporter will affect my objectivity when thinking about Singularity-related issues. Carl might be more confident in his own rationality.
SingInst is still strongly associated with wanting to directly build FAI. It’s a bad idea according to my best guess, and I want to avoid giving the impression that I support the idea. Carl may have different opinions on this subject, or may not care as much about giving other people wrong impressions of his beliefs.
Carl may have the above worries as well, but the kinds of support he can give require that he do so publicly.
“Supporting” would mean different things for me and Carl; for me it means helping revise papers occasionally, for him it means a full-time job. [...] I think that the biggest difference is that for Carl “supporting” involves shaping SingInst policy and making it more strategic, whereas I don’t have that much leverage.
Carl has been writing and publishing a lot of papers lately. Surely it couldn’t hurt to help with those papers?
SingInst is still strongly associated with wanting to directly build FAI. It’s a bad idea according to my best guess, and I want to avoid giving the impression that I support the idea.
I think this is a serious concern, especially as I’m starting to suspect that AGI might require decision theoretic insights about reflection in order to be truly dangerous. If my suspicion is wrong, then SingInst working directly on FAI isn’t that harmful, marginally speaking; but if it’s right, then SingInst’s support of decision theory research might make it one of the most dangerous institutions around.
Given that you’re worried and that you’re highly respected in the community, this would seem to be one of those “halt, melt, and catch fire” situations that Eliezer talks about, so I’m confused about SingInst’s apparently somewhat cavalier attitude. They seem to be intent on laying the groundwork for the ennead.
I’m starting to suspect that AGI might require decision theoretic insights about reflection in order to be truly dangerous
A chess computer doesn’t need reflection to win at chess. An AGI doesn’t need reflection to make its own causal models. So if the game is ‘eat the earth’, an unreflective AGI seems like a contender. One might argue that it needs to ‘understand’ reflection in order to understand the human beings that might oppose it, or to model its own nature, but I think the necessary capacities could emerge in an indirect way.

In making a causal model of an external reflective intelligence, it might need to worry about the halting problem, but computational resource bounds are a real-world issue that will in any case require it to have heuristics for noticing when a particular subtask is taking up too much time. As for self-modelling, it may be capable of forming partial self-models relevant for reasoning correctly about the implications of self-modification (or just the implications of damage to itself), just by applying standard causal modelling to its own physical vicinity, i.e. without any special data representations or computational architecture designed to tell it ‘this item represents me, myself, and not just another object in the world’.

It would be desirable to have a truly rigorous understanding of both of these issues, but just thinking about them informally already tells me that there’s no safety here; we can’t say “whew, at least that isn’t possible”. Finally, a world-eating AGI equipped with a knowledge of physics and a head start in brute power might never have to worry about reflection, because human beings and their machines are just too easy to swat aside. You don’t need to become an entomologist before you can stomp an insect.
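To make the “resource bounds instead of halting analysis” point a bit more concrete, here is a minimal Python sketch; it is purely illustrative, and every name in it (run_with_budget, plan, step_budget) is hypothetical rather than anything from the discussion above. The idea is just that a planner can abandon a subtask once it exceeds a step budget, without ever needing to predict whether the subtask would halt.

```python
# Toy illustration: a planner doesn't need to solve the halting problem for its
# subtasks; a crude step budget lets it notice "this is taking too long" and move on.

def run_with_budget(subtask, step_budget):
    """Run a generator-style subtask, abandoning it after step_budget steps."""
    result = None
    for steps, partial in enumerate(subtask(), start=1):
        result = partial
        if steps >= step_budget:   # heuristic cutoff, not a halting proof
            return None            # treated as "too expensive", not "undecidable"
    return result

def plan(subtasks, step_budget=1000):
    """Keep results from subtasks that finished within budget; drop the rest."""
    results = (run_with_budget(s, step_budget) for s in subtasks)
    return [r for r in results if r is not None]

# One subtask converges quickly; the other would run forever.
def quick():
    yield 42

def endless():
    while True:
        yield 0

print(plan([quick, endless]))  # -> [42]; the non-halting subtask is simply dropped
```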
I agree with everything you’ve written as far as my modal hypothesis goes, but I also think we’re going to lose in that case, so I’ve sort of renormalized to focus my attention at least somewhat more on worlds where for some reason academic/industry AI approaches don’t work, even if that requires some sort of deus ex machina. My intuition says that highly recursive narrow-AI-style techniques should give you AGI, but to some extent this does go against e.g. the position of many philosophers of mind, and in this case I hope they’re right. Trying to imagine intermediate scenarios led me to think about this kind of stuff.
It would of course be incredibly foolish to entirely write off worlds where AGI is relatively easy, but I also think we should think about cases where, for whatever reason, it isn’t; and if it isn’t, then SingInst is in a uniquely good position to build uFAI.
I’ve sort of renormalized to focus my attention at least somewhat more on worlds where for some reason academic/industry AI approaches don’t work, even if that requires some sort of deus ex machina
I apologize for asking, but I just want to clarify something. When you write ‘deus ex machina’, you’re not solely using the term in a metaphorical sort of way, are you? Because, if you mean what it sort of sounds like you mean, at least some of your public positions suddenly make a lot more sense.
I’m starting to suspect that AGI might require decision theoretic insights about reflection in order to be truly dangerous
Another way in which decision theoretic insights may be harmful is if they increase the sophistication of UFAIs and allow them to control less sophisticated AGIs in other universes.
They seem to be intent on laying the groundwork for the ennead.
I’m trying to avoid being too confrontational, since that might backfire, and I might be wrong myself. It seems safer to just push them to be more strategic, and either they’ll see the danger themselves or they’ll explain why it’s a good idea despite the dangers.
When you write ‘deus ex machina’, you’re not solely using the term in a metaphorical sort of way, are you?
Yes, literal deus ex machina is one scenario which I find plausible.