I didn’t provide all of the evidence that an AI is possible, just one strong piece. All the evidence, plus a good prior for how likely the AI is to turn us into more useful matter, should be enough to convince even a skeptic. However, the brain-as-proof-of-concept idea is really strong: try and formulate an argument against that position.
Unless they’re a skeptic like A above, or they’re a “UFAI-denier” (in the style of climate change deniers) posing as a skeptic, or they privilege what they want to believe over what they ought to believe. There are probably half a dozen more failure modes I haven’t spotted.
Sounds like a conversational disconnect to me, then: at least, going back through the comments, it seems the exchange began with an expression of skepticism of the claim that “a donation to the Singularity Institute is the most efficient charitable investment,” and ended with a presentation of an argument that UFAI is both possible and more likely than FAI.
Thanks for clarifying.
Just to pre-emptively avoid being misunderstood myself, since I have stepped into what may well be a minefield of overinterpretation, let me state some of my own related beliefs:

I consider human-level, human-produced AGI possible (confidence level ~1) within the next century (C ~.85-.99, depending on just what “human-level” means and assuming we continue to work on the problem), likely not within the next 30 years (C<.15-.5, depending as above).

I consider self-improving AGI and associated FOOM, given human-level AGI, a great big question mark: I’d say >99% of HLAGIs we develop will be architected in such a way that significant self-improvement is unlikely (much as our own architectures make it unlikely for us), but the important question is whether the actual number of exceptions is 0 or 1, and I have no confidence in my intuitions about that (see my comments elsewhere about expected results based on small probabilities of large magnitudes).

I consider UFAI given self-improving AGI practically a certainty: >99% of SIAGIs will be UFAIs, and again the important question is whether the number of exceptions is 0 or 1, and whether the exception comes first. (The same thing is true about non-SI AGIs, but I care about that less.)

Whether SIAI can influence that last question at all, and if so by how much and in what direction, I haven’t a clue about; if I wanted to develop an opinion about that I’d have to look into what SIAI actually does day-to-day.
If any of that is symptomatic of fallacy, I’d appreciate having it pointed out, though of course nobody is under any obligation to do so.
There’s an argument chain I didn’t make clear: “If UFAI is both possible and more likely than FAI, then influencing this in favour of FAI is a critical goal” and “SIAI is the most effective charity working towards this goal”.
The only part I would inquire about is:

“I’d say >99% of HLAGIs we develop will be architected in such a way that significant self-improvement is unlikely (much as our own architectures make it unlikely for us)”
Humans don’t have the ability to self-modify (at least, our neuroscience is too underdeveloped to count for that yet), but AGIs will probably be made from explicit programming code, and will probably have some level of command over programming code (it seems like one of the ways in which they would be expected to interact with the world, creating code that achieves their goals). So their architecture is more conducive to self-modification (and hence self-improvement) than ours is.
Of course, a more developed point is that humans are very likely to build a fixed AGI if they can. If you’re making that point, and not that AGIs simply won’t self-improve, then I see no issues.
Re: argument chain… I agree that those claims are salient.
Observations that differentially support those claims are also salient, of course; that’s what I understood XiXiDu to be asking for, and it’s why I asked you initially to clarify what you thought you were providing.
Re: self-improvement… I agree that AGIs will be better-suited to modify code than humans are to modify neurons, both in terms of physical access and in terms of a functional understanding of what that code does.
I also think that if humans did have the equivalent ability to mess with their own neurons, >99% of us would either wirehead or accidentally self-lobotomize rather than successfully self-optimize.
I don’t think the reason for that lies primarily in how difficult human brains are to optimize, because humans are also pretty dreadful at optimizing systems other than human brains. I think the problem lies primarily in how bad human brains are at optimizing. (While still being way better at it than their competition.)
That is, the reasons have to do with our patterns of cognition and behavior, which are as much a part of our architecture as is the fact that our fingers can’t rewire our neural circuits.
Of course, maybe human-level AGIs would be way way better at this than humans would. But if so, it wouldn’t be just because they can write their own cognitive substrate, it would also be because their patterns of cognition and behavior were better suited for self-optimization.
I’m curious as to your estimate of what % of HLAGIs will successfully self-improve?
I guess all AGIs that aren’t explicitly forbidden to do so will self-modify (75%); self-modification will mostly start with a backup (code has this option) (95%); and maybe half of those backup/compare methods will approve improvements and throw out undesirable changes (50%).
So roughly 35% will self-improve successfully. I also estimate that humans will keep making AGIs until they get one that self-improves.
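For concreteness, here is the arithmetic behind that figure as a minimal Python sketch; the three probabilities are just the guesses stated above, not measured values, and the variable names are mine:

```python
# Chaining the guessed probabilities from the estimate above.
p_not_forbidden = 0.75  # AGIs not explicitly forbidden from self-modifying
p_backup_first = 0.95   # self-modification starts from a backup/compare step
p_keeps_gains = 0.50    # the backup/compare step approves the improvements

p_self_improves = p_not_forbidden * p_backup_first * p_keeps_gains
print(f"{p_self_improves:.0%}")  # ~36%, i.e. roughly the 35% quoted above
```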
“I guess all AGIs that aren’t explicitly forbidden to do so will self-modify (75%); self-modification will mostly start with a backup (code has this option) (95%); and maybe half of those backup/compare methods will approve improvements and throw out undesirable changes (50%).”
Really? This seems to ignore that certain structures will have a lot of trouble self-modifying. For example, consider an AI that is a hard-encoded silicon chip with a fixed amount of RAM. Unless it is already very clever, there’s no way it can self-improve.
This actually illustrates nicely some issues with the whole notion of “self-improving.”
Suppose Sally is an AI on a hard-encoded silicon chip with fixed RAM. One day Sally is given the job of establishing a policy to control resource allocation at the Irrelevant Outputs factory, and concludes that the most efficient mechanism for doing so is to implement in software on the IO network the same algorithms that its own silicon chip implements in hardware, so it does so.
The program Sally just wrote can be thought of as a version of Sally that is not constrained to a particular silicon chip. (It probably also runs much slower, though that’s not entirely clear.)
In this scenario, is Sally self-modifying? Is it self-improving? I’m not even sure those are the right questions.
Hard-coding onto chips, or even making specific structures electromechanical in nature, is one way humans could achieve “explicitly forbidden to self-modify” in AIs. I estimated that one in every four AGI projects will want to forbid their project from self-modifying. I thought this was optimistic; I haven’t seen any discussion of fixed AGI, although that might be something military research and development is interested in.
My point was that even in some cases where people aren’t thinking about self-modification, self-modification won’t happen by default.