It might be worthwhile to define what you mean by “serious research” if you want to optimize for making it easier.
In your example, the claims are, in effect: “the AI will not have the values we describe, because the programmers have solved some hard problems”; “the AI doesn’t have the values we describe, because the computer can solve most problems”; “the AI can’t solve most problems, since the computer faces a hard problem that is impossible to get right”; and “the AI will not try to solve most problems (in fact it can’t solve most of them if it tries), but it also won’t want to solve all of them.” This would be a case where the AI is sufficiently superintelligent to solve most problems, yet the human programmer is still the one trying to build the AI a particular way; otherwise it will fail to be built that way, or it will crash or fail.
Examples and definitions are two different things.
Consider the statement: “If someone were to describe themselves as an agent, then the notion of a self-modifying or self-improving agent should be internally consistent, even under the assumption that they are consistent.”
The problem with that statement is that it seems highly misleading. If someone believes that there are some coherent agents, then they are mistaken, and we won’t be able to tell them apart.
I disagree with the claim that there are coherent agents, and I’ve had very good success with that position. I can’t say it’s only because I’m using the phrase “agreed, but not that.” I’m also fairly sure that your definition of “rationality” isn’t consistent with reality; what worries me is that it doesn’t seem like a reasonable word for the job of building rational agents, and there isn’t even an example.
I don’t believe the question itself was about rational agents. There’s a good reason Eliezer describes them as agents, or as agents with other characteristics, even though the question is not about their personality. And if Eliezer is arguing that some coherent agents are agents or part of their environment (say, Omega), I’d guess that’s true even though I don’t think of them as either agents or environment. I think that’s the point of his conclusion.
My point is also that Eliezer’s meta-level arguments happen to supply answers to questions that seem difficult to answer, even when those answers come from intuition or logical reasoning. For example: does your meta-level theory make a much stronger claim about the truth of a proposition than its premises do?
It looks to me like your post isn’t a reply to mine but intended to be an answer to something else.
For a while I’ve wondered why the comments here feel so much like replies. Well, I’m not a big fan of the former two, and I don’t generally see them as aimed at a conversation, or at anything at all, since they do no filtering for your own preferences; so mostly I’m just an unsympathetic guy about it.
Now I have a pretty clear understanding of what they are about. I have a vague sense that they often feel “somewhat” antagonistic, and the way I actually feel about them ranges from “somewhat” to “extremely” hostile. Sometimes it just leaves me feeling like I’m in some kind of weird mental state, and like a person who says things like that is being hostile.
On the other hand, I don’t think anything particularly weird is being meant as an insult. I’d have the same reaction when someone says something obvious with no intention of insulting anyone, but I suspect there is some odd emotional machinery behind the “hurtful” reaction that makes it feel that way to me.
(I personally have a vague sense that mine is the opposite reaction, and that it’s more a feeling of being talked about as weird, possibly as harmful, than anything else. I think those two things should be correlated, but I don’t see that as an inherent property.)