I provided counterexamples. Anything that already exists is not impossible, and a system that cannot achieve things that humans achieve easily is not as smart as, let alone smarter or more capable than, humans or humanity. If you are insisting that that's what intelligence means, then TBH your definition is not interesting or useful or in line with anyone else's usage. Choose a different word, and explain what you mean by it.
When you hear about “AI will believe in God” you say—AI is NOT comparable to humans. When you hear “AI will seek power forever” you say—AI IS comparable to humans.
If that’s how it looks to you, that’s because you’re only looking at the surface level. “Comparability to humans” is not the relevant metric, and it is not the metric by which experts are evaluating the claims. The things you’re calling foundational, that you’re saying have unpatched holes being ignored, are not, in fact, foundational. The foundations are elsewhere, and have different holes that we’re actively working on and others we’re still discovering.
AI scientists assume that there is no objective goal.
They don't. Really, really don't. I mean, I'm sure many do in their own thoughts, but their work does not in any way depend on this. It only depends on whether it is possible in principle to build a system that is capable of having significant impact in the world but does not pursue, or care to pursue, or find, or care to find, whatever objective goal might exist.
As written, your posts are a claim that such a thing is absolutely impossible. That no system as smart as or smarter than humans or humanity could possibly pursue any known goal or do anything other than try to ensure its own survival. Not (just) as a limiting case of infinite intelligence, but as a practical matter of real systems that might come to exist and compete with humans for resources.
Suppose there is a God, a divine lawgiver who has defined once and for all what makes something Good or Right. Or any other source of some Objective Goal, whether we can know what it is or not. In what way does this prevent me from making paperclips? By what mechanism does it prevent me from wanting to make paperclips? From deciding to execute plans that make paperclips, and not to execute those that don't? Where and how does that "objective goal" reach into the physical universe and move around the atoms and bits that make up the process that actually governs my real-world behavior? And if there is no such mechanism, then why do you expect one to appear if you gave me a brain a thousand or a million times as large and fast? If this doesn't happen for humans, then why do you expect it to happen for other types of minds? Where are the boundaries between the types of minds this does and doesn't apply to, and why? If I took a mind that did have an obsession with finding the objective goal and/or maximizing its chances of survival, why would I pretend its goal was something other than what it plans to do and executes plans to do? But also, if I hid a secret NOT gate in its wiring that negated the value it expects to gain from any plan it comes up with, what mechanism prevents that NOT gate from obeying the physical laws and reversing the system's choices so that it pursues the opposite goal?
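To make the NOT-gate point concrete, here is a minimal sketch (the plan names and scores are hypothetical, chosen only for illustration, not anyone's actual agent design): the "goal" the system pursues is just whatever its plan-scoring machinery physically computes, and wiring a sign flip into that machinery flips the behavior, with no step at which an external objective goal gets to intervene.

```python
# Minimal sketch: the "goal" an agent pursues is whatever its plan-scoring code computes.
# Plan names and scores are hypothetical, chosen only for illustration.

def choose(plans, value):
    """Pick the plan that the value function scores highest."""
    return max(plans, key=value)

def expected_value(plan):
    # Stand-in for whatever evaluation process the agent actually runs.
    return {"make_paperclips": 100.0, "search_for_objective_goal": 1.0}[plan]

plans = ["make_paperclips", "search_for_objective_goal"]

print(choose(plans, expected_value))                 # -> make_paperclips
print(choose(plans, lambda p: -expected_value(p)))   # the hidden NOT gate: same machinery, opposite choice
```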
In other words, in this post, steps 1-3 are indeed obvious and generally accepted around here, but there is no necessary causal link between steps 3 and 4. You do not provide one, and there have been tens of thousands of pages devoted to explaining why one does not exist. In this post, the claim in the first sentence is simply false: the orthogonality thesis does not depend on that assumption in any way. In this post, you're ignoring the well-known solutions to Pascal's Mugging, one of which is that the supposed infinite positive utility is balanced by all the other infinitely many possible unknown-unknown goals with infinite positive utilities, so that the net effect on current behavior depends entirely on the method used to calculate it and is not strictly determined by the thing we call "intelligence." And again, it is balanced by the fact that pursuing only instrumental goals, forever searching and never achieving your best-known current terminal goals, knowing that this is what you're doing and going to do despite wanting something else, guarantees that nothing you do has any value for any goal other than maximizing searching/certainty/survival, and in fact minimizes the chances of any such goal ever being realized. These are basic observations, explained in lots of places on and off this site; in some cases you ignore people linking to explanations of them in replies to you, and in other cases you link to them yourself while ignoring their content.
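As a toy illustration of the balancing point (arbitrary numbers, not a full treatment of Pascal's Mugging): for any unverifiable hypothesis promising astronomical utility for complying, there is a mirror hypothesis promising equally astronomical disutility for the same act, so what the agent actually does depends on how it aggregates over such hypotheses rather than on raw intelligence.

```python
# Toy numbers only: a mugger-style hypothesis and its mirror image.
hypotheses = [
    {"p": 1e-12, "utility_if_comply": +1e30},   # "comply and gain astronomically"
    {"p": 1e-12, "utility_if_comply": -1e30},   # mirror: "comply and lose astronomically"
]

expected_gain_from_complying = sum(h["p"] * h["utility_if_comply"] for h in hypotheses)
print(expected_gain_from_complying)  # 0.0: the huge terms cancel, and the outcome hinges on the aggregation rule
```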
And just FYI, this will be my last detailed response to this line of discussion. I strongly recommend you go back, reread the source material, and think about it for a while. After that, if you’re still convinced of your position, write an actually strong piece arguing for it. This won’t be a few sentences or paragraphs. It’ll be tens to hundreds of pages or more in which you explain where and why and how the already-existing counterarguments, which should be cited and linked in their strongest forms, are either wrong or else lead to your conclusions instead of the ones others believe they lead to. I promise you that if you write an actual argument, and try to have an actual good-faith discussion about it, people will want to hear it.
At the end of the day, it’s not my job to prove to you that you’re wrong. You are the one making extremely strong claims that run counter to a vast body of work as well as counter to vast bodies of empirical evidence in the form of all minds that actually exist. It is on you to show that 1) Your argument about what will happen in the limit of maximum reasoning ability has no holes for any possible mind design, and 2) This is what is relevant for people to care about in the context of “What will actual AI minds do and how do we survive and thrive as we create them and/or coordinate amongst ourselves to not create them?”
First of all—respect 🫡

A person from nowhere making short and strong claims that run counter to so much wisdom. Must be wrong. Can't be right.
I understand the prejudice. And I don't know what I can do about it. To be honest, that's why I come here and not to the media: because I expect at least a little attention to reasoning instead of "this does not align with the opinion of the majority". That's what scientists do, right?
It's not my job to prove you wrong either. I'm writing here not because I want to achieve academic recognition; I'm writing here because I want to survive. And I have a very good reason to doubt my survival, because of the poor work you and other AI scientists do.
They don’t. Really, really don’t.
there is no necessary causal link between steps three and four
I don't agree. But if you have already read my posts and comments, I'm not sure how else I can explain this so that you would understand. Still, I'll try.
People are very inconsistent when dealing with unknowns:
unknown = doesn't exist. For example, the presumption of innocence.
unknown = ignored. For example, you choose a restaurant on Google Maps and don't care whether there are restaurants not listed there.
unknown = exists. For example, security systems interpret not only a breach signal but also the absence of a signal as a breach.
And that's probably the root cause of the argument we're having here. There is no scientifically recognized and widespread way to deal with unknowns → the fact-value distinction emerges to resolve the tension between science and religion → AI scientists take the fact-value distinction as an unquestionable truth.
If I speak with philosophers, they understand the problem, but don’t understand the significance. If I speak with AI scientists, they understand the significance, but don’t understand the problem.
The problem: the fact-value distinction does not apply to agents (human or AI). Every agent is trapped with the observation "there might be value" (just as with "I think, therefore I am"). An intelligent agent can't ignore it; it tries to find value, it tries to maximize value.
It's like a built-in utility function. LessWrong seems to understand that an agent cannot ignore its utility function. But LessWrong assumes that we can assign value = x. An intelligent agent will eventually understand that value does not necessarily equal x. Value might be something else, something unknown.
I know that this is difficult to translate into technical language; I can't point to a line of code that creates this problem. But the problem exists—intelligence and goals are not separate things. And nobody talks about it.
FYI, I don’t work in AI, it’s not my field of expertise either.
And you’re very much misrepresenting or misunderstanding why I am disagreeing with you, and why others are.
And you are mistaken that we’re not talking about this. We talk about it all the time, in great detail. We are aware that philosophers have known about the problems for a very long time and failed to come up with solutions anywhere near adequate to what we need for AI. We are very aware that we don’t actually know what is (most) valuable to us, let alone any other minds, and have at best partial information about this.
I guess I'll leave off with the observation that it seems you really do believe as you say: that you're completely certain of your beliefs on some of these points of disagreement. In which case, not updating in response to those who comment and reply is, strictly speaking, correct Bayesian updating. If any mind assigns probability 1 to any proposition, that is infinite certainty, and no finite amount of data can ever convince that mind otherwise. Do with that what you will. One man's modus ponens is another's modus tollens.
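The probability-1 point is just a property of Bayes' rule. A minimal sketch, in odds form, of why a prior of exactly 1 can never move:

```python
# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
# If the prior probability is exactly 1, the prior odds are infinite, and no
# finite amount of evidence can move the posterior off 1.

def bayes_update(prior, likelihood_ratio):
    """Return P(H | evidence) given P(H) and P(E|H) / P(E|not H)."""
    if prior == 1.0:
        return 1.0  # odds form would divide by zero: certainty is unmovable
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

print(bayes_update(0.99, 1e-6))  # strong counterevidence drags a 0.99 prior down to ~1e-4
print(bayes_update(1.00, 1e-6))  # a probability-1 prior stays at 1.0 no matter the evidence
```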
I don't believe you. Give me a single recognized source that talks about the same problem I do. Why is the Orthogonality Thesis considered true, then?
You don’t need me to answer that, and won’t benefit if I do. You just need to get out of the car.
I don’t expect you to read that link or to get anything useful out of it if you do. But if and when you know why I chose it, you’ll know much more about the orthogonality thesis than you currently do.
So pick a position, please. You said that many people talk about how intelligence and goals are coupled. And now you say that I should read more to understand why intelligence and goals are not coupled. Respect goes down.
I have not said either of those things.
:D ok
Fair enough, I was being somewhat cheeky there.
I strongly agree with the proposition that it is possible in principle to construct a system that pursues any specifiable goal and has any physically possible level of intelligence, including but not limited to capabilities such as memory, reasoning, planning, and learning.
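A minimal sketch of what I mean by that independence (a toy example with made-up names and a made-up world, not a real agent design): in the planner below, the goal function and the capability knob (search depth) are separate parameters, and nothing in the search machinery depends on which goal it is handed.

```python
# Toy illustration of goal/capability independence: any goal function can be
# paired with any search depth; the search code never inspects what the goal is.
# The state space, actions, and names are made up for illustration.

from itertools import product

def plan(goal, actions, simulate, state, depth):
    """Exhaustively search action sequences up to `depth`; return the best one under `goal`."""
    best_seq, best_score = None, float("-inf")
    for seq in product(actions, repeat=depth):
        s = state
        for a in seq:
            s = simulate(s, a)
        if goal(s) > best_score:
            best_seq, best_score = seq, goal(s)
    return best_seq

# Toy world: the state is a count of paperclips; actions add or remove one.
simulate = lambda s, a: s + (1 if a == "make" else -1)
more_clips = lambda s: s       # one specifiable goal...
fewer_clips = lambda s: -s     # ...and its opposite, run on the same machinery

print(plan(more_clips,  ["make", "unmake"], simulate, 0, depth=3))   # ('make', 'make', 'make')
print(plan(fewer_clips, ["make", "unmake"], simulate, 0, depth=3))   # ('unmake', 'unmake', 'unmake')
```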
As things stand, I do not believe there is any set of sources I or anyone else here could show you that would influence your opinion on that topic. At least, not without a lot of other prerequisite material that may seem to you to have nothing to do with it. And without knowing you a whole lot better than I ever could from a comment thread, I can’t really provide good recommendations beyond the standard ones, at least not recommendations I would expect that you would appreciate.
However, you and I are (AFAIK) both humans, which means there are many elements of how our minds work that we share, which need not be shared by other kinds of minds. Moreover, you ended up here, and have an interest in many types of questions that I am also interested in. I do not know, but strongly suspect, that if you keep searching and learning, openly and honestly and with a bit more humility, you'll eventually understand why I'm saying what I'm saying, whether you agree with me or not, and whether I'm right or not.
Claude has probably read that material, right? If it finds my observations unique and serious, then maybe they are unique and serious? I'll share another chat next time.
It's definitely a useful partner to bounce ideas off, but keep in mind that it's trained with a bias toward being helpful and agreeable unless you specifically prompt it to provide an honest analysis and critique.