Thank you for your answers.
Unfortunately, I have to say that so far it has not helped me form a stronger view on AI safety.
(I feel very sympathetic to this post, for example: https://forum.effectivealtruism.org/posts/ST3JjsLdTBnaK46BD/how-i-failed-to-form-views-on-ai-safety-3 )
To rephrase, my prior is that LLMs just predict next words (it is their only capability). I would be worried if an LLM did something else (though I think that cannot happen); that is what I would call “misalignment”.
In the meantime, much of what I read from people worrying about ChatGPT/Bing sounds like anthropomorphizing the AI, with the prior that it can be sentient or have “intents”, and to me that is just not right.
I am not sure I understand how having the ability to search the internet changes that dramatically.
If an LLM, when p(next words) is too low, can “decide” to search the internet to get better inputs, I do not feel that changes what I say above.
I do not want to have too long a fruitless discussion; I do think I need to keep reading material on AI safety to better understand your models. But at this stage, to be honest, I cannot help thinking that some comments or posts are made by people who lack some basic understanding of what an LLM is, which may result in anthropomorphizing AI more than is warranted. When you do not know what an LLM is, it is very easy to wonder, for example, “ChatGPT answered that, but he seems to say that so as not to hurt me; I wonder what ChatGPT really thinks?”, and I think that sentence makes no sense at all, because of what an LLM is.
The predicting next token thing is the output channel. Strictly logically speaking, this is independent of agenty-ness of the neural network. You can have anything, from a single rule-based table looking only at the previous token to a superintelligent agent, predicting the next token.
I’m not saying ChatGPT has thoughts or is sentient, but I’m saying that it trying to predict the next token doesn’t logically preclude either. If you lock me into a room and give me only a single output channel in which I can give probability distributions over the next token, and only a single input channel in which I can read text, then I will be an agent trying to predict the next token, and I will be sentient and have thoughts.
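To make that point concrete, here is a minimal sketch (the class names are mine and purely illustrative): both objects below expose the identical next-token output channel, and nothing about that channel tells you whether the thing behind it is a one-line lookup table or an arbitrarily complex system.

```python
from collections import Counter
from typing import Callable, Dict, List

# Two "predictors" with the identical output channel: given the tokens so
# far, return a probability distribution over the next token. The channel
# itself says nothing about what happens behind it.

class BigramTable:
    """A trivial rule-based predictor that looks only at the previous token."""
    def __init__(self, corpus: List[str]):
        self.counts: Dict[str, Counter] = {}
        for prev, nxt in zip(corpus, corpus[1:]):
            self.counts.setdefault(prev, Counter())[nxt] += 1

    def next_token_distribution(self, context: List[str]) -> Dict[str, float]:
        counts = self.counts.get(context[-1], Counter())
        total = sum(counts.values()) or 1
        return {tok: c / total for tok, c in counts.items()}

class OpaquePredictor:
    """Stand-in for an arbitrarily complex system (a huge neural net, a
    planning agent, anything) hidden behind the same output channel."""
    def __init__(self, internal_system: Callable[[List[str]], Dict[str, float]]):
        self.internal_system = internal_system  # could be anything at all

    def next_token_distribution(self, context: List[str]) -> Dict[str, float]:
        # Whatever the internals do, all that ever leaves is a distribution.
        return self.internal_system(context)

corpus = "the cat sat on the mat".split()
table = BigramTable(corpus)
print(table.next_token_distribution(["on", "the"]))  # {'cat': 0.5, 'mat': 0.5}
```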
Plus, the comment you’re responding to gave an example of how you can use token prediction specifically to build other AIs. (You responded to the third paragraph, but not the second.)
Also, welcome to the forum!
Thank you,
I agree with your reasoning strictly logically speaking, but it seems to me that an LLM cannot be sentient or have thoughts, even theoretically, and the burden of proof seems to rest strongly on someone who would make the opposite claim.
And for someone who does not know what an LLM is, it is of course easy to anthropomorphize it, for obvious reasons (it can be designed to sound sentient or to express ‘thoughts’), and my feeling is that this post was a little bit about that.
Overall, I find the arguments I received after my first comment more convincing than the original post in helping me see what the problem could be.
As for the possibility of an LLM accelerating scientific progress towards agentic AI, I am skeptical, but I may be lacking imagination.
And again, nothing in the examples presented in the original post is related to this risk. It seems that people who are worried are mostly trying to find examples where the “character” of the AI is strange (which in my opinion are mistaken worries due to anthropomorphization of the AI), rather than examples where the AI is particularly “capable” in terms of generating powerful reasoning or impressive “new ideas” (maybe also because at this stage the best LLMs are far from being there).
This seems not-obvious—ChatGPT is a neural network, and most philosophers and AI people do think that neural networks can be conscious if they run the right algorithm. (The fact that it’s a language model doesn’t seem very relevant here for the same reason as before; it’s just a statement about its final layer.)
I think the most important question is about where on a reasoning-capability scale you would put:
1. GPT-2
2. ChatGPT/Bing
3. human-level intelligence
Opinions on this vary widely even among well-informed people. E.g., if you think (1) is a 10, (2) an 11, and (3) a 100, you wouldn’t be worried. But if it’s 10 → 20 → 50, that’s a different story. I think it’s easy to underestimate how different other people’s intuitions are from yours. But depending on your intuitions, you could consider the dog thing as an example that Bing is capable of “powerful reasoning”.
I think that the “most” in the sentence “most philosophers and AI people do think that neural networks can be conscious if they run the right algorithm” is an overstatement, though I do not know to what extent.
I have no strong view on that, primarily because I think I lack some deep ML knowledge (I would weigh the views of ML experts far more than the views of philosophers on this topic).
Anyway, even accepting that neural networks can be conscious with the right algorithm, I think I disagree that “the fact that it’s a language model doesn’t seem relevant”. In an LLM, language is not only the final layer; there is also the fact that the aim of the algorithm is p(next words), so it is a specific kind of algorithm. My feeling is that a p(next words) algorithm cannot be sentient, and I think that most ML researchers would agree with that, though I am not sure.
I am also not sure about the “reasoning-capability” scale: even if an LLM is very close to human for most parts of a conversation, or better than human at some specific tasks (writing summaries, for example), that would not mean it is close to making a scientific breakthrough (on that I basically agree with the comments of AcurB some posts above).
It is probably an overstatement. At least among philosophers in the 2020 PhilPapers survey, most of the relevant questions would put that at a large but sub-majority position: 52% embrace physicalism (which is probably an upper bound); 54% say uploading = death; and 39% “Accept or lean towards: future AI systems [can be conscious]”. So it would be very hard to say that ‘most philosophers’ in this survey would endorse an artificial neural network with an appropriate scale/algorithm being conscious.
I know I said the intelligence scale is the crux, but now I think the real crux is what you said here:
In an LLM, language is not only the final layer; there is also the fact that the aim of the algorithm is p(next words), so it is a specific kind of algorithm. My feeling is that a p(next words) algorithm cannot be sentient, and I think that most ML researchers would agree with that, though I am not sure.
Can you explain why you believe this? How does the output/training signal restrict the kind of algorithm that generates it? I feel like if you have novel thoughts, people here would be very interested in those, because most of them think we just don’t understand what happens inside the network at all, and that it could totally be an agent. (A mesa-optimizer, to use the technical term: an optimizer that appears as a result of gradient descent tweaking the model.)
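If it helps, here is a hedged toy sketch of what “the training signal only touches the output” means in practice. This is a tiny made-up numpy model, nothing like a real LLM training loop; the point is only that the loss is computed from the predicted next-token distribution, so gradient descent rewards whatever internal computation happens to predict well, without specifying what that computation is.

```python
import numpy as np

# Toy next-token trainer (illustrative only). The loss below is a function
# of the model's output distribution alone: nothing in the objective
# references or constrains the internal computation.

rng = np.random.default_rng(0)
vocab_size, dim = 5, 8
tokens = rng.integers(0, vocab_size, size=200)   # stand-in "training text"

E = rng.normal(0, 0.1, (vocab_size, dim))        # internal parameters
W = rng.normal(0, 0.1, (dim, vocab_size))

def predict(prev_token: int) -> np.ndarray:
    logits = E[prev_token] @ W
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                       # p(next token | previous token)

lr = 0.1
for epoch in range(50):
    for prev, nxt in zip(tokens[:-1], tokens[1:]):
        p = predict(prev)
        # Cross-entropy on the output only: loss = -log p[nxt].
        # Its gradient w.r.t. the logits is (p - one_hot(nxt)).
        grad_logits = p.copy()
        grad_logits[nxt] -= 1.0
        grad_W = np.outer(E[prev], grad_logits)
        grad_E = W @ grad_logits
        W -= lr * grad_W
        E[prev] -= lr * grad_E
```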
The consciousness thing in particular is perhaps less relevant than functional restrictions.
There is a hypothetical example of simulating a ridiculous number of humans typing text and seeing, among those who have typed out the current text so far, what fraction type each possible next token. In the limit, this approaches the best possible text predictor. And it would simulate a lot of consciousness.
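A small sketch of that limit argument, with toy data (the function name and the tiny “population” are mine, purely illustrative): the predictor just reports, among all texts that begin with the current prefix, the empirical frequency of each next token; as the number of simulated writers grows, that frequency estimate converges to the true conditional distribution.

```python
from collections import Counter
from typing import Dict, List, Tuple

def next_token_distribution(prefix: Tuple[str, ...],
                            human_texts: List[List[str]]) -> Dict[str, float]:
    """Among all (simulated) human-written texts that begin with `prefix`,
    count which token each writer produces next, then normalize."""
    counts: Counter = Counter()
    for text in human_texts:
        if tuple(text[:len(prefix)]) == prefix and len(text) > len(prefix):
            counts[text[len(prefix)]] += 1
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()} if total else {}

# Tiny stand-in "population" for the ridiculous number of simulated humans.
texts = [
    "the cat sat on the mat".split(),
    "the cat sat on the sofa".split(),
    "the cat ran away".split(),
]
print(next_token_distribution(("the", "cat", "sat"), texts))  # {'on': 1.0}
```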