[Question] Are there specific books that it might slightly help alignment to have on the internet?
Books, and ideas, have occasionally changed specific human beings, and thereby history. (I think.)
I used to think it utterly implausible when people suggested that “AIs are our kids, we need to raise them right,” or that e.g. having the right book written about (ethics/philosophy/decision theory/who knows) might directly impact an AI’s worldview (after the AI reads it, in natural language) and thereby the future. But, while I still consider this fairly unlikely, it seems not-impossible to me today. Future LLMs could, AFAICT, have personalities/belief-like-things/temporary-unstable-values-like-things/etc. that are shaped by what’s on the internet. And the LLMs’ initial personalities/beliefs/values may then shape how they change themselves, or how the social networks that include them end up changing them, if and when some LLMs self-modify toward more power.
So I have “what books or ideas might help?” in my shower-thoughts.
One could respond to this possibility by trying to write the right ethical treatises, train-of-thought interface, or similar. More cheaply, one could ask whether there are already-written books that might be at least a little bit helpful, whether those books are already freely available online and within the likely training corpora of near-future LLMs, and, if not, whether we can easily cause them to be.
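(As an aside on the “within the likely training corpora” part: one crude, partial check is whether a page is captured in Common Crawl, which is one common ingredient of LLM training data. Below is a minimal sketch using Common Crawl’s public CDX index API; the crawl ID and example URL are placeholders, not a claim about any particular model’s training set.)

```python
import requests

# One likely ingredient of LLM training corpora is Common Crawl. As a rough
# proxy, we can ask its public CDX index whether a given URL was captured.
# The crawl ID below is an assumption -- see https://index.commoncrawl.org/
# for the current list of available crawls.
CDX_ENDPOINT = "https://index.commoncrawl.org/CC-MAIN-2024-33-index"

def in_common_crawl(url: str) -> bool:
    """Return True if this crawl's CDX index has at least one capture of `url`."""
    resp = requests.get(
        CDX_ENDPOINT,
        params={"url": url, "output": "json", "limit": "1"},
        timeout=30,
    )
    # The index answers with one JSON record per line; a 404 (or an empty
    # body) means no capture of this URL was found in this crawl.
    return resp.status_code == 200 and bool(resp.text.strip())

if __name__ == "__main__":
    # e.g. a Project Gutenberg book page that is plausibly crawled
    print(in_common_crawl("www.gutenberg.org/ebooks/1342"))
```

Presence in one crawl doesn’t guarantee inclusion in any given model’s training data, and absence doesn’t rule it out; this is only a rough first-pass proxy for “accessibly online.”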
Any thoughts on this? I’ll stick my own in the comments. I’ll be focusing mostly on “what existing books might it help to cause to be accessibly online, and are there cheap ways to get those books to be accessibly online?”, but thoughts on other aspects of these questions are also most welcome.