I feel like we can already point powerful cognition at certain things pretty well (e.g. chess), and the problem is figuring out how to point AIs at more useful things (e.g. honestly answering hard questions well). So I don’t know if I’m being nit-picky, but I think the problem is not pointing powerful cognition at anything at all, but rather pointing it at whatever we want.
Simon Mendelsohn
Good paper! Thank you for sharing. I have a few nit-picky suggestions with wording and grammar. I will put them here rather than email directly because some of them are subjective. This way others can feel free to chime in if they feel inclined to nit-pick my nit-picks :)
“artificial general intelligence (AGI) may surpass” → “artificial general intelligence (AGI) seems likely to surpass” (I feel like “may” is a somewhat weak word in this context, but I don’t feel strongly here.)
“undesirable (in other words, misaligned)” → “undesirable (i.e., misaligned)” (This is precisely the situation when “i.e.,” applies, and I think it’s cleaner.)
“trained in similar ways as today’s” → “trained in similar ways to today’s” (See https://english.stackexchange.com/questions/170475/in-a-similar-way-as-or-in-a-similar-way-to)
“However, while caution is deserved, there are several reasons” → “While caution is indeed deserved, there are nonetheless several reasons” (I don’t think “However” is quite right here.)
“Firstly,”, “Secondly,”, etc. → “First,”, “Second,”, etc. (These words are used a couple times throughout the paper. It is generally recommended to use the simple “First” instead of “Firstly”. See https://www.merriam-webster.com/words-at-play/first-or-firstly)
“However, we hope” → “That said, we hope” (Again, I don’t think “However” is quite the right word here. “However” implies a certain contrast with what was previously said that doesn’t apply in this case.)
““energy” in 17th-century physics; “evolutionary fitness” in 19th-century biology; and “computation” in 20th-century mathematics” → ““energy” in 17th-century physics, “evolutionary fitness” in 19th-century biology, and “computation” in 20th-century mathematics” (Semicolons (“;”) can indeed sometimes be used for lists, especially when utilizing complex clauses. That said, commas (“,”) are preferred in grammatically simple cases like this.)
“However, RLHF may reinforce” → “Unfortunately, RLHF may reinforce” (Again, “However” is not quite the right word here.)
This is hilarious and beautiful and exactly what I expect from LessWrong. Also, hello fellow Simon.