In the paper “Reward is Enough”, it is argued that all AI is really RL and that the loss is the reward. On that reading, a language model has a goal function: predict the next word in a text. By this reasoning, your human-level RL system should be equivalent to your GPT-n system.
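For concreteness, here is a minimal sketch of that goal function under the “loss is the reward” framing: training just minimizes cross-entropy on the next token. This is illustrative PyTorch-style code assuming some `model` that maps token ids to logits, not a claim about how any particular GPT-n is actually implemented.

```python
import torch.nn.functional as F

def next_token_loss(model, tokens):
    # tokens: (batch, seq_len) tensor of integer token ids from ordinary text.
    inputs, targets = tokens[:, :-1], tokens[:, 1:]
    logits = model(inputs)  # (batch, seq_len - 1, vocab_size)
    # The whole "goal": make the predicted distribution match the actual next token.
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
    )
```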
That said, my intuition tells me there should be some fundamental difference. It always seemed to me that NLP is the light side of the force and RL is the dark side. Giving AI a numerical goal? That’s how you get paperclips. Giving AI the ability to understand all of human thought and wisdom? That sounds like a better idea.
To give a model of how things could go wrong in your hypothetical, suppose the RL system was misaligned in such a way that, when you give it a goal function like “predict the next word”, it builds a model of the entire planet and all of human society, and then conquers the world to get as much computing power as possible, all because it wants to be 99.9999% sure rather than 99.99% sure that it will predict the next word correctly. A GPT-n system is more chill: it wants to get the next word correct, but that’s less a goal and more an instinct.
However, I think you’re likely to be tempted to put a layer of RL on top of your GPT-n so it can act like an agent, and then we’re back where we started.
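To make “a layer of RL on top” concrete, here is a rough REINFORCE-style sketch. It assumes a hypothetical pretrained `lm` with a `sample` method and some external `reward_fn` that scores whole completions; real agentic setups (RLHF with PPO, tool use, etc.) are more elaborate, but this is roughly where a scalar goal re-enters the picture.

```python
import torch

def rl_finetune_step(lm, optimizer, prompts, reward_fn):
    # Hypothetical interface: lm.sample(prompts) returns sampled completions
    # together with the summed log-probability of the tokens it chose;
    # reward_fn scores a finished completion with a single number.
    completions, log_probs = lm.sample(prompts)  # list[str], (batch,) tensor
    rewards = torch.tensor([reward_fn(c) for c in completions])

    # REINFORCE: increase the probability of completions that scored well.
    loss = -(rewards * log_probs).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```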
I suspect the difference is mostly in what training opportunities are available, not what type of system is used internally.
In principle, a strong NLP AI might learn behaviour that manipulates humans. It’s just that in practice this is much harder for it to learn, because during almost all of training there is no interaction at all: the training input is decoupled from the model’s output, so there is no training signal that would reward any ability to manipulate that input.
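A schematic way to see the decoupling, using purely hypothetical `update_on` / `act` / `step` interfaces for illustration: in pretraining the data stream is fixed in advance, whereas in an interactive loop the system’s own outputs shape what it sees next.

```python
# Schematic contrast, not real training code.

def pretraining_loop(model, corpus):
    for text in corpus:            # the data stream was fixed before training began
        model.update_on(text)      # nothing the model outputs ever changes the corpus

def interactive_loop(agent, environment):
    observation = environment.reset()
    while True:
        action = agent.act(observation)                 # the system's output...
        observation, reward = environment.step(action)  # ...shapes its future input
        agent.update_on(observation, reward)            # so manipulation can pay off
```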
In reality there are some interactive side-channels, such as selecting fine-tuning data based on human evaluation. A sufficiently powerful system might be able to learn enough from that to manipulate the world, but that seems much less likely than some other, more interactively trained kind of system getting there first.