“I tend to think that learning and following the norms of a particular culture (further discussion) isn’t too hard a problem for an AGI which is motivated to do so”. If the AGI is motivated to do so, then the value learning problem is already solved and nothing else matters (in particular, my post becomes irrelevant), because it can indeed learn the further details in whichever way it wants. In that case we have somehow already managed to create an agent with an internal objective that points at Bedouin culture (human values), which is the whole problem.
I could say more about the rest of your comment, but first I’m just checking: does the above change your model of my model significantly?
Also, regarding “I think I’m much more open-minded than you to …”: to be clear, I’m not at all convinced about this; I’m open to this distinction not mattering at all. I hope I didn’t come across as not being open-minded about it.
There’s sorta a use/mention distinction between:
- An AGI with the motivation “I want to follow London cultural norms (whatever those are)”, versus
- An AGI with the motivation “I want to follow the following 500 rules (avoid public nudity, speak English, don’t lick strangers, …), which by the way comprise London cultural norms as I understand them”.
Normally I think of “value learning” (or in this case, “norm learning”) as related to the second bullet point, i.e., the AI watches one or more people and learns their actual preferences and desires. I also had the impression that your OP was along the lines of the second (not first) bullet point.
If that’s right, and if we figure out how to make an agent with the first-bullet-point motivation, then I wouldn’t say that “the value learning problem is already solved”; instead, I would say that we have made great progress towards safe & beneficial AGI in a way that does not involve “solving value learning”. Rather, the agent will hopefully go ahead and solve value learning all by itself.
(I’m not confident that my definitions here are standard or correct, and I’m certainly oversimplifying in various ways.)