I basically agree with you. I think you go too far in saying Lethality 19 is solved, though. Using the 3 feats from your linked comment, which I’ll summarise as “produce a mind that...”:
1. cares about something
2. cares about something external (not a shallow function of local sensory data)
3. cares about something specific and external
(Clearly each one is strictly harder than the previous.) I recognise that Lethality 19 concerns feat 3, though it is worded as if it were about both feat 2 and feat 3.
I think I need to distinguish two versions of feat 3:
3a. there is a reliable (and maybe predictable) mapping between the specific targets of caring and the mind-producing process
3b. there is a principal who gets to choose what the specific targets of caring are (and they succeed)
Humans show that feat 2 at least has been accomplished, but also 3a, as I take you to be pointing out. I maintain that 3b is not demonstrated by humans and is probably something we need.
Hm. I feel confused about the importance of 3b as opposed to 3a. Here’s my first guess: Because we need to target the AI’s motivation in particular ways in order to align it with particular desired goals, it’s important for there not just to be a predictable mapping, but a flexibly steerable one, such that we can choose to steer towards “dog” or “rock” or “cheese wheels” or “cooperating with humans.”
Is this close?
Yes, that sounds right to me.