Paul explicitly writes that the oracle sees both observations and actions: ‘This oracle can be applied to arbitrary sequences of observations and actions […].’
I know; I’m asking how the oracle would have to work in practice. Presumably at some point we will want to actually run the “learning with catastrophes algorithm”, and it will need an oracle, and I’d like to know what needs to be true of the oracle.
This is also covered.
Indeed, my point with that sentence was that it sounds like we are only trying to avoid catastrophes that could have been foreseen, as opposed to literally all catastrophes as the post suggests, which is why the next sentence is:
In the latter case, it sounds like we are trying to train “robust corrigibility” as opposed to “never letting a catastrophe happen”.
“Never letting a catastrophe happen” would incentivize the agent to spend a lot of resources on foreseeing catastrophes and building capacity to ward them off. This would distract from the agent’s main task. So we have to give the agent some slack. Is this what you’re getting at? The oracle needs to decide whether or not the agent can be held accountable for a catastrophe, but the article doesn’t say anything about how it would do this?
The oracle needs to decide whether or not the agent can be held accountable for a catastrophe, but the article doesn’t say anything about how it would do this?
Yes, basically. I’m not saying the article should specify how the oracle should do this, I’m saying that it should flag this as a necessary property of the oracle (or argue why it is not a necessary property).
I agree.