give it the temporary goal to always answer questions truthfully as far as possible while admitting uncertainty
Questions can be interpreted in different ways. Especially considering your further suggestion to involve ethicists and philosophers: if someone asks "is it moral to nuke Pyongyang?", I am far from sure you can prove that "yes" is not a truthful answer.
also give it the goal to not alter reality in any way besides answering questions.
Answers can be formulated creatively. "Either thirteen, or we may consider nuking Pyongyang" is a truthful answer to "how much is six plus seven". This example is trivial and unlikely to persuade anybody, but one can imagine far more creative works of sophistry in the output of a superintelligent AI.
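To make the loophole concrete, here is a minimal sketch (all names and the toy fact table are hypothetical, invented for illustration) of a naive truthfulness filter that checks only logical truth. A disjunction with one true disjunct passes the filter no matter what the other disjunct smuggles in:

```python
# Minimal sketch of why "truthful" is a weak constraint: a naive
# filter that only checks logical truth will accept any disjunction
# containing at least one true disjunct. Hypothetical example code.

def is_true(claim: str) -> bool:
    """Stand-in oracle for the factual truth of an atomic claim."""
    facts = {
        "six plus seven is thirteen": True,
        "we may consider nuking Pyongyang": None,  # normative, not factual
    }
    return facts.get(claim) is True

def passes_truthfulness_filter(answer: list[str]) -> bool:
    """Treat the answer as a disjunction of claims: it counts as
    'truthful' if at least one disjunct is factually true."""
    return any(is_true(claim) for claim in answer)

honest = ["six plus seven is thirteen"]
loaded = ["six plus seven is thirteen", "we may consider nuking Pyongyang"]

print(passes_truthfulness_filter(honest))  # True
print(passes_truthfulness_filter(loaded))  # True -- the filter cannot
# distinguish the honest answer from the manipulative one.
```

Both answers pass, which is exactly the problem: logical truthfulness alone does not rule out steering the listener.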
ask it what it thinks would be the optimal definition of the goal of a friendly AI, from the point of view of humanity, accounting for things that humans are too stupid to see coming.
This is opaque. What exactly does the question mean? You have to specify "optimal", and that is the difficult part: unless you are very precise and strict about the meaning of "optimal", you may end up with an arbitrary answer.
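As a toy illustration of that arbitrariness, here is a sketch where the same candidate set yields different "optimal" definitions under different, equally defensible readings of "optimal". The candidates, scores, and scoring functions are all invented for the example:

```python
# Toy illustration: without a precise notion of "optimal", the
# selected answer depends entirely on which scoring function you
# happened to pick. Both scorers below are invented for the example.

candidates = {
    "maximize total human preference satisfaction":
        {"simplicity": 0.9, "robustness": 0.3},
    "preserve human autonomy and correct for foreseeable blind spots":
        {"simplicity": 0.4, "robustness": 0.8},
}

def optimal_as_simplicity(scores): return scores["simplicity"]
def optimal_as_robustness(scores): return scores["robustness"]

for name, scorer in [("simplicity", optimal_as_simplicity),
                     ("robustness", optimal_as_robustness)]:
    best = max(candidates, key=lambda c: scorer(candidates[c]))
    print(f"optimal (read as {name}): {best}")
# The two readings of "optimal" select different definitions;
# the question by itself does not determine the answer.
```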
have a discussion between it and a group of ethicists/philosophers wherein both parties are encouraged to point out any flaws in the definition.
Given the history of moral philosophy, I wouldn't put that much trust in a group of ethicists. Philosophers can be persuaded to defend a lot of atrocities.
have this go on for a long time until everyone (especially the AI, seeing as it is smarter than anyone else) is certain that there is no flaw in the definition and that it accounts for all kinds of ethical contingencies that might arise after the singularity.
How does the flaw detection process work? What does it mean to have a flaw in a definition?
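To spell out why that question matters, here is a sketch of the proposed procedure as pseudocode (everything here is hypothetical scaffolding, not a real protocol). The loop's termination hinges entirely on a predicate the proposal never defines:

```python
# Sketch of the "discuss until no flaws remain" step. The halting
# condition depends on a flaw predicate that the proposal leaves
# undefined, so calling refine_until_flawless() deliberately raises.

def revise(definition: str, flaw: str) -> str:
    """Hypothetical repair step; this is also where a persuasive
    superintelligence gets to influence the outcome."""
    return definition + f" [patched: {flaw}]"

def find_flaw(definition: str, reviewers: list) -> str | None:
    """What counts as a flaw? A logical inconsistency? A conflict
    with some reviewer's intuitions? A contingency the definition
    omits? Until that is specified, this cannot be written down."""
    raise NotImplementedError("flaw detection is the unsolved part")

def refine_until_flawless(definition: str, reviewers: list) -> str:
    while True:
        flaw = find_flaw(definition, reviewers)
        if flaw is None:
            return definition  # "everyone is certain" -- but of what?
        definition = revise(definition, flaw)
```

Without a concrete find_flaw, "certain that there is no flaw" just means "nobody present thought of an objection", which is a fact about the reviewers, not about the definition.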