kwiat.dev

Karma: 251

[Question] How to validate research ideas?

kwiat.dev Jun 4, 2020, 9:37 PM

12 points

2 comments 1 min read LW link

kwiat.dev Jun 4, 2020, 9:25 PM
4 points
0
on: Ariel Kwiatkowski’s Shortform
Looking for research idea feedback:
Learning to manipulate: consider a system with a large population of agents working toward a certain goal. Their behavior may be learned or rule-based, but at this point it is fixed. This could be an environment of ants using pheromones to collect food and bring it home.
Now add another agent (or some number of them) which learns in this environment, and tries to get other agents to instead fulfil a different goal. It could be ants redirecting others to a different “home”, hijacking their work.

Does this sound interesting? If it works, would it potentially be publishable as a research paper (or at least a post on LW)? Any other feedback is welcome!

kwiat.dev May 30, 2020, 8:02 PM
7 points
0
in reply to: Draconarius’s comment on: Draconarius’s Shortform
But isn’t the whole point that the hotel is full initially, and yet can accept more guests?

kwiat.dev May 30, 2020, 7:58 PM
11 points
0
on: Ariel Kwiatkowski’s Shortform
Has anyone tried to work with neural networks predicting the weights of other neural networks? I’m thinking about that in the context of something like subsystem alignment, e.g. in an RL setting where an agent first learns about the environment, and then creates a subagent (by outputting the weights or some embedding of its policy) which actually obtains the reward.
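To make the "outputting the weights" part concrete, here is a minimal sketch of one network emitting the parameters of another. Everything in it is an assumption made for illustration: the names `HyperNetwork` and `generate_policy`, the single linear layer, the task embedding as input, and the greedy linear subagent policy are all hypothetical, and nothing is trained here.

```python
import random


def linear(weights, bias, x):
    """Dense layer: `weights` holds one row of input weights per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class HyperNetwork:
    """Hypothetical parent network: maps an embedding of what the agent has
    learned to the flat parameter vector of a small linear subagent policy."""

    def __init__(self, embed_dim, obs_dim, n_actions, seed=0):
        rng = random.Random(seed)
        self.obs_dim = obs_dim
        self.n_actions = n_actions
        # The subagent needs an (n_actions x obs_dim) matrix plus a bias vector.
        self.n_params = obs_dim * n_actions + n_actions
        self.weights = [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)]
                        for _ in range(self.n_params)]
        self.bias = [0.0] * self.n_params

    def generate_policy(self, embedding):
        """Emit the subagent's weights and wrap them in a callable policy."""
        flat = linear(self.weights, self.bias, embedding)
        split = self.obs_dim * self.n_actions
        sub_w = [flat[i * self.obs_dim:(i + 1) * self.obs_dim]
                 for i in range(self.n_actions)]
        sub_b = flat[split:]

        def policy(obs):
            # Greedy action: index of the largest logit.
            logits = linear(sub_w, sub_b, obs)
            return max(range(self.n_actions), key=lambda a: logits[a])

        return policy


hyper = HyperNetwork(embed_dim=3, obs_dim=4, n_actions=2)
policy = hyper.generate_policy([1.0, 0.0, -1.0])
action = policy([0.5, -0.2, 0.1, 0.9])
```

In an actual RL setup the reward collected by the subagent would have to be propagated back into the parent's parameters in some way; that part is left out of the sketch.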

Ariel Kwiatkowski’s Shortform

kwiat.dev May 30, 2020, 7:58 PM

2 points

4 comments LW link

kwiat.dev May 16, 2020, 9:49 PM
3 points
0
on: Multi-agent safety
This reminds me of an idea bouncing around my mind recently, admittedly not aiming to solve this problem, but possibly exhibiting it.
Drawing inspiration from human evolution: given a sufficiently rich environment where agents have some necessities for survival (like gathering food), they could be pretrained with something like a survival prior, which doesn’t require any specific reward signals.
Then, agents produced this way could be fine-tuned for downstream tasks, or, in a sense, for obeying orders. The problem would arise when an agent is given an order that results in its death. We might want to ensure it follows its original (survival) instinct, unless overridden by a more specific order.
And going back to a multiagent scenario, similar issues might arise when an order requires antisocial behavior in a usually cooperative environment. The AI Economist comes to mind as a setting where that could come into play, since its agents actually learn some nontrivial social relations: https://blog.einstein.ai/the-ai-economist/

[Question] How to choose a PhD with AI Safety in mind

kwiat.dev May 15, 2020, 10:19 PM

10 points

1 comment 1 min read LW link
