Stuart_Armstrong

Karma: 17,986

Consistencies as (meta-)preferences

Stuart_Armstrong, May 3, 2021, 3:10 PM

17 points

0 comments, 3 min read

Why unriggable almost implies uninfluenceable

Stuart_Armstrong, Apr 9, 2021, 5:07 PM

11 points

0 comments, 4 min read

A possible preference algorithm

Stuart_Armstrong, Apr 8, 2021, 6:25 PM

22 points

0 comments, 4 min read

If you don’t design for extrapolation, you’ll extrapolate poorly—possibly fatally

Stuart_Armstrong, Apr 8, 2021, 6:10 PM

17 points

0 comments, 4 min read

Which counterfactuals should an AI follow?

Stuart_Armstrong, Apr 7, 2021, 4:47 PM

19 points

5 comments, 7 min read

Toy model of preference, bias, and extra information

Stuart_Armstrong, Mar 24, 2021, 10:14 AM

9 points

0 comments, 4 min read

Preferences and biases, the information argument

Stuart_Armstrong, Mar 23, 2021, 12:44 PM

14 points

5 comments, 1 min read

Why sigmoids are so hard to predict

Stuart_Armstrong, Mar 18, 2021, 6:21 PM

56 points

7 comments, 5 min read

Connecting the good regulator theorem with semantics and symbol grounding

Stuart_Armstrong, Mar 4, 2021, 2:35 PM

13 points

0 comments, 2 min read

Cartesian frames as generalised models

Stuart_Armstrong, Feb 16, 2021, 4:09 PM

20 points

0 comments, 5 min read

Generalised models as a category

Stuart_Armstrong, Feb 16, 2021, 4:08 PM

25 points

9 comments, 4 min read

Counterfactual control incentives

Stuart_Armstrong, Jan 21, 2021, 4:54 PM

21 points

10 comments, 9 min read

Short summary of mAIry’s room

Stuart_Armstrong, Jan 18, 2021, 6:11 PM

26 points

2 comments, 4 min read

Syntax, semantics, and symbol grounding, simplified

Stuart_Armstrong, Nov 23, 2020, 4:12 PM

30 points

4 comments, 9 min read

The ethics of AI for the Routledge Encyclopedia of Philosophy

Stuart_Armstrong, Nov 18, 2020, 5:55 PM

45 points

8 comments, 1 min read

Extortion beats brinksmanship, but the audience matters

Stuart_Armstrong, Nov 16, 2020, 9:13 PM

27 points

15 comments, 4 min read

Humans are stunningly rational and stunningly irrational

Stuart_Armstrong, Oct 23, 2020, 2:13 PM

23 points

4 comments, 2 min read

Knowledge, manipulation, and free will

Stuart_Armstrong, Oct 13, 2020, 5:47 PM

33 points

15 comments, 3 min read

Dehumanisation errors

Stuart_Armstrong, Sep 23, 2020, 9:51 AM

13 points

0 comments, 1 min read

Anthropomorphisation vs value learning: type 1 vs type 2 errors

Stuart_Armstrong, Sep 22, 2020, 10:46 AM

16 points

10 comments, 1 min read