Coherence Arguments
Tag
There are no coherence theorems
Dan H
and
EJT
20 Feb 2023 21:25 UTC
145
points
124
comments
19
min read
LW
link
Coherent decisions imply consistent utilities
Eliezer Yudkowsky
12 May 2019 21:33 UTC
149
points
81
comments
26
min read
LW
link
3
reviews
Coherence arguments do not entail goal-directed behavior
Rohin Shah
3 Dec 2018 3:26 UTC
133
points
69
comments
7
min read
LW
link
3
reviews
[Question]
What do coherence arguments actually prove about agentic behavior?
sunwillrise
1 Jun 2024 9:37 UTC
123
points
35
comments
6
min read
LW
link
Coherence arguments imply a force for goal-directed behavior
KatjaGrace
26 Mar 2021 16:10 UTC
91
points
25
comments
11
min read
LW
link
1
review
(aiimpacts.org)
Coherence of Caches and Agents
johnswentworth
1 Apr 2024 23:04 UTC
76
points
9
comments
11
min read
LW
link
A Simple Toy Coherence Theorem
johnswentworth
and
David Lorell
2 Aug 2024 17:47 UTC
74
points
19
comments
7
min read
LW
link
When Most VNM-Coherent Preference Orderings Have Convergent Instrumental Incentives
TurnTrout
9 Aug 2021 17:22 UTC
53
points
4
comments
5
min read
LW
link
Counting-down vs. counting-up coherence
TsviBT
27 Feb 2023 14:59 UTC
29
points
4
comments
13
min read
LW
link
Contra “Strong Coherence”
DragonGod
4 Mar 2023 20:05 UTC
39
points
24
comments
1
min read
LW
link
[Question]
Is “Strong Coherence” Anti-Natural?
DragonGod
11 Apr 2023 6:22 UTC
23
points
25
comments
2
min read
LW
link
[Question]
Money Pump Arguments assume Memoryless Agents. Isn’t this Unrealistic?
Dalcy
16 Aug 2024 4:16 UTC
23
points
6
comments
1
min read
LW
link
The Impossibility of a Rational Intelligence Optimizer
Nicolas Villarreal
6 Jun 2024 16:14 UTC
−9
points
5
comments
14
min read
LW
link
Three ways that “Sufficiently optimized agents appear coherent” can be false
Wei Dai
5 Mar 2019 21:52 UTC
65
points
3
comments
3
min read
LW
link
Comment on Coherence arguments do not imply goal directed behavior
Ronny Fernandez
6 Dec 2019 9:30 UTC
30
points
8
comments
5
min read
LW
link
[Question]
Is there a “coherent decisions imply consistent utilities”-style argument for non-lexicographic preferences?
Tetraspace
29 Jun 2021 19:14 UTC
4
points
20
comments
1
min read
LW
link
The hot mess theory of AI misalignment: More intelligent agents behave less coherently
Jonathan Yan
10 Mar 2023 0:20 UTC
47
points
21
comments
1
min read
LW
link
(sohl-dickstein.github.io)
Deriving Conditional Expected Utility from Pareto-Efficient Decisions
Thomas Kwa
5 May 2022 3:21 UTC
24
points
1
comment
6
min read
LW
link
The “Measuring Stick of Utility” Problem
johnswentworth
25 May 2022 16:17 UTC
74
points
25
comments
3
min read
LW
link
[Question]
Why The Focus on Expected Utility Maximisers?
DragonGod
27 Dec 2022 15:49 UTC
116
points
84
comments
3
min read
LW
link
[Request for Distillation] Coherence of Distributed Decisions With Different Inputs Implies Conditioning
johnswentworth
25 Apr 2022 17:01 UTC
22
points
14
comments
2
min read
LW
link
Measuring Coherence and Goal-Directedness in RL Policies
dx26
22 Apr 2024 18:26 UTC
10
points
0
comments
7
min read
LW
link
Coherent behaviour in the real world is an incoherent concept
Richard_Ngo
11 Feb 2019 17:00 UTC
51
points
17
comments
9
min read
LW
link
Let’s look for coherence theorems
Valdes
7 May 2023 14:45 UTC
25
points
18
comments
6
min read
LW
link
[Linkpost] Will AI avoid exploitation?
cdkg
6 Aug 2023 14:28 UTC
22
points
1
comment
1
min read
LW
link
Do incoherent entities have stronger reason to become more coherent than less?
KatjaGrace
30 Jun 2021 5:50 UTC
46
points
5
comments
4
min read
LW
link
(worldspiritsockpuppet.com)
It Can’t Be Mesa-Optimizers All The Way Down (Or Else It Can’t Be Long-Term Supercoherence?)
Austin Witte
31 Mar 2023 7:21 UTC
20
points
5
comments
4
min read
LW
link