tailcalled
Karma: 7,456
Evolution’s selection target depends on your weighting
tailcalled
19 Nov 2024 18:24 UTC, 23 points, 22 comments, 1 min read
Empathy/Systemizing Quotient is a poor/biased model for the autism/sex link
tailcalled
4 Nov 2024 21:11 UTC, 33 points, 0 comments, 7 min read
Binary encoding as a simple explicit construction for superposition
tailcalled
12 Oct 2024 21:18 UTC, 12 points, 0 comments, 1 min read
Rationalist Gnosticism
tailcalled
10 Oct 2024 9:06 UTC, 9 points, 10 comments, 3 min read
RLHF is the worst possible thing done when facing the alignment problem
tailcalled
19 Sep 2024 18:56 UTC, 32 points, 10 comments, 6 min read
[Question] Does life actually locally *increase* entropy?
tailcalled
16 Sep 2024 20:30 UTC, 10 points, 27 comments, 1 min read
Why I’m bearish on mechanistic interpretability: the shards are not in the network
tailcalled
13 Sep 2024 17:09 UTC, 19 points, 40 comments, 1 min read
In defense of technological unemployment as the main AI concern
tailcalled
27 Aug 2024 17:58 UTC, 44 points, 36 comments, 1 min read
The causal backbone conjecture
tailcalled
17 Aug 2024 18:50 UTC, 26 points, 0 comments, 2 min read
Rationalists are missing a core piece for agent-like structure (energy vs information overload)
tailcalled
17 Aug 2024 9:57 UTC, 59 points, 9 comments, 4 min read
[LDSL#6] When is quantification needed, and when is it hard?
tailcalled
13 Aug 2024 20:39 UTC, 31 points, 0 comments, 2 min read
[LDSL#5] Comparison and magnitude/diminishment
tailcalled
12 Aug 2024 18:47 UTC, 21 points, 0 comments, 2 min read
[LDSL#4] Root cause analysis versus effect size estimation
tailcalled
11 Aug 2024 16:12 UTC, 29 points, 0 comments, 2 min read
[LDSL#3] Information-orientation is in tension with magnitude-orientation
tailcalled
10 Aug 2024 21:58 UTC, 22 points, 2 comments, 3 min read
[LDSL#2] Latent variable models, network models, and linear diffusion of sparse lognormals
tailcalled
9 Aug 2024 19:57 UTC, 23 points, 2 comments, 3 min read
[LDSL#1] Performance optimization as a metaphor for life
tailcalled
8 Aug 2024 16:16 UTC, 31 points, 4 comments, 5 min read
[LDSL#0] Some epistemological conundrums
tailcalled
7 Aug 2024 19:52 UTC, 49 points, 10 comments, 10 min read
Yann LeCun: We only design machines that minimize costs [therefore they are safe]
tailcalled
15 Jun 2024 17:25 UTC, 19 points, 8 comments, 1 min read (twitter.com)
DPO/PPO-RLHF on LLMs incentivizes sycophancy, exaggeration and deceptive hallucination, but not misaligned powerseeking
tailcalled
10 Jun 2024 21:20 UTC, 29 points, 13 comments, 2 min read
Each Llama3-8b text uses a different “random” subspace of the activation space
tailcalled
22 May 2024 7:31 UTC, 3 points, 4 comments, 7 min read