GÖDEL GOING DOWN
Jimdrix_Hendri
6 Mar 2023 23:06 UTC
−9
points
3
comments
1
min read
LW
link
Against ubiquitous alignment taxes
beren
6 Mar 2023 19:50 UTC
57
points
10
comments
2
min read
LW
link
Addendum: basic facts about language models during training
beren
6 Mar 2023 19:24 UTC
22
points
2
comments
5
min read
LW
link
Understanding The Roots Of Mathematics Before Finding The Roots Of A Function.
LiesLaris
6 Mar 2023 18:47 UTC
2
points
0
comments
1
min read
LW
link
Discussion: LLaMA Leak & Whistleblowing in pre-AGI era
jirahim
6 Mar 2023 18:47 UTC
1
point
4
comments
1
min read
LW
link
[Question]
Are we too confident about unaligned AGI killing off humanity?
RomanS
6 Mar 2023 16:19 UTC
21
points
63
comments
1
min read
LW
link
Introducing Leap Labs, an AI interpretability startup
Jessica Rumbelow
6 Mar 2023 16:16 UTC
103
points
12
comments
1
min read
LW
link
Monthly Roundup #4: March 2023
Zvi
6 Mar 2023 14:10 UTC
31
points
0
comments
24
min read
LW
link
(thezvi.wordpress.com)
Fundamental Uncertainty: Chapter 6 - How can we be certain about the truth?
Gordon Seidoh Worley
6 Mar 2023 13:52 UTC
10
points
18
comments
16
min read
LW
link
The idea
JNS
6 Mar 2023 13:42 UTC
3
points
0
comments
9
min read
LW
link
Honesty, Openness, Trustworthiness, and Secrets
NormanPerlmutter
6 Mar 2023 9:03 UTC
13
points
0
comments
9
min read
LW
link
EA & LW Forum Weekly Summary (27th Feb − 5th Mar 2023)
Zoe Williams
6 Mar 2023 3:18 UTC
12
points
0
comments
1
min read
LW
link
The Type II Inner-Compass Theorem
Tristan Miano
6 Mar 2023 2:35 UTC
−16
points
0
comments
22
min read
LW
link
AGI’s Impact on Employment
TheUnkown
6 Mar 2023 1:56 UTC
1
point
1
comment
1
min read
LW
link
(www.apricitas.io)
Why did you trash the old HPMOR.com?
AnnoyedReader
6 Mar 2023 1:55 UTC
55
points
68
comments
2
min read
LW
link
Cap Model Size for AI Safety
research_prime_space
6 Mar 2023 1:11 UTC
0
points
4
comments
1
min read
LW
link
What should we do about network-effect monopolies?
benkuhn
6 Mar 2023 0:50 UTC
31
points
7
comments
1
min read
LW
link
(www.benkuhn.net)
Who Aligns the Alignment Researchers?
Ben Smith
5 Mar 2023 23:22 UTC
48
points
0
comments
11
min read
LW
link
Startups are like firewood
Adam Zerner
5 Mar 2023 23:09 UTC
26
points
2
comments
3
min read
LW
link
A concerning observation from media coverage of AI industry dynamics
Justin Olive
5 Mar 2023 21:38 UTC
8
points
3
comments
3
min read
LW
link
Steven Pinker on ChatGPT and AGI (Feb 2023)
Evan R. Murphy
5 Mar 2023 21:34 UTC
11
points
8
comments
1
min read
LW
link
(news.harvard.edu)
Is it time to talk about AI doomsday prepping yet?
bokov
5 Mar 2023 21:17 UTC
0
points
8
comments
1
min read
LW
link
Coordination explosion before intelligence explosion...?
tailcalled
5 Mar 2023 20:48 UTC
47
points
9
comments
2
min read
LW
link
The Ogdoad
Tristan Miano
5 Mar 2023 20:01 UTC
−15
points
1
comment
37
min read
LW
link
[Question]
What are some good ways to heighten my emotions?
oh54321
5 Mar 2023 18:06 UTC
5
points
5
comments
1
min read
LW
link
Research proposal: Leveraging Jungian archetypes to create values-based models
MiguelDev
5 Mar 2023 17:39 UTC
5
points
2
comments
2
min read
LW
link
Abusing Snap Circuits IC
jefftk
5 Mar 2023 17:00 UTC
19
points
3
comments
3
min read
LW
link
(www.jefftk.com)
Do humans derive values from fictitious imputed coherence?
TsviBT
5 Mar 2023 15:23 UTC
45
points
8
comments
14
min read
LW
link
The Inner-Compass Theorem
Tristan Miano
5 Mar 2023 15:21 UTC
−18
points
12
comments
16
min read
LW
link
Halifax Monthly Meetup: AI Safety Discussion
Ideopunk
5 Mar 2023 12:42 UTC
10
points
0
comments
1
min read
LW
link
Why kill everyone?
arisAlexis
5 Mar 2023 11:53 UTC
7
points
5
comments
2
min read
LW
link
Selective, Corrective, Structural: Three Ways of Making Social Systems Work
Said Achmiz
5 Mar 2023 8:45 UTC
99
points
13
comments
2
min read
LW
link
Substitute goods for leisure are abundant
Adam Zerner
5 Mar 2023 3:45 UTC
20
points
7
comments
5
min read
LW
link
[Question]
Does polyamory at a workplace turn nepotism up to eleven?
Viliam
5 Mar 2023 0:57 UTC
45
points
11
comments
2
min read
LW
link
Why We MUST Build an (aligned) Artificial Superintelligence That Takes Over Human Society—A Thought Experiment
twkaiser
5 Mar 2023 0:47 UTC
−13
points
12
comments
2
min read
LW
link
Forecasts on Moore v Harper from Samotsvety
gregjustice
5 Mar 2023 0:47 UTC
7
points
0
comments
1
min read
LW
link
(samotsvety.org)
Why Not Just… Build Weak AI Tools For AI Alignment Research?
johnswentworth
5 Mar 2023 0:12 UTC
175
points
18
comments
6
min read
LW
link
Consciousness is irrelevant—instead solve alignment by asking this question
Oliver Siegel
4 Mar 2023 22:06 UTC
−10
points
6
comments
1
min read
LW
link
More money with less risk: sell services instead of model access
lemonhope
4 Mar 2023 20:51 UTC
9
points
3
comments
1
min read
LW
link
Contra “Strong Coherence”
DragonGod
4 Mar 2023 20:05 UTC
39
points
24
comments
1
min read
LW
link
The Practitioner’s Path 2.0: A new framework for structured self-improvement
Evenflair
4 Mar 2023 19:19 UTC
32
points
2
comments
11
min read
LW
link
(guildoftherose.org)
The Benefits of Distillation in Research
Jonas Hallgren
4 Mar 2023 17:45 UTC
15
points
2
comments
5
min read
LW
link
Optimal Music Choice
mbazzani
4 Mar 2023 17:26 UTC
5
points
0
comments
1
min read
LW
link
Why don’t more people talk about ecological psychology?
Ppau
4 Mar 2023 17:03 UTC
21
points
10
comments
7
min read
LW
link
Switching to Electric Mandolin
jefftk
4 Mar 2023 15:40 UTC
16
points
0
comments
1
min read
LW
link
(www.jefftk.com)
Predictive Performance on Metaculus vs. Manifold Markets
nikos
4 Mar 2023 8:10 UTC
18
points
0
comments
5
min read
LW
link
Contra Hanson on AI Risk
Liron
4 Mar 2023 8:02 UTC
36
points
23
comments
8
min read
LW
link
Bite Sized Tasks
Johannes C. Mayer
4 Mar 2023 3:31 UTC
18
points
2
comments
2
min read
LW
link
How popular is ChatGPT? Part 2: slower growth than Pokémon GO
Richard Korzekwa
3 Mar 2023 23:40 UTC
42
points
4
comments
6
min read
LW
link
(aiimpacts.org)
Acausal normalcy
Andrew_Critch
3 Mar 2023 23:34 UTC
194
points
36
comments
8
min read
LW
link
1
review