[Interim research report] Taking features out of superposition with sparse autoencoders
Lee Sharkey
,
Dan Braun
and
beren
Dec 13, 2022, 3:41 PM
150
points
23
comments
22
min read
LW
link
2
reviews
[Question]
Is the ChatGPT-simulated Linux virtual machine real?
Kenoubi
Dec 13, 2022, 3:41 PM
18
points
7
comments
1
min read
LW
link
Existential AI Safety is NOT separate from near-term applications
scasper
Dec 13, 2022, 2:47 PM
37
points
17
comments
3
min read
LW
link
What is the correlation between upvoting and benefit to readers of LW?
banev
Dec 13, 2022, 2:26 PM
7
points
15
comments
1
min read
LW
link
Limits of Superintelligence
Aleksei Petrenko
Dec 13, 2022, 12:19 PM
1
point
5
comments
1
min read
LW
link
Bay 2022 Solstice
Raemon
Dec 13, 2022, 8:58 AM
17
points
0
comments
1
min read
LW
link
Last day to nominate things for the Review. Also, 2019 books still exist.
Raemon
Dec 13, 2022, 8:53 AM
15
points
0
comments
1
min read
LW
link
AI alignment is distinct from its near-term applications
paulfchristiano
Dec 13, 2022, 7:10 AM
255
points
21
comments
2
min read
LW
link
(ai-alignment.com)
Take 10: Fine-tuning with RLHF is aesthetically unsatisfying.
Charlie Steiner
Dec 13, 2022, 7:04 AM
37
points
3
comments
2
min read
LW
link
[Question]
Are lawsuits against AGI companies extending AGI timelines?
SlowingAGI
Dec 13, 2022, 6:00 AM
1
point
1
comment
1
min read
LW
link
EA & LW Forums Weekly Summary (5th Dec - 11th Dec 22’)
Zoe Williams
Dec 13, 2022, 2:53 AM
7
points
0
comments
LW
link
Alignment with argument-networks and assessment-predictions
Tor Økland Barstad
Dec 13, 2022, 2:17 AM
10
points
5
comments
45
min read
LW
link
Revisiting algorithmic progress
Tamay
and
Ege Erdil
Dec 13, 2022, 1:39 AM
95
points
15
comments
2
min read
LW
link
1
review
(arxiv.org)
An exploration of GPT-2’s embedding weights
Adam Scherlis
Dec 13, 2022, 12:46 AM
44
points
4
comments
10
min read
LW
link
12 career-related questions that may (or may not) be helpful for people interested in alignment research
Orpheus16
Dec 12, 2022, 10:36 PM
20
points
0
comments
2
min read
LW
link
Concept extrapolation for hypothesis generation
Stuart_Armstrong
,
Patrick Leask
and
rgorman
Dec 12, 2022, 10:09 PM
20
points
2
comments
3
min read
LW
link
Let’s go meta: Grammatical knowledge and self-referential sentences [ChatGPT]
Bill Benzon
Dec 12, 2022, 9:50 PM
5
points
0
comments
9
min read
LW
link
D&D.Sci December 2022 Evaluation and Ruleset
abstractapplic
Dec 12, 2022, 9:21 PM
17
points
8
comments
2
min read
LW
link
Log-odds are better than Probabilities
Robert_AIZI
Dec 12, 2022, 8:10 PM
22
points
4
comments
4
min read
LW
link
(aizi.substack.com)
Bengaluru LW/ACX Social Meetup—December 2022
faiz
Dec 12, 2022, 7:30 PM
4
points
0
comments
1
min read
LW
link
Psychological Disorders and Problems
adamShimi
and
Gabriel Alfour
Dec 12, 2022, 6:15 PM
39
points
6
comments
1
min read
LW
link
Confusing the goal and the path
adamShimi
Dec 12, 2022, 4:42 PM
44
points
7
comments
1
min read
LW
link
(epistemologicalvigilance.substack.com)
Meaningful things are those the universe possesses a semantics for
Abhimanyu Pallavi Sudhir
Dec 12, 2022, 4:03 PM
16
points
14
comments
14
min read
LW
link
Tradeoffs in complexity, abstraction, and generality
remember
and
Gabriel Alfour
Dec 12, 2022, 3:55 PM
32
points
0
comments
2
min read
LW
link
Green Line Extension Opening Dates
jefftk
Dec 12, 2022, 2:40 PM
12
points
0
comments
1
min read
LW
link
(www.jefftk.com)
Join the AI Testing Hackathon this Friday
Esben Kran
Dec 12, 2022, 2:24 PM
10
points
0
comments
LW
link
Side-channels: input versus output
davidad
Dec 12, 2022, 12:32 PM
44
points
16
comments
2
min read
LW
link
Take 9: No, RLHF/IDA/debate doesn’t solve outer alignment.
Charlie Steiner
Dec 12, 2022, 11:51 AM
33
points
13
comments
2
min read
LW
link
Creating a database for base rates
nikos
Dec 12, 2022, 10:09 AM
2
points
1
comment
3
min read
LW
link
(forum.effectivealtruism.org)
Trivial GPT-3.5 limitation workaround
Dave Lindbergh
Dec 12, 2022, 8:42 AM
5
points
4
comments
1
min read
LW
link
Ponzi schemes can be highly profitable if your timing is good
GeneSmith
Dec 12, 2022, 6:42 AM
10
points
18
comments
5
min read
LW
link
Prodding ChatGPT to solve a basic algebra problem
Shmi
Dec 12, 2022, 4:09 AM
14
points
6
comments
1
min read
LW
link
(twitter.com)
Wider Default Audio Player in Chrome?
jefftk
Dec 12, 2022, 3:30 AM
11
points
2
comments
1
min read
LW
link
(www.jefftk.com)
A brainteaser for language models
Adam Scherlis
Dec 12, 2022, 2:43 AM
47
points
3
comments
2
min read
LW
link
Benchmarks for Comparing Human and AI Intelligence
MrThink
Dec 11, 2022, 10:06 PM
9
points
4
comments
2
min read
LW
link
Reflections on the PIBBSS Fellowship 2022
Nora_Ammann
and
particlemania
Dec 11, 2022, 9:53 PM
32
points
0
comments
18
min read
LW
link
A crisis for online communication: bots and bot users will overrun the Internet?
Mitchell_Porter
Dec 11, 2022, 9:11 PM
15
points
11
comments
1
min read
LW
link
Finite Factored Sets in Pictures
Magdalena Wache
Dec 11, 2022, 6:49 PM
174
points
35
comments
12
min read
LW
link
Formalization as suspension of intuition
adamShimi
Dec 11, 2022, 3:16 PM
54
points
18
comments
1
min read
LW
link
(epistemologicalvigilance.substack.com)
An argument on animal consciousness (soliciting criticism)
SciHamster
Dec 11, 2022, 3:12 PM
1
point
2
comments
1
min read
LW
link
ChatGPT’s new novel rationality technique of fact checking
ChristianKl
Dec 11, 2022, 1:54 PM
−14
points
7
comments
1
min read
LW
link
Reframing inner alignment
davidad
Dec 11, 2022, 1:53 PM
53
points
13
comments
4
min read
LW
link
A poem about applied rationality by ChatGPT
ChristianKl
11 Dec 2022 13:43 UTC
4
points
0
comments
1
min read
LW
link
ChatGPT goes through a wormhole hole in our Shandyesque universe [virtual wacky weed]
Bill Benzon
11 Dec 2022 11:59 UTC
−1
points
2
comments
3
min read
LW
link
Using Obsidian if you’re used to using Roam
Solenoid_Entity
11 Dec 2022 8:59 UTC
19
points
4
comments
2
min read
LW
link
[fiction] Our Final Hour
Mati_Roy
11 Dec 2022 5:49 UTC
23
points
5
comments
3
min read
LW
link
Consider using reversible automata for alignment research
Alex_Altair
11 Dec 2022 1:00 UTC
88
points
30
comments
2
min read
LW
link
High level discourse structure in ChatGPT: Part 2 [Quasi-symbolic?]
Bill Benzon
10 Dec 2022 22:26 UTC
7
points
0
comments
6
min read
LW
link
Poll Results on AGI
Niclas Kupper
10 Dec 2022 21:25 UTC
18
points
0
comments
2
min read
LW
link
Reflecting on the 2022 Guild of the Rose Workshops
moridinamael
10 Dec 2022 21:21 UTC
26
points
7
comments
8
min read
LW
link