Who Aligns the Alignment Researchers?

Ben Smith · 5 Mar 2023 23:22 UTC
48 points
0 comments · 11 min read · LW link

Startups are like firewood

Adam Zerner · 5 Mar 2023 23:09 UTC
26 points
2 comments · 3 min read · LW link

A concerning observation from media coverage of AI industry dynamics

Justin Olive · 5 Mar 2023 21:38 UTC
8 points
3 comments · 3 min read · LW link

Steven Pinker on ChatGPT and AGI (Feb 2023)

Evan R. Murphy · 5 Mar 2023 21:34 UTC
11 points
8 comments · 1 min read · LW link
(news.harvard.edu)

Is it time to talk about AI doomsday prepping yet?

bokov · 5 Mar 2023 21:17 UTC
0 points
8 comments · 1 min read · LW link

Coordination explosion before intelligence explosion...?

tailcalled · 5 Mar 2023 20:48 UTC
47 points
9 comments · 2 min read · LW link

The Ogdoad

Tristan Miano · 5 Mar 2023 20:01 UTC
−15 points
1 comment · 37 min read · LW link

[Question] What are some good ways to heighten my emotions?

oh54321 · 5 Mar 2023 18:06 UTC
5 points
5 comments · 1 min read · LW link

Research proposal: Leveraging Jungian archetypes to create values-based models

MiguelDev · 5 Mar 2023 17:39 UTC
5 points
2 comments · 2 min read · LW link

Abusing Snap Circuits IC

jefftk · 5 Mar 2023 17:00 UTC
19 points
3 comments · 3 min read · LW link
(www.jefftk.com)

Do humans derive values from fictitious imputed coherence?

TsviBT · 5 Mar 2023 15:23 UTC
45 points
8 comments · 14 min read · LW link

The Inner-Compass Theorem

Tristan Miano · 5 Mar 2023 15:21 UTC
−18 points
12 comments · 16 min read · LW link

QACI: the problem of blob location, causality, and counterfactuals

Tamsin Leake · 5 Mar 2023 14:06 UTC
23 points
1 comment · 2 min read · LW link
(carado.moe)

Halifax Monthly Meetup: AI Safety Discussion

Ideopunk · 5 Mar 2023 12:42 UTC
10 points
0 comments · 1 min read · LW link

Why kill everyone?

arisAlexis · 5 Mar 2023 11:53 UTC
7 points
5 comments · 2 min read · LW link

Selective, Corrective, Structural: Three Ways of Making Social Systems Work

Said Achmiz · 5 Mar 2023 8:45 UTC
99 points
13 comments · 2 min read · LW link

Substitute goods for leisure are abundant

Adam Zerner · 5 Mar 2023 3:45 UTC
20 points
7 comments · 5 min read · LW link

[Question] Does polyamory at a workplace turn nepotism up to eleven?

Viliam · 5 Mar 2023 0:57 UTC
45 points
11 comments · 2 min read · LW link

Why We MUST Build an (aligned) Artificial Superintelligence That Takes Over Human Society—A Thought Experiment

twkaiser · 5 Mar 2023 0:47 UTC
−13 points
12 comments · 2 min read · LW link

Forecasts on Moore v Harper from Samotsvety

gregjustice · 5 Mar 2023 0:47 UTC
7 points
0 comments · 1 min read · LW link
(samotsvety.org)

Why Not Just… Build Weak AI Tools For AI Alignment Research?

johnswentworth · 5 Mar 2023 0:12 UTC
158 points
18 comments · 6 min read · LW link

Consciousness is irrelevant—instead solve alignment by asking this question

Oliver Siegel · 4 Mar 2023 22:06 UTC
−10 points
6 comments · 1 min read · LW link

More money with less risk: sell services instead of model access

lemonhope · 4 Mar 2023 20:51 UTC
9 points
3 comments · 1 min read · LW link

Contra “Strong Coherence”

DragonGod · 4 Mar 2023 20:05 UTC
39 points
24 comments · 1 min read · LW link

The Practitioner’s Path 2.0: A new framework for structured self-improvement

Evenflair · 4 Mar 2023 19:19 UTC
32 points
2 comments · 11 min read · LW link
(guildoftherose.org)

The Benefits of Distillation in Research

Jonas Hallgren · 4 Mar 2023 17:45 UTC
15 points
2 comments · 5 min read · LW link

Optimal Music Choice

mbazzani · 4 Mar 2023 17:26 UTC
5 points
0 comments · 1 min read · LW link

Why don’t more people talk about ecological psychology?

Ppau · 4 Mar 2023 17:03 UTC
21 points
10 comments · 7 min read · LW link

Switching to Electric Mandolin

jefftk · 4 Mar 2023 15:40 UTC
16 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Predictive Performance on Metaculus vs. Manifold Markets

nikos · 4 Mar 2023 8:10 UTC
18 points
0 comments · 5 min read · LW link

Contra Hanson on AI Risk

Liron · 4 Mar 2023 8:02 UTC
36 points
23 comments · 8 min read · LW link

Bite Sized Tasks

Johannes C. Mayer · 4 Mar 2023 3:31 UTC
18 points
2 comments · 2 min read · LW link

How popular is ChatGPT? Part 2: slower growth than Pokémon GO

Richard Korzekwa · 3 Mar 2023 23:40 UTC
42 points
4 comments · 6 min read · LW link
(aiimpacts.org)

Acausal normalcy

Andrew_Critch · 3 Mar 2023 23:34 UTC
181 points
30 comments · 8 min read · LW link

Comments on OpenAI’s “Planning for AGI and beyond”

So8res · 3 Mar 2023 23:01 UTC
148 points
2 comments · 14 min read · LW link

Why are counterfactuals elusive?

Martín Soto · 3 Mar 2023 20:13 UTC
14 points
6 comments · 2 min read · LW link

Situational awareness in Large Language Models

Simon Möller · 3 Mar 2023 18:59 UTC
30 points
2 comments · 7 min read · LW link

AI Governance & Strategy: Priorities, talent gaps, & opportunities

Akash · 3 Mar 2023 18:09 UTC
56 points
2 comments · 4 min read · LW link

Measuring Ads Opt-Out Compliance

jefftk · 3 Mar 2023 16:00 UTC
18 points
2 comments · 2 min read · LW link
(www.jefftk.com)

ChatGPT tells stories, and a note about reverse engineering: A Working Paper

Bill Benzon · 3 Mar 2023 15:12 UTC
3 points
0 comments · 3 min read · LW link

Group Wiki Walk

Screwtape · 3 Mar 2023 15:10 UTC
9 points
0 comments · 3 min read · LW link

Robin Hanson’s latest AI risk position statement

Liron · 3 Mar 2023 14:25 UTC
55 points
18 comments · 1 min read · LW link
(www.overcomingbias.com)

A reply to Byrnes on the Free Energy Principle

Roman Leventov · 3 Mar 2023 13:03 UTC
28 points
16 comments · 14 min read · LW link

state of my alignment research, and what needs work

Tamsin Leake · 3 Mar 2023 10:28 UTC
51 points
0 comments · 2 min read · LW link
(carado.moe)

Sydney can play chess and kind of keep track of the board state

Erik Jenner · 3 Mar 2023 9:39 UTC
64 points
19 comments · 6 min read · LW link

[Fiction] The boy in the glass dome

Kaj_Sotala · 3 Mar 2023 7:50 UTC
28 points
0 comments · 2 min read · LW link
(kajsotala.fi)

The Waluigi Effect (mega-post)

Cleo Nardo · 3 Mar 2023 3:22 UTC
628 points
187 comments · 16 min read · LW link

Aspiring AI safety researchers should ~argmax over AGI timelines

Ryan Kidd · 3 Mar 2023 2:04 UTC
29 points
8 comments · 2 min read · LW link

ACX/SSC/LW meetup

Épiphanie Gédéon · 2 Mar 2023 23:37 UTC
8 points
0 comments · 1 min read · LW link

Results Prediction Thread About How Different Factors Affect AI X-Risk

MrThink · 2 Mar 2023 22:13 UTC
9 points
0 comments · 2 min read · LW link