Humans, chimpanzees and other animals

gjm · 30 May 2023 23:53 UTC
21 points
18 comments · 1 min read · LW link

The case for removing alignment and ML research from the training dataset

beren · 30 May 2023 20:54 UTC
48 points
8 comments · 5 min read · LW link

Why Job Displacement Predictions are Wrong: Explanations of Cognitive Automation

Moritz Wallawitsch · 30 May 2023 20:43 UTC
−4 points
0 comments · 8 min read · LW link

PaLM-2 & GPT-4 in “Extrapolating GPT-N performance”

Lukas Finnveden · 30 May 2023 18:33 UTC
55 points
6 comments · 6 min read · LW link

RoboNet—A new internet protocol for AI

antoniomax · 30 May 2023 17:55 UTC
−13 points
1 comment · 18 min read · LW link

Why I don’t think that the probability that AGI kills everyone is roughly 1 (but rather around 0.995).

Bastumannen · 30 May 2023 17:54 UTC
−6 points
0 comments · 2 min read · LW link

AI X-risk is a possible solution to the Fermi Paradox

magic9mushroom · 30 May 2023 17:42 UTC
11 points
20 comments · 2 min read · LW link

LIMA: Less Is More for Alignment

Ulisse Mini · 30 May 2023 17:10 UTC
16 points
6 comments · 1 min read · LW link
(arxiv.org)

Boomerang—protocol to dissolve some commitment races

Filip Sondej · 30 May 2023 16:21 UTC
37 points
10 comments · 8 min read · LW link

Announcing Apollo Research

30 May 2023 16:17 UTC
217 points
11 comments · 8 min read · LW link

Advice for new alignment people: Info Max

Jonas Hallgren · 30 May 2023 15:42 UTC
27 points
4 comments · 5 min read · LW link

[Question] Who is liable for AI?

jmh · 30 May 2023 13:54 UTC
14 points
4 comments · 1 min read · LW link

AI Safety Newsletter #8: Rogue AIs, how to screen for AI risks, and grants for research on democratic governance of AI

30 May 2023 11:52 UTC
20 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

The bullseye framework: My case against AI doom

titotal · 30 May 2023 11:52 UTC
89 points
35 comments · 1 min read · LW link

Statement on AI Extinction—Signed by AGI Labs, Top Academics, and Many Other Notable Figures

Dan H · 30 May 2023 9:05 UTC
372 points
77 comments · 1 min read · LW link
(www.safe.ai)

Theoretical Limitations of Autoregressive Models

Gabriel Wu · 30 May 2023 2:37 UTC
20 points
1 comment · 10 min read · LW link
(gabrieldwu.github.io)

A book review for “Animal Weapons” and cross-applying the lessons to x-risk

Habeeb Abdulfatah · 30 May 2023 0:58 UTC
−6 points
1 comment · 1 min read · LW link
(www.super-linear.org)

Without a trajectory change, the development of AGI is likely to go badly

Max H · 29 May 2023 23:42 UTC
16 points
2 comments · 13 min read · LW link

Winners-take-how-much?

YonatanK · 29 May 2023 21:56 UTC
3 points
2 comments · 3 min read · LW link

Reply to a fertility doctor concerning polygenic embryo screening

GeneSmith · 29 May 2023 21:50 UTC
58 points
6 comments · 8 min read · LW link

Sentience matters

So8res · 29 May 2023 21:25 UTC
143 points
96 comments · 2 min read · LW link

Wikipedia as an introduction to the alignment problem

SoerenMind · 29 May 2023 18:43 UTC
83 points
10 comments · 1 min read · LW link
(en.wikipedia.org)

[Question] What are some of the best introductions/breakdowns of AI existential risk for those unfamiliar?

Isaac King · 29 May 2023 17:04 UTC
17 points
2 comments · 1 min read · LW link

Creating Flashcards with LLMs

Diogo Cruz · 29 May 2023 16:55 UTC
14 points
3 comments · 9 min read · LW link

On the Impossibility of Intelligent Paperclip Maximizers

Michael Simkin · 29 May 2023 16:55 UTC
−21 points
5 comments · 4 min read · LW link

Minimum Viable Exterminator

Richard Horvath · 29 May 2023 16:32 UTC
14 points
5 comments · 5 min read · LW link

An LLM-based “exemplary actor”

Roman Leventov · 29 May 2023 11:12 UTC
16 points
0 comments · 12 min read · LW link

Aligning an H-JEPA agent via training on the outputs of an LLM-based “exemplary actor”

Roman Leventov · 29 May 2023 11:08 UTC
12 points
10 comments · 30 min read · LW link

Gemini will bring the next big timeline update

p.b. · 29 May 2023 6:05 UTC
50 points
6 comments · 1 min read · LW link

Proposed Alignment Technique: OSNR (Output Sanitization via Noising and Reconstruction) for Safer Usage of Potentially Misaligned AGI

sudo · 29 May 2023 1:35 UTC
14 points
9 comments · 6 min read · LW link

Morality is Accidental & Self-Congratulatory

ymeskhout · 29 May 2023 0:40 UTC
25 points
40 comments · 5 min read · LW link

TinyStories: Small Language Models That Still Speak Coherent English

Ulisse Mini · 28 May 2023 22:23 UTC
66 points
8 comments · 2 min read · LW link
(arxiv.org)

“Membranes” is better terminology than “boundaries” alone

28 May 2023 22:16 UTC
30 points
12 comments · 3 min read · LW link

The king token

p.b. · 28 May 2023 19:18 UTC
17 points
0 comments · 4 min read · LW link

Language Agents Reduce the Risk of Existential Catastrophe

28 May 2023 19:10 UTC
39 points
14 comments · 26 min read · LW link

Devil’s Advocate: Adverse Selection Against Conscientiousness

lionhearted (Sebastian Marshall) · 28 May 2023 17:53 UTC
10 points
2 comments · 1 min read · LW link

Reacts now enabled on 100% of posts, though still just experimenting

Ruby · 28 May 2023 5:36 UTC
88 points
73 comments · 2 min read · LW link

My AI Alignment Research Agenda and Threat Model, right now (May 2023)

Nicholas / Heather Kross · 28 May 2023 3:23 UTC
25 points
0 comments · 6 min read · LW link
(www.thinkingmuchbetter.com)

Kelly betting vs expectation maximization

MorgneticField · 28 May 2023 1:54 UTC
35 points
33 comments · 5 min read · LW link

Why and When Interpretability Work is Dangerous

Nicholas / Heather Kross · 28 May 2023 0:27 UTC
20 points
9 comments · 8 min read · LW link
(www.thinkingmuchbetter.com)

Twin Cities ACX Meetup—June 2023

Timothy M. · 27 May 2023 20:11 UTC
1 point
1 comment · 1 min read · LW link

Project Idea: Challenge Groups for Alignment Researchers

Adam Zerner · 27 May 2023 20:10 UTC
13 points
0 comments · 1 min read · LW link

Introspective Bayes

False Name · 27 May 2023 19:35 UTC
−3 points
2 comments · 16 min read · LW link

Should Rational Animations invite viewers to read content on LessWrong?

Writer · 27 May 2023 19:26 UTC
40 points
9 comments · 3 min read · LW link

Who are the Experts on Cryonics?

Mati_Roy · 27 May 2023 19:24 UTC
30 points
9 comments · 1 min read · LW link
(biostasis.substack.com)

AI and Planet Earth are incompatible.

archeon · 27 May 2023 18:59 UTC
−4 points
2 comments · 1 min read · LW link

South Bay ACX/LW Meetup

IS · 27 May 2023 17:25 UTC
2 points
0 comments · 1 min read · LW link

Hands-On Experience Is Not Magic

Thane Ruthenis · 27 May 2023 16:57 UTC
21 points
14 comments · 5 min read · LW link

Is Deontological AI Safe? [Feedback Draft]

27 May 2023 16:39 UTC
19 points
15 comments · 20 min read · LW link

San Francisco ACX Meetup “First Saturday” June 3, 1 pm

guenael · 27 May 2023 13:58 UTC
1 point
0 comments · 1 min read · LW link