[Question] When to mention irrelevant accusations?

philh · 14 Jan 2023 21:58 UTC
20 points
50 comments · 1 min read · LW link

World-Model Interpretability Is All We Need

Thane Ruthenis · 14 Jan 2023 19:37 UTC
35 points
22 comments · 21 min read · LW link

Current AI Models Seem Sufficient for Low-Risk, Beneficial AI

harsimony · 14 Jan 2023 18:55 UTC
17 points
1 comment · 2 min read · LW link

[Question] Basic Question about LLMs: how do they know what task to perform

Garak · 14 Jan 2023 13:13 UTC
1 point
3 comments · 1 min read · LW link

Aligned with what?

Program Den · 14 Jan 2023 10:28 UTC
3 points
41 comments · 1 min read · LW link

Wokism, rethinking priorities and the Bostrom case

Arturo Macias · 14 Jan 2023 2:27 UTC
−31 points
2 comments · 4 min read · LW link

A general comment on discussions of genetic group differences

anonymous8101 · 14 Jan 2023 2:11 UTC
70 points
46 comments · 3 min read · LW link

Abstractions as morphisms between (co)algebras

Erik Jenner · 14 Jan 2023 1:51 UTC
17 points
1 comment · 8 min read · LW link

Concrete Reasons for Hope about AI

Zac Hatfield-Dodds · 14 Jan 2023 1:22 UTC
100 points
13 comments · 1 min read · LW link

Negative Expertise

Jonas Kgomo · 14 Jan 2023 0:51 UTC
4 points
0 comments · 1 min read · LW link
(twitter.com)

Mid-Atlantic AI Alignment Alliance Unconference

Quinn · 13 Jan 2023 20:33 UTC
7 points
2 comments · 1 min read · LW link

Smallpox vaccines are widely available, for now

David Hornbein · 13 Jan 2023 20:02 UTC
26 points
5 comments · 1 min read · LW link

How does GPT-3 spend its 175B parameters?

Robert_AIZI · 13 Jan 2023 19:21 UTC
41 points
14 comments · 6 min read · LW link
(aizi.substack.com)

[ASoT] Simulators show us behavioural properties by default

Jozdien · 13 Jan 2023 18:42 UTC
35 points
3 comments · 3 min read · LW link

Wheel of Consent Theory for Rationalists and Effective Altruists

adamwilder · 13 Jan 2023 17:59 UTC
1 point
0 comments · 2 min read · LW link

Money is a way of thanking strangers

DirectedEvolution · 13 Jan 2023 17:06 UTC
13 points
5 comments · 4 min read · LW link

Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind

DragonGod · 13 Jan 2023 16:53 UTC
62 points
12 comments · 1 min read · LW link
(arxiv.org)

How we could stumble into AI catastrophe

HoldenKarnofsky · 13 Jan 2023 16:20 UTC
71 points
18 comments · 18 min read · LW link
(www.cold-takes.com)

Robustness & Evolution [MLAISU W02]

Esben Kran · 13 Jan 2023 15:47 UTC
10 points
0 comments · 3 min read · LW link
(newsletter.apartresearch.com)

On Cooking With Gas

Zvi · 13 Jan 2023 14:20 UTC
38 points
60 comments · 6 min read · LW link
(thezvi.wordpress.com)

Beware safety-washing

Lizka · 13 Jan 2023 13:59 UTC
43 points
2 comments · 4 min read · LW link

Some Arguments Against Strong Scaling

Joar Skalse · 13 Jan 2023 12:04 UTC
26 points
21 comments · 16 min read · LW link

[Question] Where do you find people who actually do things?

Ulisse Mini · 13 Jan 2023 6:57 UTC
7 points
12 comments · 1 min read · LW link

[Question] Could Simulating an AGI Taking Over the World Actually Lead to a LLM Taking Over the World?

simeon_c · 13 Jan 2023 6:33 UTC
15 points
1 comment · 1 min read · LW link

Burning Uptime: When your Sandbox of Empathy is Leaky and also an Hourglass

Cedar · 13 Jan 2023 5:18 UTC
12 points
2 comments · 3 min read · LW link

Disentangling Shard Theory into Atomic Claims

Leon Lang · 13 Jan 2023 4:23 UTC
86 points
6 comments · 18 min read · LW link

AGISF adaptation for in-person groups

13 Jan 2023 3:24 UTC
44 points
2 comments · 3 min read · LW link

Actions and Flows

Alok Singh · 13 Jan 2023 3:20 UTC
5 points
0 comments · 1 min read · LW link
(alok.github.io)

A Thorough Introduction to Abstraction

RohanS · 13 Jan 2023 0:30 UTC
9 points
1 comment · 18 min read · LW link

The AI Control Problem in a wider intellectual context

philosophybear · 13 Jan 2023 0:28 UTC
11 points
3 comments · 12 min read · LW link

The Alignment Problems

Martín Soto · 12 Jan 2023 22:29 UTC
20 points
0 comments · 4 min read · LW link

Proposal for Inducing Steganography in LMs

Logan Riggs · 12 Jan 2023 22:15 UTC
22 points
3 comments · 2 min read · LW link

Announcing the 2023 PIBBSS Summer Research Fellowship

12 Jan 2023 21:31 UTC
32 points
0 comments · 1 min read · LW link

Victoria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi · 12 Jan 2023 17:09 UTC
40 points
3 comments · 4 min read · LW link
(www.theinsideview.ai)

[Question] What is a disagreement you have around AI safety?

tailcalled · 12 Jan 2023 16:58 UTC
16 points
7 comments · 1 min read · LW link

Reward is not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning

Roman Leventov · 12 Jan 2023 16:43 UTC
17 points
2 comments · 2 min read · LW link
(arxiv.org)

ChatGPT struggles to respond to the real world

Alex Flint · 12 Jan 2023 16:02 UTC
31 points
9 comments · 24 min read · LW link

Covid 1/12/23: Unexpected Spike in Deaths

Zvi · 12 Jan 2023 14:30 UTC
31 points
2 comments · 8 min read · LW link
(thezvi.wordpress.com)

[Linkpost] Scaling Laws for Generative Mixed-Modal Language Models

Amal · 12 Jan 2023 14:24 UTC
15 points
2 comments · 1 min read · LW link
(arxiv.org)

ea.domains — Domains Free to a Good Home

plex · 12 Jan 2023 13:32 UTC
24 points
0 comments · 1 min read · LW link

VIRTUA: a novel about AI alignment

Karl von Wendt · 12 Jan 2023 9:37 UTC
46 points
12 comments · 1 min read · LW link

Iron deficiencies are very bad and you should treat them

Elizabeth · 12 Jan 2023 9:10 UTC
108 points
34 comments · 11 min read · LW link · 1 review
(acesounderglass.com)

Nonstandard analysis in ethics

Alok Singh · 12 Jan 2023 5:58 UTC
−1 points
0 comments · 78 min read · LW link
(nickbostrom.com)

Example of the nameless rationalist virtue

Alok Singh · 12 Jan 2023 5:45 UTC
−9 points
2 comments · 1 min read · LW link

FFMI Gains: A List of Vitalities

porby · 12 Jan 2023 4:48 UTC
26 points
3 comments · 7 min read · LW link

[Linkpost] DreamerV3: A General RL Architecture

simeon_c · 12 Jan 2023 3:55 UTC
23 points
3 comments · 1 min read · LW link
(arxiv.org)

Microsoft Plans to Invest $10B in OpenAI; $3B Invested to Date | Fortune

DragonGod · 12 Jan 2023 3:55 UTC
23 points
10 comments · 2 min read · LW link
(fortune.com)

Progress and research disruptiveness

Eleni Angelou · 12 Jan 2023 3:51 UTC
3 points
2 comments · 1 min read · LW link
(www.nature.com)

The Fable of the AI Coomer: Why the Social Prowess of Machines is AI’s Most Proximal Threat

Ace Delgado · 12 Jan 2023 1:15 UTC
−10 points
4 comments · 4 min read · LW link

Write to Think

Michael Samoilov · 12 Jan 2023 0:33 UTC
10 points
2 comments · 2 min read · LW link