[Question] What should we tell an AI if it asks why it was created?

cSkeleton · 10 Apr 2024 20:37 UTC
1 point
1 comment · 1 min read · LW link

RTFB: On the New Proposed CAIP AI Bill

Zvi · 10 Apr 2024 18:30 UTC
119 points
14 comments · 34 min read · LW link
(thezvi.wordpress.com)

(Rational) Decision-Making In Wartime

Danylo Zhyrko · 10 Apr 2024 18:08 UTC
15 points
2 comments · 5 min read · LW link

Thinking harder doesn’t work

Jakob Greenfeld · 10 Apr 2024 18:00 UTC
−13 points
7 comments · 6 min read · LW link
(jakobgreenfeld.com)

Scaling Laws and Superposition

Pavan Katta · 10 Apr 2024 15:36 UTC
9 points
4 comments · 5 min read · LW link
(www.pavankatta.com)

Responsible Advanced Artificial Intelligence Act

Anon42 · 10 Apr 2024 14:35 UTC
4 points
0 comments · 1 min read · LW link
(assets.caip.org)

Apply to the Pivotal Research Fellowship (AI Safety & Biosecurity)

10 Apr 2024 12:08 UTC
18 points
0 comments · 1 min read · LW link

Is Consciousness Simulated?

Daniele De Nuntiis · 10 Apr 2024 9:02 UTC
−1 points
2 comments · 5 min read · LW link

AI DOESN’T NEED TO KILL HUMANITY TO EXIST. IT WILL JUST SEE US IMPLODE. OR NOT. [2024]

X O · 10 Apr 2024 8:52 UTC
−37 points
0 comments · 27 min read · LW link

How I select alignment research projects

10 Apr 2024 4:33 UTC
35 points
4 comments · 24 min read · LW link

[Question] How to accelerate recovery from sleep debt with biohacking?

exanova · 10 Apr 2024 1:27 UTC
10 points
2 comments · 1 min read · LW link

[Question] What are some posthumanist/more-than-human approaches to definitions of intelligence and agency? Particularly in application to AI research.

Eli Hiton · 9 Apr 2024 21:52 UTC
1 point
0 comments · 1 min read · LW link

Ophiology (or, how the Mamba architecture works)

9 Apr 2024 19:31 UTC
67 points
8 comments · 10 min read · LW link

Apply to LASR Labs: a London-based technical AI safety research programme

9 Apr 2024 17:34 UTC
45 points
1 comment · 3 min read · LW link

“Decentralized Autonomous Education”—Call for Reviewers (Seeds of Science)

rogersbacon · 9 Apr 2024 14:39 UTC
6 points
0 comments · 1 min read · LW link

D&D.Sci: The Mad Tyrant’s Pet Turtles [Evaluation and Ruleset]

abstractapplic · 9 Apr 2024 14:01 UTC
48 points
6 comments · 3 min read · LW link

Medical Roundup #2

Zvi · 9 Apr 2024 13:40 UTC
37 points
18 comments · 16 min read · LW link
(thezvi.wordpress.com)

[Closed] PIBBSS is hiring in a variety of roles (alignment research and incubation program)

9 Apr 2024 8:12 UTC
54 points
0 comments · 3 min read · LW link

Any evidence or reason to expect a multiverse / Everett branches?

lemonhope · 9 Apr 2024 5:26 UTC
9 points
125 comments · 1 min read · LW link

Fermenting Form

koratkar · 9 Apr 2024 2:46 UTC
19 points
2 comments · 4 min read · LW link
(careerscouting.substack.com)

[Question] Non-ultimatum game problem

numpyNaN · 8 Apr 2024 23:25 UTC
9 points
4 comments · 2 min read · LW link

Pandemic Identification Simulator

jefftk · 8 Apr 2024 19:00 UTC
22 points
0 comments · 1 min read · LW link
(www.jefftk.com)

How We Picture Bayesian Agents

8 Apr 2024 18:12 UTC
69 points
14 comments · 7 min read · LW link

CEA seeks co-founder for AI safety group support spin-off

agucova · 8 Apr 2024 15:42 UTC
18 points
0 comments · 1 min read · LW link

Investigating the role of agency in AI x-risk

Corin Katzke · 8 Apr 2024 15:12 UTC
10 points
0 comments · 1 min read · LW link
(www.convergenceanalysis.org)

Measuring Learned Optimization in Small Transformer Models

J Bostock · 8 Apr 2024 14:41 UTC
22 points
0 comments · 11 min read · LW link

[Question] Can singularity emerge from transformers?

MP · 8 Apr 2024 14:26 UTC
−3 points
1 comment · 1 min read · LW link

Gated Attention Blocks: Preliminary Progress toward Removing Attention Head Superposition

8 Apr 2024 11:14 UTC
37 points
4 comments · 15 min read · LW link

Math-to-English Cheat Sheet

nahoj · 8 Apr 2024 9:19 UTC
54 points
5 comments · 6 min read · LW link

[Question] What does it take to transfer the knowledge to action?

EL_File4138 · 8 Apr 2024 6:23 UTC
3 points
7 comments · 1 min read · LW link

Normalizing Sparse Autoencoders

Fengyuan Hu · 8 Apr 2024 6:17 UTC
21 points
18 comments · 13 min read · LW link

A Dozen Ways to Get More Dakka

Davidmanheim · 8 Apr 2024 4:45 UTC
132 points
11 comments · 3 min read · LW link

[Crosspost] Introducing the Hypermanifest: Redefining AI’s Role in Human Connection and Interaction

Suzie. EXE · 7 Apr 2024 17:21 UTC
4 points
0 comments · 5 min read · LW link

Applications Open: Elevate Your Mental Wellbeing with Rethink Wellbeing’s CBT Program

Inga G. · 7 Apr 2024 14:03 UTC
13 points
2 comments · 1 min read · LW link

The Poker Theory of Poker Night

omark · 7 Apr 2024 9:47 UTC
29 points
13 comments · 9 min read · LW link
(www.codeandbugs.com)

Centrists are (probably) less biased

Kevin Dorst · 7 Apr 2024 6:40 UTC
1 point
2 comments · 5 min read · LW link
(kevindorst.substack.com)

on the dollar-yen exchange rate

bhauth · 7 Apr 2024 4:49 UTC
50 points
21 comments · 10 min read · LW link
(www.bhauth.com)

Conflict in Posthuman Literature

Martín Soto · 6 Apr 2024 22:26 UTC
40 points
1 comment · 2 min read · LW link
(twitter.com)

“Fractal Strategy” workshop report

Raemon · 6 Apr 2024 21:26 UTC
67 points
22 comments · 10 min read · LW link

The 2nd Demographic Transition

Maxwell Tabarrok · 6 Apr 2024 14:10 UTC
68 points
17 comments · 4 min read · LW link
(www.maximum-progress.com)

My intellectual journey to (dis)solve the hard problem of consciousness

Charbel-Raphaël · 6 Apr 2024 9:32 UTC
44 points
42 comments · 30 min read · LW link

Measuring Predictability of Persona Evaluations

6 Apr 2024 8:46 UTC
20 points
0 comments · 7 min read · LW link

Privacy and writing

Neil · 6 Apr 2024 8:20 UTC
20 points
1 comment · 5 min read · LW link

[Question] How does the ever-increasing use of AI in the military for the direct purpose of murdering people affect your p(doom)?

Justausername · 6 Apr 2024 6:31 UTC
19 points
16 comments · 1 min read · LW link

Two tools for rethinking existential risk

Arepo · 6 Apr 2024 2:55 UTC
2 points
0 comments · 25 min read · LW link

Exploring Whole Brain Emulation

PeterMcCluskey · 6 Apr 2024 2:38 UTC
13 points
1 comment · 2 min read · LW link
(bayesianinvestor.com)

Koan: divining alien datastructures from RAM activations

TsviBT · 5 Apr 2024 18:04 UTC
43 points
10 comments · 21 min read · LW link

On the 2nd CWT with Jonathan Haidt

Zvi · 5 Apr 2024 17:30 UTC
27 points
3 comments · 33 min read · LW link
(thezvi.wordpress.com)

End-to-end hacking with language models

tchauvin · 5 Apr 2024 15:06 UTC
29 points
0 comments · 8 min read · LW link

Partial value takeover without world takeover

KatjaGrace · 5 Apr 2024 6:20 UTC
89 points
23 comments · 3 min read · LW link
(worldspiritsockpuppet.com)