Curiosity as a Solution to AGI Alignment

Harsha G. · 26 Feb 2023 23:36 UTC
7 points
7 comments · 3 min read · LW link

Learning How to Learn (And 20+ Studies)

maxa · 26 Feb 2023 22:46 UTC
62 points
12 comments · 6 min read · LW link
(max2c.com)

Bayesian Scenario: Snipers & Soldiers

abstractapplic · 26 Feb 2023 21:48 UTC
23 points
8 comments · 1 min read · LW link
(h-b-p.github.io)

NYT: Lab Leak Most Likely Caused Pandemic, Energy Dept. Says

trevor · 26 Feb 2023 21:21 UTC
17 points
9 comments · 4 min read · LW link
(www.nytimes.com)

[Link Post] Cyber Digital Authoritarianism (National Intelligence Council Report)

Phosphorous · 26 Feb 2023 20:51 UTC
12 points
2 comments · 1 min read · LW link
(www.dni.gov)

Reflections on Zen and the Art of Motorcycle Maintenance

LoganStrohl · 26 Feb 2023 20:46 UTC
33 points
3 comments · 23 min read · LW link

Taboo “human-level intelligence”

Sherrinford · 26 Feb 2023 20:42 UTC
12 points
7 comments · 1 min read · LW link

[Link] Petition on brain preservation: Allow global access to high-quality brain preservation as an option rapidly after death

Mati_Roy · 26 Feb 2023 15:56 UTC
29 points
2 comments · 1 min read · LW link
(www.change.org)

Some thoughts on the cults LW had

Noosphere89 · 26 Feb 2023 15:46 UTC
−5 points
28 comments · 1 min read · LW link

A library for safety research in conditioning on RLHF tasks

James Chua · 26 Feb 2023 14:50 UTC
10 points
2 comments · 1 min read · LW link

The Preference Fulfillment Hypothesis

Kaj_Sotala · 26 Feb 2023 10:55 UTC
66 points
62 comments · 11 min read · LW link

All of my grandparents were prodigies, I am extremely bored at Oxford University. Please let me intern/work for you!

politicalpersuasion · 26 Feb 2023 7:50 UTC
−17 points
7 comments · 3 min read · LW link

“Rationalist Discourse” Is Like “Physicist Motors”

Zack_M_Davis · 26 Feb 2023 5:58 UTC
136 points
153 comments · 9 min read · LW link · 1 review

[Question] Ways to prepare to a vastly new world?

Annapurna · 26 Feb 2023 4:56 UTC
12 points
6 comments · 1 min read · LW link

Incentives and Selection: A Missing Frame From AI Threat Discussions?

DragonGod · 26 Feb 2023 1:18 UTC
11 points
16 comments · 2 min read · LW link

A mechanistic explanation for SolidGoldMagikarp-like tokens in GPT2

MadHatter · 26 Feb 2023 1:10 UTC
61 points
14 comments · 6 min read · LW link

Politics is the Fun-Killer

Adam Zerner · 25 Feb 2023 23:29 UTC
28 points
5 comments · 2 min read · LW link

Bayes is Out-Dated, and You’re Doing it Wrong

AnthonyRepetto · 25 Feb 2023 23:18 UTC
−45 points
44 comments · 4 min read · LW link

[Question] Would more model evals teams be good?

Ryan Kidd · 25 Feb 2023 22:01 UTC
20 points
4 comments · 1 min read · LW link

Nod posts

Adam Zerner · 25 Feb 2023 21:53 UTC
26 points
8 comments · 2 min read · LW link

Prediction market: Will John Wentworth’s Gears of Aging series hold up in 2033?

tailcalled · 25 Feb 2023 20:15 UTC
15 points
4 comments · 1 min read · LW link
(manifold.markets)

Making Implied Standards Explicit

Logan Riggs · 25 Feb 2023 20:02 UTC
22 points
0 comments · 4 min read · LW link

Two Reasons for no Utilitarianism

False Name · 25 Feb 2023 19:51 UTC
−4 points
3 comments · 3 min read · LW link

Cognitive Emulation: A Naive AI Safety Proposal

25 Feb 2023 19:35 UTC
195 points
46 comments · 4 min read · LW link

[Prediction] Humanity will survive the next hundred years

lsusr · 25 Feb 2023 18:59 UTC
33 points
44 comments · 2 min read · LW link

The Caplan-Yudkowsky End-of-the-World Bet Scheme Doesn’t Actually Work

lsusr · 25 Feb 2023 18:57 UTC
6 points
14 comments · 2 min read · LW link

The Practitioner’s Path 2.0: the Empiricist Archetype

Evenflair · 25 Feb 2023 17:05 UTC
15 points
0 comments · 1 min read · LW link
(guildoftherose.org)

[Question] Pink Shoggoths: What does alignment look like in practice?

Yuli_Ban · 25 Feb 2023 12:23 UTC
25 points
13 comments · 11 min read · LW link

Just How Hard a Problem is Alignment?

Roger Dearnaley · 25 Feb 2023 9:00 UTC
1 point
1 comment · 21 min read · LW link

Buddhist Psychotechnology for Withstanding Apocalypse Stress

romeostevensit · 25 Feb 2023 3:11 UTC
59 points
10 comments · 5 min read · LW link

How to Read Papers Efficiently: Fast-then-Slow Three pass method

25 Feb 2023 2:56 UTC
36 points
4 comments · 4 min read · LW link
(ccr.sigcomm.org)

What kind of place is this?

Jim Pivarski · 25 Feb 2023 2:14 UTC
24 points
24 comments · 8 min read · LW link

Agents vs. Predictors: Concrete differentiating factors

evhub · 24 Feb 2023 23:50 UTC
37 points
3 comments · 4 min read · LW link

Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes

24 Feb 2023 23:03 UTC
61 points
7 comments · 47 min read · LW link

Retrospective on the 2022 Conjecture AI Discussions

Andrea_Miotti · 24 Feb 2023 22:41 UTC
90 points
5 comments · 2 min read · LW link

How popular is ChatGPT? Part 1: more popular than Taylor Swift

Harlan · 24 Feb 2023 22:30 UTC
56 points
0 comments · 2 min read · LW link
(aiimpacts.org)

Are you stably aligned?

Seth Herd · 24 Feb 2023 22:08 UTC
13 points
0 comments · 2 min read · LW link

Puzzle Cycles

Screwtape · 24 Feb 2023 21:35 UTC
8 points
2 comments · 4 min read · LW link

Sam Altman: “Planning for AGI and beyond”

LawrenceC · 24 Feb 2023 20:28 UTC
104 points
54 comments · 6 min read · LW link
(openai.com)

A Proposed Test to Determine the Extent to Which Large Language Models Understand the Real World

Bruce G · 24 Feb 2023 20:20 UTC
4 points
7 comments · 8 min read · LW link

Meta “open sources” LMs competitive with Chinchilla, PaLM, and code-davinci-002 (Paper)

LawrenceC · 24 Feb 2023 19:57 UTC
38 points
19 comments · 1 min read · LW link
(research.facebook.com)

Relationship Orientations

DaystarEld · 24 Feb 2023 19:43 UTC
37 points
1 comment · 3 min read · LW link
(daystareld.com)

The alien simulation meme doesn’t make sense

FTPickle · 24 Feb 2023 19:27 UTC
4 points
1 comment · 1 min read · LW link

Exit Duty Generator by Matti Häyry

Oldphan · 24 Feb 2023 18:35 UTC
−2 points
0 comments · 1 min read · LW link
(www.cambridge.org)

2023 Stanford Existential Risks Conference

elizabethcooper · 24 Feb 2023 18:35 UTC
7 points
0 comments · 1 min read · LW link

How major governments can help with the most important century

HoldenKarnofsky · 24 Feb 2023 18:20 UTC
29 points
0 comments · 4 min read · LW link
(www.cold-takes.com)

Consent Isn’t Always Enough

jefftk · 24 Feb 2023 15:40 UTC
57 points
16 comments · 3 min read · LW link
(www.jefftk.com)

[Question] Training for corrigibility: obvious problems?

Ben Amitay · 24 Feb 2023 14:02 UTC
4 points
6 comments · 1 min read · LW link

Death and Desperation

Ustice · 24 Feb 2023 12:43 UTC
1 point
3 comments · 1 min read · LW link

[Question] Are there rationality techniques similar to staring at the wall for 4 hours?

trevor · 24 Feb 2023 11:48 UTC
31 points
8 comments · 1 min read · LW link