Some Rules for an Algebra of Bayes Nets

16 Nov 2023 23:53 UTC
77 points
38 comments · 14 min read · LW link · 1 review

How much to update on recent AI governance moves?

16 Nov 2023 23:46 UTC
112 points
5 comments · 29 min read · LW link

New LessWrong feature: Dialogue Matching

jacobjacob · 16 Nov 2023 21:27 UTC
106 points
22 comments · 3 min read · LW link

Towards Evaluating AI Systems for Moral Status Using Self-Reports

16 Nov 2023 20:18 UTC
45 points
3 comments · 1 min read · LW link
(arxiv.org)

Social Dark Matter

Duncan Sabien (Deactivated) · 16 Nov 2023 20:00 UTC
326 points
116 comments · 34 min read · LW link · 1 review

AI #38: Let’s Make a Deal

Zvi · 16 Nov 2023 19:50 UTC
44 points
2 comments · 55 min read · LW link
(thezvi.wordpress.com)

Forecasting AI (Overview)

jsteinhardt · 16 Nov 2023 19:00 UTC
35 points
0 comments · 2 min read · LW link
(bounded-regret.ghost.io)

We Should Talk About This More. Epistemic World Collapse as Imminent Safety Risk of Generative AI.

Joerg Weiss · 16 Nov 2023 18:46 UTC
11 points
2 comments · 29 min read · LW link

Intelligence in systems (human, AI) can be conceptualized as the resolution and throughput at which a system can process and affect Shannon information.

AiresJL · 16 Nov 2023 17:46 UTC
0 points
0 comments · 2 min read · LW link

Life on the Grid (Part 2)

rogersbacon · 16 Nov 2023 17:22 UTC
7 points
0 comments · 15 min read · LW link
(www.secretorum.life)

The impossibility of rationally analyzing partisan news

RationalDino · 16 Nov 2023 16:19 UTC
4 points
4 comments · 1 min read · LW link

We are Peacecraft.ai!

MadHatter · 16 Nov 2023 14:15 UTC
15 points
20 comments · 2 min read · LW link

A dialectical view of the history of AI, Part 1: We’re only in the antithesis phase. [A synthesis is in the future.]

Bill Benzon · 16 Nov 2023 12:34 UTC
6 points
0 comments · 12 min read · LW link

[Question] How much fraud is there in academia?

ChristianKl · 16 Nov 2023 11:50 UTC
23 points
10 comments · 1 min read · LW link

Learning coefficient estimation: the details

Zach Furman · 16 Nov 2023 3:19 UTC
36 points
0 comments · 2 min read · LW link
(colab.research.google.com)

[Question] AI Safety orgs- what’s your biggest bottleneck right now?

Kabir Kumar · 16 Nov 2023 2:02 UTC
1 point
0 comments · 1 min read · LW link

My critique of Eliezer’s deeply irrational beliefs

Jorterder · 16 Nov 2023 0:34 UTC
−33 points
1 comment · 9 min read · LW link
(docs.google.com)

Extrapolating from Five Words

Gordon Seidoh Worley · 15 Nov 2023 23:21 UTC
40 points
11 comments · 2 min read · LW link

In Defense of Parselmouths

Screwtape · 15 Nov 2023 23:02 UTC
48 points
10 comments · 10 min read · LW link

Life on the Grid (Part 1)

rogersbacon · 15 Nov 2023 22:37 UTC
12 points
4 comments · 9 min read · LW link
(www.secretorum.life)

Glomarization FAQ

Zane · 15 Nov 2023 20:20 UTC
30 points
5 comments · 5 min read · LW link

Testbed evals: evaluating AI safety even when it can’t be directly measured

joshc · 15 Nov 2023 19:00 UTC
71 points
2 comments · 4 min read · LW link

EA/ACX/LW November Santa Cruz Meetup

madmail · 15 Nov 2023 18:39 UTC
1 point
0 comments · 1 min read · LW link

New report: “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Joe Carlsmith · 15 Nov 2023 17:16 UTC
80 points
26 comments · 30 min read · LW link

Large Language Models can Strategically Deceive their Users when Put Under Pressure.

ReaderM · 15 Nov 2023 16:36 UTC
89 points
9 comments · 2 min read · LW link · 1 review
(arxiv.org)

AISN #26: National Institutions for AI Safety, Results From the UK Summit, and New Releases From OpenAI and xAI

15 Nov 2023 16:07 UTC
13 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

‘Theories of Values’ and ‘Theories of Agents’: confusions, musings and desiderata

15 Nov 2023 16:00 UTC
35 points
8 comments · 24 min read · LW link

Experiences and learnings from both sides of the AI safety job market

Marius Hobbhahn · 15 Nov 2023 15:40 UTC
110 points
4 comments · 18 min read · LW link

Good businesses create epistemic monopolies

Logan Kieller · 15 Nov 2023 14:04 UTC
−2 points
2 comments · 4 min read · LW link
(logankieller.substack.com)

A conceptual precursor to today’s language machines [Shannon]

Bill Benzon · 15 Nov 2023 13:50 UTC
24 points
6 comments · 2 min read · LW link

[Question] Should Advanced Placement High School classes discuss Israel-Palestine? If so, how? If not, why? Who should make this decision?

Gesild Muka · 15 Nov 2023 4:50 UTC
−1 points
5 comments · 1 min read · LW link

Reinforcement Via Giving People Cookies

Screwtape · 15 Nov 2023 4:34 UTC
66 points
9 comments · 6 min read · LW link

Incidental polysemanticity

15 Nov 2023 4:00 UTC
43 points
7 comments · 11 min read · LW link

LLMs May Find It Hard to FOOM

RogerDearnaley · 15 Nov 2023 2:52 UTC
11 points
30 comments · 12 min read · LW link

Linearity Fallacies

hippo · 15 Nov 2023 2:23 UTC
15 points
0 comments · 5 min read · LW link

SIA Is Just Being a Bayesian About the Fact That One Exists

omnizoid · 14 Nov 2023 22:55 UTC
3 points
5 comments · 4 min read · LW link

AI Alignment [progress] this Week (11/12/2023)

Logan Zoellner · 14 Nov 2023 22:21 UTC
6 points
0 comments · 2 min read · LW link
(midwitalignment.substack.com)

[Question] When did Eliezer Yudkowsky change his mind about neural networks?

[deactivated] · 14 Nov 2023 21:24 UTC
31 points
15 comments · 1 min read · LW link

Betting on what is un-falsifiable and un-verifiable

Abhimanyu Pallavi Sudhir · 14 Nov 2023 21:11 UTC
13 points
0 comments · 15 min read · LW link

Facebook is Paying Me to Post

jefftk · 14 Nov 2023 19:10 UTC
26 points
5 comments · 1 min read · LW link
(www.jefftk.com)

Feelings, Nothing More than Feelings, About AI

PaulBecon · 14 Nov 2023 18:50 UTC
7 points
0 comments · 3 min read · LW link

Kids or No kids

Kids or no kids · 14 Nov 2023 18:37 UTC
95 points
10 comments · 13 min read · LW link

Raemon’s Deliberate (“Purposeful?”) Practice Club

14 Nov 2023 18:24 UTC
60 points
11 comments · 22 min read · LW link

More metal less ore

Logan Kieller · 14 Nov 2023 16:59 UTC
6 points
3 comments · 2 min read · LW link
(logankieller.substack.com)

Monthly Roundup #12: November 2023

Zvi · 14 Nov 2023 15:20 UTC
34 points
5 comments · 33 min read · LW link
(thezvi.wordpress.com)

Do you want a first-principled preparedness guide to prepare yourself and loved ones for potential catastrophes?

Ulrik Horn · 14 Nov 2023 12:13 UTC
16 points
5 comments · 15 min read · LW link

[Question] Is there Work on Embedded Agency in Cellular Automata Toy Models?

Johannes C. Mayer · 14 Nov 2023 9:08 UTC
10 points
0 comments · 1 min read · LW link

[Question] Would this be Progress in Solving Embedded Agency?

Johannes C. Mayer · 14 Nov 2023 9:08 UTC
9 points
2 comments · 2 min read · LW link

Is Interpretability All We Need?

RogerDearnaley · 14 Nov 2023 5:31 UTC
1 point
1 comment · 1 min read · LW link

What is wisdom?

TsviBT · 14 Nov 2023 2:13 UTC
37 points
3 comments · 13 min read · LW link