The Mask Comes Off: At What Price?

Zvi · 21 Oct 2024 23:50 UTC
71 points
16 comments · 8 min read · LW link
(thezvi.wordpress.com)

Distinguishing ways AI can be “concentrated”

Matthew Barnett · 21 Oct 2024 22:21 UTC
28 points
2 comments · 1 min read · LW link

Jailbreaking ChatGPT and Claude using Web API Context Injection

Jaehyuk Lim · 21 Oct 2024 21:34 UTC
4 points
0 comments · 3 min read · LW link

How to Teach Your Brain to Hate Procrastination

10xyz · 21 Oct 2024 20:12 UTC
3 points
0 comments · 2 min read · LW link

Pausing for what?

MountainPath · 21 Oct 2024 20:12 UTC
0 points
1 comment · 1 min read · LW link

What is autonomy? Why boundaries are necessary.

Chipmonk · 21 Oct 2024 17:56 UTC
8 points
1 comment · 1 min read · LW link
(chrislakin.blog)

Could randomly choosing people to serve as representatives lead to better government?

John Huang · 21 Oct 2024 17:10 UTC
75 points
13 comments · 10 min read · LW link

There aren’t enough smart people in biology doing something boring

Abhishaike Mahajan · 21 Oct 2024 15:52 UTC
27 points
13 comments · 10 min read · LW link

Automation collapse

21 Oct 2024 14:50 UTC
70 points
9 comments · 7 min read · LW link

What AI companies should do: Some rough ideas

Zach Stein-Perlman · 21 Oct 2024 14:00 UTC
33 points
10 comments · 5 min read · LW link

[Question] What should OpenAI do that it hasn’t already done, to stop their vacancies from being advertised on the 80k Job Board?

WitheringWeights · 21 Oct 2024 13:57 UTC
21 points
0 comments · 1 min read · LW link

A Rocket–Interpretability Analogy

plex · 21 Oct 2024 13:55 UTC
149 points
31 comments · 1 min read · LW link

Tokyo AI Safety 2025: Call For Papers

Blaine · 21 Oct 2024 8:43 UTC
24 points
0 comments · 3 min read · LW link
(www.tais2025.cc)

OpenAI defected, but we can take honest actions

Remmelt · 21 Oct 2024 8:41 UTC
17 points
16 comments · 1 min read · LW link

Slightly More Than You Wanted To Know: Pregnancy Length Effects

JustisMills · 21 Oct 2024 1:26 UTC
62 points
4 comments · 5 min read · LW link
(justismills.substack.com)

Information vs Assurance

johnswentworth · 20 Oct 2024 23:16 UTC
185 points
17 comments · 2 min read · LW link

Liquid vs Illiquid Careers

vaishnav92 · 20 Oct 2024 23:03 UTC
33 points
7 comments · 7 min read · LW link
(vaishnavsunil.substack.com)

AI Can be “Gradient Aware” Without Doing Gradient hacking.

Sodium · 20 Oct 2024 21:02 UTC
20 points
0 comments · 2 min read · LW link

A brief theory of why we think things are good or bad

David Johnston · 20 Oct 2024 20:31 UTC
7 points
10 comments · 1 min read · LW link

Thinking in 2D

sarahconstantin · 20 Oct 2024 19:30 UTC
27 points
0 comments · 8 min read · LW link
(sarahconstantin.substack.com)

Podcast discussing Hanson’s Cultural Drift Argument

20 Oct 2024 17:58 UTC
3 points
0 comments · 1 min read · LW link
(moralmayhem.substack.com)

Advice on Communicating Concisely

EvolutionByDesign · 20 Oct 2024 16:45 UTC
2 points
9 comments · 1 min read · LW link

Ambiguities or the issues we face with AI in medicine

Thehumanproject.ai · 20 Oct 2024 16:45 UTC
2 points
0 comments · 5 min read · LW link

The Personal Implications of AGI Realism

xizneb · 20 Oct 2024 16:43 UTC
7 points
7 comments · 5 min read · LW link

Safety tax functions

owencb · 20 Oct 2024 14:08 UTC
30 points
0 comments · 6 min read · LW link
(strangecities.substack.com)

Exploring the Platonic Representation Hypothesis Beyond In-Distribution Data

rokosbasilisk · 20 Oct 2024 8:40 UTC
3 points
2 comments · 1 min read · LW link

Electoral Systems

RedFishBlueFish · 20 Oct 2024 3:25 UTC
1 point
0 comments · 14 min read · LW link

Overcoming Bias Anthology

Arjun Panickssery · 20 Oct 2024 2:01 UTC
164 points
14 comments · 2 min read · LW link
(overcoming-bias-anthology.com)

D/acc AI Security Salon

Allison Duettmann · 19 Oct 2024 22:17 UTC
19 points
0 comments · 1 min read · LW link

Who Should Have Been Killed, and Contains Neato? Who Else Could It Be, but that Villain Magneto!

Ace Delgado · 19 Oct 2024 20:39 UTC
−16 points
0 comments · 1 min read · LW link

If far-UV is so great, why isn’t it everywhere?

Austin Chen · 19 Oct 2024 18:56 UTC
70 points
23 comments · 1 min read · LW link
(strainhardening.substack.com)

What if AGI was already accidentally created in 2019? [Fictional story]

Alice Wanderland · 19 Oct 2024 9:17 UTC
−3 points
2 comments · 15 min read · LW link
(aliceandbobinwanderland.substack.com)

[Question] What actual bad outcome has “ethics-based” RLHF AI Alignment already prevented?

Roko · 19 Oct 2024 6:11 UTC
7 points
16 comments · 1 min read · LW link

[Question] What’s a good book for a technically-minded 11-year old?

Martin Sustrik · 19 Oct 2024 6:05 UTC
10 points
32 comments · 1 min read · LW link

Methodology: Contagious Beliefs

James Stephen Brown · 19 Oct 2024 3:58 UTC
3 points
0 comments · 7 min read · LW link

AI Prejudices: Practical Implications

PeterMcCluskey · 19 Oct 2024 2:19 UTC
12 points
0 comments · 5 min read · LW link
(bayesianinvestor.com)

Start an Upper-Room UV Installation Company?

jefftk · 19 Oct 2024 2:00 UTC
44 points
9 comments · 1 min read · LW link
(www.jefftk.com)

How I’d like alignment to get done (as of 2024-10-18)

TristanTrim · 18 Oct 2024 23:39 UTC
11 points
4 comments · 4 min read · LW link

Sabotage Evaluations for Frontier Models

18 Oct 2024 22:33 UTC
93 points
55 comments · 6 min read · LW link
(assets.anthropic.com)

D&D Sci Coliseum: Arena of Data

aphyer · 18 Oct 2024 22:02 UTC
41 points
23 comments · 4 min read · LW link

the Daydication technique

chaosmage · 18 Oct 2024 21:47 UTC
27 points
0 comments · 2 min read · LW link

[Linkpost] Hawkish nationalism vs international AI power and benefit sharing

18 Oct 2024 18:13 UTC
7 points
5 comments · 1 min read · LW link
(nacicankaya.substack.com)

LLM Psychometrics and Prompt-Induced Psychopathy

Korbinian K. · 18 Oct 2024 18:11 UTC
12 points
2 comments · 10 min read · LW link

A short project on Mamba: grokking & interpretability

Alejandro Tlaie · 18 Oct 2024 16:59 UTC
21 points
0 comments · 6 min read · LW link

LLMs can learn about themselves by introspection

18 Oct 2024 16:12 UTC
102 points
38 comments · 9 min read · LW link

[Question] Are there more than 12 paths to Superintelligence?

p4rziv4l · 18 Oct 2024 16:05 UTC
−3 points
0 comments · 1 min read · LW link

Low Probability Estimation in Language Models

Gabriel Wu · 18 Oct 2024 15:50 UTC
50 points
0 comments · 10 min read · LW link
(www.alignment.org)

The Mysterious Trump Buyers on Polymarket

Annapurna · 18 Oct 2024 13:26 UTC
52 points
10 comments · 2 min read · LW link
(jorgevelez.substack.com)

On Intentionality, or: Towards a More Inclusive Concept of Lying

Cornelius Dybdahl · 18 Oct 2024 10:37 UTC
8 points
0 comments · 4 min read · LW link

Species as Canonical Referents of Super-Organisms

Yudhister Kumar · 18 Oct 2024 7:49 UTC
9 points
8 comments · 2 min read · LW link
(www.yudhister.me)