What’s going on with Per-Component Weight Updates?

4gate
22 Aug 2024 21:22 UTC
1 point
0 comments
6 min read
LW link

Interoperable High Level Structures: Early Thoughts on Adjectives

22 Aug 2024 21:12 UTC
49 points
1 comment
7 min read
LW link

Interest poll: A time-waster blocker for desktop Linux programs

nahoj
22 Aug 2024 20:44 UTC
4 points
5 comments
1 min read
LW link

Turning 22 in the Pre-Apocalypse

testingthewaters
22 Aug 2024 20:28 UTC
37 points
14 comments
24 min read
LW link
(utilityhotbar.github.io)

A Robust Natural Latent Over A Mixed Distribution Is Natural Over The Distributions Which Were Mixed

22 Aug 2024 19:19 UTC
42 points
4 comments
4 min read
LW link

what becoming more secure did for me

Chipmonk
22 Aug 2024 17:44 UTC
26 points
5 comments
2 min read
LW link
(chrislakin.blog)

A primer on the current state of longevity research

Abhishaike Mahajan
22 Aug 2024 17:14 UTC
109 points
6 comments
14 min read
LW link
(www.owlposting.com)

Some reasons to start a project to stop harmful AI

Remmelt
22 Aug 2024 16:23 UTC
5 points
0 comments
2 min read
LW link

The economics of space tethers

harsimony
22 Aug 2024 16:15 UTC
67 points
22 comments
7 min read
LW link
(splittinginfinity.substack.com)

Dima’s Shortform

Dmitrii Krasheninnikov
22 Aug 2024 14:49 UTC
1 point
0 comments
1 min read
LW link

AI #78: Some Welcome Calm

Zvi
22 Aug 2024 14:20 UTC
61 points
15 comments
33 min read
LW link
(thezvi.wordpress.com)

[Question] How do we know dreams aren’t real?

Logan Zoellner
22 Aug 2024 12:41 UTC
5 points
31 comments
1 min read
LW link

Measuring Structure Development in Algorithmic Transformers

22 Aug 2024 8:38 UTC
56 points
4 comments
11 min read
LW link

Deception and Jailbreak Sequence: 1. Iterative Refinement Stages of Deception in LLMs

22 Aug 2024 7:32 UTC
23 points
1 comment
21 min read
LW link

Just because an LLM said it doesn’t mean it’s true: an illustrative example

dirk
21 Aug 2024 21:05 UTC
26 points
12 comments
3 min read
LW link

[Question] How do you finish your tasks faster?

Cipolla
21 Aug 2024 20:01 UTC
4 points
2 comments
1 min read
LW link

AI Safety Newsletter #40: California AI Legislation Plus, NVIDIA Delays Chip Production, and Do AI Safety Benchmarks Actually Measure Safety?

21 Aug 2024 18:09 UTC
11 points
0 comments
6 min read
LW link
(newsletter.safe.ai)

[Question] Should LW suggest standard metaprompts?

Dagon
21 Aug 2024 16:41 UTC
3 points
6 comments
1 min read
LW link

Eternal Existence and Eternal Boredom: The Case for AI and Immortal Humans

Tuan Tu Nguyen
21 Aug 2024 9:58 UTC
−12 points
2 comments
5 min read
LW link

Please do not use AI to write for you

Richard_Kennaway
21 Aug 2024 9:53 UTC
65 points
34 comments
4 min read
LW link

Apply to Aether—Independent LLM Agent Safety Research Group

RohanS
21 Aug 2024 9:47 UTC
9 points
0 comments
7 min read
LW link
(forum.effectivealtruism.org)

the Giga Press was a mistake

bhauth
21 Aug 2024 4:51 UTC
95 points
26 comments
5 min read
LW link
(bhauth.com)

Exploring the Boundaries of Cognitohazards and the Nature of Reality

Victor Novikov
21 Aug 2024 3:42 UTC
−2 points
2 comments
1 min read
LW link

[Question] What is the point of 2v2 debates?

Axel Ahlqvist
20 Aug 2024 21:59 UTC
2 points
1 comment
1 min read
LW link

[Question] Where should I look for information on gut health?

FinalFormal2
20 Aug 2024 19:44 UTC
10 points
10 comments
1 min read
LW link

Would you benefit from, or object to, a page with LW users’ reacts?

Raemon
20 Aug 2024 16:35 UTC
23 points
6 comments
1 min read
LW link

Freedom of Speech

Zero Contradictions
20 Aug 2024 16:34 UTC
−13 points
2 comments
2 min read
LW link
(thewaywardaxolotl.blogspot.com)

AGI Safety and Alignment at Google DeepMind: A Summary of Recent Work

20 Aug 2024 16:22 UTC
222 points
33 comments
9 min read
LW link

Trying to be rational for the wrong reasons

Viliam
20 Aug 2024 16:18 UTC
25 points
9 comments
3 min read
LW link

[Question] How great is the utility of “saving” endangered languages?

SpectrumDT
20 Aug 2024 13:14 UTC
18 points
29 comments
1 min read
LW link

Guide to SB 1047

Zvi
20 Aug 2024 13:10 UTC
71 points
18 comments
53 min read
LW link
(thezvi.wordpress.com)

Finding Deception in Language Models

20 Aug 2024 9:42 UTC
18 points
4 comments
4 min read
LW link

Next automated reasoning grand challenge: CompCert

sanxiyn
20 Aug 2024 5:27 UTC
−5 points
0 comments
1 min read
LW link

Thiel on AI & Racing with China

Ben Pace
20 Aug 2024 3:19 UTC
54 points
10 comments
12 min read
LW link

Reflecting on the transhumanist rebuttal to AI existential risk and critique of our debate methodologies and misuse of statistics

catgirlsruletheworld
20 Aug 2024 1:59 UTC
−5 points
0 comments
4 min read
LW link

Artificial Intelligence and Eternal Torture and Suffering

Tuan Tu Nguyen
20 Aug 2024 1:53 UTC
−1 points
0 comments
4 min read
LW link

AI #77: A Few Upgrades

Zvi
20 Aug 2024 0:20 UTC
23 points
3 comments
52 min read
LW link
(thezvi.wordpress.com)

Monthly Roundup #21: August 2024

Zvi
20 Aug 2024 0:20 UTC
22 points
6 comments
40 min read
LW link
(thezvi.wordpress.com)

[Linkpost] Automated Design of Agentic Systems

Bogdan Ionut Cirstea
19 Aug 2024 23:06 UTC
8 points
1 comment
1 min read
LW link
(arxiv.org)

Limitations on Formal Verification for AI Safety

Andrew Dickson
19 Aug 2024 23:03 UTC
134 points
60 comments
23 min read
LW link

The Conscious River: Conscious Turing machines negate materialism

blallo
19 Aug 2024 21:54 UTC
0 points
4 comments
7 min read
LW link

LLM Applications I Want To See

sarahconstantin
19 Aug 2024 21:10 UTC
102 points
5 comments
8 min read
LW link
(sarahconstantin.substack.com)

Defining alignment research

Richard_Ngo
19 Aug 2024 20:42 UTC
91 points
23 comments
7 min read
LW link

Vilnius – ACX Meetups Everywhere Fall 2024

19 Aug 2024 17:38 UTC
3 points
1 comment
1 min read
LW link

Can Current LLMs be Trusted To Produce Paperclips Safely?

Rohit Chatterjee
19 Aug 2024 17:17 UTC
4 points
0 comments
9 min read
LW link

A primer on why computational predictive toxicology is hard

Abhishaike Mahajan
19 Aug 2024 17:16 UTC
63 points
2 comments
12 min read
LW link
(www.owlposting.com)

Introduction and Exploration of AI Ethics Through a Global Lens

ThePathYouWillChoose
19 Aug 2024 17:11 UTC
1 point
0 comments
1 min read
LW link

Trustworthy and untrustworthy models

Olli Järviniemi
19 Aug 2024 16:27 UTC
46 points
3 comments
8 min read
LW link

Apartment Price Map Discontinuity

jefftk
19 Aug 2024 15:30 UTC
12 points
0 comments
1 min read
LW link
(www.jefftk.com)

Will we ever run out of new jobs?

Kevin Kohler
19 Aug 2024 15:04 UTC
17 points
7 comments
7 min read
LW link
(machinocene.substack.com)