Appraising aggregativism and utilitarianism

Cleo Nardo 21 Jun 2024 23:10 UTC
27 points
10 comments 19 min read LW link

Best-of-n with misaligned reward models for Math reasoning

Fabien Roger 21 Jun 2024 22:53 UTC
25 points
0 comments 3 min read LW link

No really, the Sticker Shortcut fallacy is indeed a fallacy

ymeskhout 21 Jun 2024 22:27 UTC
11 points
2 comments 5 min read LW link
(www.ymeskhout.com)

Sarajevo 1914: Black Swan Questions

SebastianG 21 Jun 2024 21:27 UTC
8 points
0 comments 2 min read LW link

Yudkowsky is too optimistic about how AI will treat humans.

ProfessorFalken 21 Jun 2024 19:01 UTC
0 points
1 comment 1 min read LW link

Juneberry Puffs

jefftk 21 Jun 2024 18:50 UTC
15 points
0 comments 1 min read LW link
(www.jefftk.com)

Let’s Design a School, Part 3.2 Costs

Sable 21 Jun 2024 17:58 UTC
8 points
0 comments 5 min read LW link
(affablyevil.substack.com)

2022 AI Alignment Course: 5→37% working on AI safety

Dewi 21 Jun 2024 17:45 UTC
7 points
3 comments 3 min read LW link

Some Thoughts on AI Alignment: Using AI to Control AI

eigenvalue 21 Jun 2024 17:44 UTC
1 point
1 comment 1 min read LW link
(github.com)

What distinguishes “early”, “mid” and “end” games?

Raemon 21 Jun 2024 17:41 UTC
47 points
22 comments 1 min read LW link

Nuclear War, Map and Territory, Values | Guild of the Rose Newsletter, May 2024

moridinamael 21 Jun 2024 17:39 UTC
18 points
0 comments 4 min read LW link
(guildoftherose.org)

AI governance needs a theory of victory

21 Jun 2024 16:15 UTC
34 points
6 comments 1 min read LW link
(www.convergenceanalysis.org)

Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data

21 Jun 2024 15:54 UTC
160 points
13 comments 8 min read LW link
(arxiv.org)

On OpenAI’s Model Spec

Zvi 21 Jun 2024 13:00 UTC
46 points
3 comments 30 min read LW link
(thezvi.wordpress.com)

Attention Output SAEs Improve Circuit Analysis

21 Jun 2024 12:56 UTC
33 points
1 comment 19 min read LW link

“Newton’s laws” of finance

pchvykov 21 Jun 2024 9:41 UTC
9 points
3 comments 10 min read LW link

Capitalising On Trust—A Simulation

James Stephen Brown 21 Jun 2024 4:43 UTC
2 points
0 comments 1 min read LW link
(nonzerosum.games)

“… than average” is (almost) meaningless

jwfiredragon 21 Jun 2024 4:42 UTC
16 points
6 comments 3 min read LW link

The Kernel of Meaning in Property Rights

Abhimanyu Pallavi Sudhir 21 Jun 2024 1:12 UTC
7 points
6 comments 2 min read LW link

Enriched tab is now the default LW Frontpage experience for logged-in users

21 Jun 2024 0:09 UTC
46 points
27 comments 3 min read LW link

Debate, Oracles, and Obfuscated Arguments

20 Jun 2024 23:14 UTC
40 points
2 comments 21 min read LW link

Evaporation of improvements

Viliam 20 Jun 2024 18:34 UTC
28 points
27 comments 2 min read LW link

Interpreting and Steering Features in Images

Gytis Daujotas 20 Jun 2024 18:33 UTC
65 points
6 comments 5 min read LW link

Claude 3.5 Sonnet

Zach Stein-Perlman 20 Jun 2024 18:00 UTC
75 points
41 comments 1 min read LW link
(www.anthropic.com)

[Question] What is going to happen in a case of an AGI era where humans are out of the game?

Cipolla 20 Jun 2024 17:44 UTC
−2 points
1 comment 1 min read LW link

Jailbreak steering generalization

20 Jun 2024 17:25 UTC
41 points
4 comments 2 min read LW link
(arxiv.org)

Case studies on social-welfare-based standards in various industries

HoldenKarnofsky 20 Jun 2024 13:33 UTC
42 points
0 comments 1 min read LW link

AI #69: Nice

Zvi 20 Jun 2024 12:40 UTC
65 points
9 comments 51 min read LW link
(thezvi.wordpress.com)

Niche product design

Itay Dreyfus 20 Jun 2024 6:34 UTC
2 points
1 comment 3 min read LW link
(productidentity.co)

Data on AI

20 Jun 2024 6:31 UTC
1 point
0 comments 1 min read LW link
(epochai.org)

Actually, Power Plants May Be an AI Training Bottleneck.

Lao Mein 20 Jun 2024 4:41 UTC
83 points
13 comments 2 min read LW link

Proposing the Post-Singularity Symbiotic Researches

Hiroshi Yamakawa 20 Jun 2024 4:05 UTC
5 points
0 comments 12 min read LW link

Week One of Studying Transformers Architecture

JustisMills 20 Jun 2024 3:47 UTC
3 points
0 comments 15 min read LW link
(justismills.substack.com)

[Question] What are things you’re allowed to do as a startup?

Elizabeth 20 Jun 2024 0:01 UTC
30 points
9 comments 1 min read LW link

LessWrong/ACX meetup Transilvanya tour—Alba Iulia

Marius Adrian Nicoară 19 Jun 2024 19:56 UTC
1 point
1 comment 1 min read LW link

Chronic perfectionism through the eyes of school reports

Stuart Johnson 19 Jun 2024 17:46 UTC
13 points
3 comments 1 min read LW link

Ilya Sutskever created a new AGI startup

harfe 19 Jun 2024 17:17 UTC
95 points
35 comments 1 min read LW link
(ssi.inc)

Beyond the Board: Exploring AI Robustness Through Go

AdamGleave 19 Jun 2024 16:40 UTC
41 points
2 comments 1 min read LW link
(far.ai)

A study on cults and non-cults—answer questions about a group and get a cult score

spencerg 19 Jun 2024 14:30 UTC
1 point
8 comments 1 min read LW link
(www.guidedtrack.com)

Workshop: data analysis for software engineers

Derek M. Jones 19 Jun 2024 14:20 UTC
2 points
0 comments 1 min read LW link

FLEXIBLE AND ADAPTABLE LLM’s WITH CONTINUOUS SELF TRAINING

Escaque 66 19 Jun 2024 14:17 UTC
−11 points
0 comments 3 min read LW link

Surviving Seveneves

Yair Halberstadt 19 Jun 2024 13:11 UTC
41 points
4 comments 11 min read LW link

Self responsibility

Elo 19 Jun 2024 10:17 UTC
17 points
3 comments 2 min read LW link

Gizmo Watch Review

jefftk 18 Jun 2024 20:00 UTC
22 points
3 comments 6 min read LW link
(www.jefftk.com)

Boycott OpenAI

PeterMcCluskey 18 Jun 2024 19:52 UTC
163 points
26 comments 1 min read LW link
(bayesianinvestor.com)

Loving a world you don’t trust

Joe Carlsmith 18 Jun 2024 19:31 UTC
134 points
13 comments 33 min read LW link

Book review: the Iliad

philh 18 Jun 2024 18:50 UTC
31 points
2 comments 14 min read LW link
(reasonableapproximation.net)

AI Safety Newsletter #37: US Launches Antitrust Investigations Plus, recent criticisms of OpenAI and Anthropic, and a summary of Situational Awareness

18 Jun 2024 18:07 UTC
8 points
0 comments 5 min read LW link
(newsletter.safe.ai)

Suffering Is Not Pain

jbkjr 18 Jun 2024 18:04 UTC
34 points
45 comments 5 min read LW link
(jbkjr.me)

Lamini’s Targeted Hallucination Reduction May Be a Big Deal for Job Automation

sweenesm 18 Jun 2024 15:29 UTC
3 points
0 comments 1 min read LW link