Announcing the AI Forecasting Benchmark Series | July 8, $120k in Prizes

ChristianWilliams · 2 Jul 2024 22:33 UTC
15 points
0 comments · 1 min read · LW link
(www.metaculus.com)

Open Sourcing Metaculus

ChristianWilliams · 2 Jul 2024 22:30 UTC
44 points
0 comments · 1 min read · LW link
(www.metaculus.com)

[Question] Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?

MrThink · 2 Jul 2024 20:13 UTC
4 points
23 comments · 1 min read · LW link

[Question] Why haven’t there been assassination attempts against high profile AI accelerationists like sam altman yet?

louisTrem · 2 Jul 2024 18:16 UTC
−13 points
4 comments · 2 min read · LW link

How ARENA course material gets made

CallumMcDougall · 2 Jul 2024 18:04 UTC
41 points
2 comments · 7 min read · LW link

An AI Race With China Can Be Better Than Not Racing

niplav · 2 Jul 2024 17:57 UTC
69 points
33 comments · 11 min read · LW link

List of Collective Intelligence Projects

Chipmonk · 2 Jul 2024 14:10 UTC
40 points
9 comments · 2 min read · LW link
(chrislakin.blog)

Decomposing the QK circuit with Bilinear Sparse Dictionary Learning

2 Jul 2024 13:17 UTC
81 points
7 comments · 12 min read · LW link

Economics Roundup #2

Zvi · 2 Jul 2024 12:40 UTC
35 points
5 comments · 23 min read · LW link
(thezvi.wordpress.com)

How Congressional Offices Process Constituent Communication

Tristan Williams · 2 Jul 2024 12:38 UTC
24 points
0 comments · 1 min read · LW link

OthelloGPT learned a bag of heuristics

2 Jul 2024 9:12 UTC
108 points
10 comments · 9 min read · LW link

Blueprint for a Brighter Future

Alex Beyman · 2 Jul 2024 6:15 UTC
−1 points
0 comments · 5 min read · LW link

Covert Malicious Finetuning

2 Jul 2024 2:41 UTC
89 points
4 comments · 3 min read · LW link

Interpreting Preference Models w/ Sparse Autoencoders

1 Jul 2024 21:35 UTC
74 points
12 comments · 9 min read · LW link

Honest science is spirituality

pchvykov · 1 Jul 2024 20:33 UTC
−1 points
10 comments · 4 min read · LW link

New Executive Team & Board — PIBBSS

Nora_Ammann · 1 Jul 2024 19:30 UTC
43 points
1 comment · 1 min read · LW link

Uncursing Civilization

Lorec · 1 Jul 2024 18:44 UTC
−6 points
2 comments · 5 min read · LW link

[Question] Self-censoring on AI x-risk discussions?

Decaeneus · 1 Jul 2024 18:24 UTC
17 points
2 comments · 1 min read · LW link

Rationalists As People Who Build Piles Of Rocks

Sable · 1 Jul 2024 10:32 UTC
9 points
0 comments · 5 min read · LW link
(affablyevil.substack.com)

How good are LLMs at doing ML on an unknown dataset?

Håvard Tveit Ihle · 1 Jul 2024 9:04 UTC
33 points
4 comments · 13 min read · LW link

Whirlwind Tour of Chain of Thought Literature Relevant to Automating Alignment Research.

sevdeawesome · 1 Jul 2024 5:50 UTC
25 points
0 comments · 17 min read · LW link

Probabilistic Logic ⇔ Oracles?

Yudhister Kumar · 1 Jul 2024 5:36 UTC
15 points
0 comments · 4 min read · LW link

Important open problems in voting

Closed Limelike Curves · 1 Jul 2024 2:53 UTC
33 points
1 comment · 1 min read · LW link

Anti-Circumcision Essay 3 of 3: Now That I Think About It, Is There Actually a Space Between “Info” and “Hazard”? Isn’t It Just One Word?

Harry Stevenage · 1 Jul 2024 2:21 UTC
12 points
0 comments · 7 min read · LW link

In Defense of Lawyers Playing Their Part

Isaac King · 1 Jul 2024 1:32 UTC
32 points
9 comments · 9 min read · LW link

Anti-circumcision Essay 2 of 3: Physical and Psychological Realities

Harry Stevenage · 30 Jun 2024 22:13 UTC
12 points
5 comments · 9 min read · LW link

Review of METR’s public evaluation protocol

30 Jun 2024 22:03 UTC
10 points
0 comments · 5 min read · LW link

Superposition, Self-Modeling, and the Path to AGI: A New Perspective

Peterpiper · 30 Jun 2024 17:20 UTC
−13 points
0 comments · 2 min read · LW link

Anti-Circumcision Essay 1 of 3: According To Their Critics, Intactivists Are The Best-Behaved Protest Movement In History

Harry Stevenage · 30 Jun 2024 17:17 UTC
12 points
6 comments · 5 min read · LW link

The Xerox Parc/ARPA version of the intellectual Turing test: Class 1 vs Class 2 disagreement

hamishtodd1 · 30 Jun 2024 15:34 UTC
6 points
3 comments · 1 min read · LW link

LLMs Universally Learn a Feature Representing Token Frequency / Rarity

Sean Osier · 30 Jun 2024 2:48 UTC
12 points
5 comments · 6 min read · LW link
(github.com)

My 5-step program for losing weight

Nikita Sokolsky · 30 Jun 2024 1:05 UTC
22 points
20 comments · 5 min read · LW link
(nsokolsky.substack.com)

Datasets that change the odds you exist

dynomight · 29 Jun 2024 18:45 UTC
56 points
4 comments · 6 min read · LW link
(dynomight.net)

A “Scaling Monosemanticity” Explainer

29 Jun 2024 17:50 UTC
10 points
0 comments · 3 min read · LW link

Analysis of key AI analogies

Kevin Kohler · 29 Jun 2024 10:55 UTC
10 points
2 comments · 15 min read · LW link

Georgism Crash Course

Zero Contradictions · 29 Jun 2024 6:18 UTC
9 points
5 comments · 1 min read · LW link
(zerocontradictions.net)

Activation Pattern SVD: A proposal for SAE Interpretability

Daniel Tan · 28 Jun 2024 22:12 UTC
15 points
2 comments · 2 min read · LW link

Podcast: Elizabeth & Austin on “What Manifold was allowed to do”

Austin Chen · 28 Jun 2024 22:10 UTC
20 points
0 comments · 1 min read · LW link
(share.descript.com)

The Incredible Fentanyl-Detecting Machine

sarahconstantin · 28 Jun 2024 22:10 UTC
154 points
26 comments · 7 min read · LW link
(sarahconstantin.substack.com)

Saving Lives Reduces Over-Population—A Counter-Intuitive Non-Zero-Sum Game

James Stephen Brown · 28 Jun 2024 19:29 UTC
6 points
0 comments · 5 min read · LW link
(nonzerosum.games)

Mentorship in AGI Safety: Applications for mentorship are open!

28 Jun 2024 14:49 UTC
5 points
0 comments · 1 min read · LW link

Contra Acemoglu on AI

Maxwell Tabarrok · 28 Jun 2024 13:13 UTC
48 points
0 comments · 5 min read · LW link
(www.maximum-progress.com)

Five toy worlds to think about heritability

David Hugh-Jones · 28 Jun 2024 13:11 UTC
13 points
0 comments · 9 min read · LW link
(wyclif.substack.com)

[Question] How do natural sciences prove causation?

Kongo Landwalker · 28 Jun 2024 11:58 UTC
1 point
3 comments · 1 min read · LW link

LessWrong/ACX meetup Transilvanya tour—Sibiu

Marius Adrian Nicoară · 28 Jun 2024 11:41 UTC
1 point
1 comment · 1 min read · LW link

Bayes’ Theorem: In Search of Gold (Lesson 1)

bayesyatina · 28 Jun 2024 8:39 UTC
3 points
0 comments · 3 min read · LW link

How a chip is designed

YM · 28 Jun 2024 8:04 UTC
65 points
4 comments · 5 min read · LW link

The Wisdom of Living for 200 Years

Martin Sustrik · 28 Jun 2024 4:44 UTC
25 points
3 comments · 4 min read · LW link

A Generally Intelligent Game

snerx · 28 Jun 2024 1:31 UTC
−1 points
1 comment · 4 min read · LW link

Corrigibility = Tool-ness?

28 Jun 2024 1:19 UTC
78 points
8 comments · 9 min read · LW link