Natural Abstraction

Last edit: 10 Oct 2022 17:45 UTC by Raemon

The Natural Abstraction hypothesis says that:

Our physical world abstracts well: for most systems, the information relevant “far away” from the system (in various senses) is much lower-dimensional than the system itself. These low-dimensional summaries are exactly the high-level abstract objects/concepts typically used by humans.

These abstractions are “natural”: a wide variety of cognitive architectures will learn to use approximately the same high-level abstract objects/concepts to reason about the world.

(from “Testing the Natural Abstraction Hypothesis”)
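
The first claim lends itself to a small numerical illustration. The sketch below is a hypothetical toy example (not taken from the linked posts; it assumes numpy and chooses the mean as the summary purely for illustration): a high-dimensional “system” influences a “far away” observation only through its mean, so a one-dimensional summary carries essentially all of the far-away-relevant information.

```python
# Toy illustration (hypothetical, not from the linked posts): a 1000-dimensional
# "system" affects a far-away observation only through its mean, so a
# 1-dimensional summary captures nearly all of the far-away-relevant information.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_system = 5_000, 1_000

# High-dimensional system: many noisy variables sharing one latent cause.
latent = rng.normal(size=(n_samples, 1))
system = latent + rng.normal(size=(n_samples, n_system))

# Far-away observation: interacts with every variable, but only via their
# aggregate, plus noise accumulated "along the way".
far_away = system.mean(axis=1) + rng.normal(scale=0.1, size=n_samples)

# Compare one raw coordinate of the system against the 1-d summary (the mean).
summary = system.mean(axis=1)
r_single = np.corrcoef(system[:, 0], far_away)[0, 1]
r_summary = np.corrcoef(summary, far_away)[0, 1]
print(f"single raw variable vs. far-away observation: r = {r_single:.2f}")
print(f"1-d summary (mean)  vs. far-away observation: r = {r_summary:.2f}")
# Typical output: r ≈ 0.70 for a single raw variable, r ≈ 0.99 for the summary;
# once the mean is known, the remaining 999 dimensions add almost nothing.
```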

Natural Abstractions: Key claims, Theorems, and Critiques

16 Mar 2023 16:37 UTC
228 points
20 comments · 45 min read · LW link

Natural Latents: The Concepts

20 Mar 2024 18:21 UTC
87 points
18 comments · 19 min read · LW link

Natural Latents: The Math

27 Dec 2023 19:03 UTC
120 points
37 comments · 12 min read · LW link

The Natural Abstraction Hypothesis: Implications and Evidence

CallumMcDougall · 14 Dec 2021 23:14 UTC
39 points
9 comments · 19 min read · LW link

Testing The Natural Abstraction Hypothesis: Project Intro

johnswentworth · 6 Apr 2021 21:24 UTC
168 points
41 comments · 6 min read · LW link · 1 review

What is a Tool?

25 Jun 2024 23:40 UTC
62 points
4 comments · 6 min read · LW link

Alignment By Default

johnswentworth · 12 Aug 2020 18:54 UTC
174 points
96 comments · 11 min read · LW link · 2 reviews

[ASoT] Natural abstractions and AlphaZero

Ulisse Mini · 10 Dec 2022 17:53 UTC
33 points
1 comment · 1 min read · LW link
(arxiv.org)

Public Static: What is Abstraction?

johnswentworth · 9 Jun 2020 18:36 UTC
97 points
18 comments · 11 min read · LW link

Agency As a Natural Abstraction

Thane Ruthenis · 13 May 2022 18:02 UTC
55 points
9 comments · 13 min read · LW link

Testing The Natural Abstraction Hypothesis: Project Update

johnswentworth · 20 Sep 2021 3:44 UTC
87 points
17 comments · 8 min read · LW link · 1 review

Contrapositive Natural Abstraction—Project Intro

Elliot Callender · 24 Jun 2024 18:37 UTC
4 points
5 comments · 2 min read · LW link

Relevant to natural abstractions: Euclidean Symmetry Equivariant Machine Learning—Overview, Applications, and Open Questions

the gears to ascension · 8 Dec 2022 18:01 UTC
8 points
0 comments · 1 min read · LW link
(youtu.be)

Natural Abstraction: Convergent Preferences Over Information Structures

paulom · 14 Oct 2023 18:34 UTC
13 points
1 comment · 36 min read · LW link

AISafety.info: What is the “natural abstractions hypothesis”?

Algon · 5 Oct 2024 12:31 UTC
38 points
2 comments · 3 min read · LW link
(aisafety.info)

Towards the Operationalization of Philosophy & Wisdom

Thane Ruthenis · 28 Oct 2024 19:45 UTC
20 points
2 comments · 33 min read · LW link
(aiimpacts.org)

Minimal Motivation of Natural Latents

14 Oct 2024 22:51 UTC
43 points
14 comments · 3 min read · LW link

A rough and incomplete review of some of John Wentworth’s research

So8res · 28 Mar 2023 18:52 UTC
175 points
18 comments · 18 min read · LW link

What Does The Natural Abstraction Framework Say About ELK?

johnswentworth · 15 Feb 2022 2:27 UTC
35 points
0 comments · 6 min read · LW link

AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them

Roman Leventov · 27 Dec 2023 14:51 UTC
33 points
9 comments · 4 min read · LW link

The Plan − 2023 Version

johnswentworth · 29 Dec 2023 23:34 UTC
146 points
40 comments · 31 min read · LW link

From Conceptual Spaces to Quantum Concepts: Formalising and Learning Structured Conceptual Models

Roman Leventov · 6 Feb 2024 10:18 UTC
8 points
1 comment · 4 min read · LW link
(arxiv.org)

Natural Latents Are Not Robust To Tiny Mixtures

7 Jun 2024 18:53 UTC
61 points
8 comments · 5 min read · LW link

AlignedCut: Visual Concepts Discovery on Brain-Guided Universal Feature Space

Bogdan Ionut Cirstea · 14 Sep 2024 23:23 UTC
17 points
1 comment · 1 min read · LW link
(arxiv.org)

Validating / finding alignment-relevant concepts using neural data

Bogdan Ionut Cirstea · 20 Sep 2024 21:12 UTC
7 points
0 comments · 1 min read · LW link
(docs.google.com)

Natural abstractions are observer-dependent: a conversation with John Wentworth

Martín Soto · 12 Feb 2024 17:28 UTC
39 points
13 comments · 7 min read · LW link

Idealized Agents Are Approximate Causal Mirrors (+ Radical Optimism on Agent Foundations)

Thane Ruthenis · 22 Dec 2023 20:19 UTC
74 points
14 comments · 6 min read · LW link

[Hebbian Natural Abstractions] Introduction

21 Nov 2022 20:34 UTC
34 points
3 comments · 4 min read · LW link
(www.snellessen.com)

Select Agent Specifications as Natural Abstractions

lukemarks · 7 Apr 2023 23:16 UTC
19 points
3 comments · 5 min read · LW link

[Hebbian Natural Abstractions] Mathematical Foundations

25 Dec 2022 20:58 UTC
15 points
2 comments · 6 min read · LW link
(www.snellessen.com)

The Lightcone Theorem: A Better Foundation For Natural Abstraction?

johnswentworth · 15 May 2023 2:24 UTC
69 points
25 comments · 6 min read · LW link

$500 Bounty/Prize Problem: Channel Capacity Using “Insensitive” Functions

johnswentworth · 16 May 2023 21:31 UTC
40 points
11 comments · 2 min read · LW link

Abstraction is Bigger than Natural Abstraction

Nicholas / Heather Kross · 31 May 2023 0:00 UTC
18 points
0 comments · 5 min read · LW link
(www.thinkingmuchbetter.com)

Natural Categories Update

Logan Zoellner · 10 Oct 2022 15:19 UTC
33 points
6 comments · 2 min read · LW link

Computing Natural Abstractions: Linear Approximation

johnswentworth · 15 Apr 2021 17:47 UTC
41 points
22 comments · 7 min read · LW link

AXRP Episode 15 - Natural Abstractions with John Wentworth

DanielFilan · 23 May 2022 5:40 UTC
34 points
1 comment · 58 min read · LW link

The Core of the Alignment Problem is...

17 Aug 2022 20:07 UTC
76 points
10 comments · 9 min read · LW link

Causal Abstraction Toy Model: Medical Sensor

johnswentworth · 11 Dec 2019 21:12 UTC
34 points
6 comments · 6 min read · LW link

The Plan − 2022 Update

johnswentworth · 1 Dec 2022 20:43 UTC
239 points
37 comments · 8 min read · LW link · 1 review

Take 4: One problem with natural abstractions is there’s too many of them.

Charlie Steiner · 5 Dec 2022 10:39 UTC
37 points
4 comments · 1 min read · LW link

Take 5: Another problem for natural abstractions is laziness.

Charlie Steiner · 6 Dec 2022 7:00 UTC
31 points
4 comments · 3 min read · LW link

If Wentworth is right about natural abstractions, it would be bad for alignment

Wuschel Schulz · 8 Dec 2022 15:19 UTC
29 points
5 comments · 4 min read · LW link

The “Minimal Latents” Approach to Natural Abstractions

johnswentworth · 20 Dec 2022 1:22 UTC
53 points
24 comments · 12 min read · LW link

Causal abstractions vs infradistributions

Pablo Villalobos · 26 Dec 2022 0:21 UTC
24 points
0 comments · 6 min read · LW link

Simulacra are Things

janus · 8 Jan 2023 23:03 UTC
63 points
7 comments · 2 min read · LW link

World-Model Interpretability Is All We Need

Thane Ruthenis · 14 Jan 2023 19:37 UTC
35 points
22 comments · 21 min read · LW link

Why I’m not working on {debate, RRM, ELK, natural abstractions}

Steven Byrnes · 10 Feb 2023 19:22 UTC
71 points
19 comments · 9 min read · LW link

The conceptual Doppelgänger problem

TsviBT · 12 Feb 2023 17:23 UTC
12 points
5 comments · 4 min read · LW link

[Question] Is InstructGPT Following Instructions in Other Languages Surprising?

DragonGod · 13 Feb 2023 23:26 UTC
39 points
15 comments · 1 min read · LW link

[Appendix] Natural Abstractions: Key Claims, Theorems, and Critiques

16 Mar 2023 16:38 UTC
48 points
0 comments · 13 min read · LW link

Wittgenstein’s Language Games and the Critique of the Natural Abstraction Hypothesis

Chris_Leong · 16 Mar 2023 7:56 UTC
16 points
19 comments · 2 min read · LW link

Jonothan Gorard: The territory is isomorphic to an equivalence class of its maps

Daniel C · 7 Sep 2024 10:04 UTC
17 points
18 comments · 2 min read · LW link
(x.com)

My AI Model Delta Compared To Yudkowsky

johnswentworth · 10 Jun 2024 16:12 UTC
276 points
102 comments · 4 min read · LW link

Contra Steiner on Too Many Natural Abstractions

DragonGod · 24 Dec 2022 17:42 UTC
10 points
6 comments · 1 min read · LW link

Alignment Targets and The Natural Abstraction Hypothesis

Stephen Fowler · 8 Mar 2023 11:45 UTC
10 points
0 comments · 3 min read · LW link

[Linkpost] Concept Alignment as a Prerequisite for Value Alignment

Bogdan Ionut Cirstea · 4 Nov 2023 17:34 UTC
27 points
0 comments · 1 min read · LW link
(arxiv.org)

Simulators Increase the Likelihood of Alignment by Default

Wuschel Schulz · 30 Apr 2023 16:32 UTC
13 points
1 comment · 5 min read · LW link

«Boundaries/Membranes» and AI safety compilation

Chipmonk · 3 May 2023 21:41 UTC
57 points
17 comments · 8 min read · LW link

[Linkpost] MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data

Bogdan Ionut Cirstea · 10 Mar 2024 1:30 UTC
10 points
0 comments · 1 min read · LW link
(openreview.net)

[Question] Does the Telephone Theorem give us a free lunch?

Numendil · 15 Feb 2023 2:13 UTC
11 points
2 comments · 1 min read · LW link

Abstraction As Symmetry and Other Thoughts

Numendil · 1 Feb 2023 6:25 UTC
28 points
9 comments · 2 min read · LW link

Nature < Nurture for AIs

scottviteri · 4 Jun 2023 20:38 UTC
14 points
22 comments · 7 min read · LW link

[Linkpost] Large Language Models Converge on Brain-Like Word Representations

Bogdan Ionut Cirstea · 11 Jun 2023 11:20 UTC
36 points
12 comments · 1 min read · LW link

[Linkpost] Scaling laws for language encoding models in fMRI

Bogdan Ionut Cirstea · 8 Jun 2023 10:52 UTC
30 points
0 comments · 1 min read · LW link

[Linkpost] Mapping Brains with Language Models: A Survey

Bogdan Ionut Cirstea · 16 Jun 2023 9:49 UTC
5 points
0 comments · 1 min read · LW link

[Linkpost] Rosetta Neurons: Mining the Common Units in a Model Zoo

Bogdan Ionut Cirstea · 17 Jun 2023 16:38 UTC
12 points
0 comments · 1 min read · LW link

[Linkpost] A shared linguistic space for transmitting our thoughts from brain to brain in natural conversations

Bogdan Ionut Cirstea · 1 Jul 2023 13:57 UTC
17 points
2 comments · 1 min read · LW link

[Linkpost] Large language models converge toward human-like concept organization

Bogdan Ionut Cirstea · 2 Sep 2023 6:00 UTC
22 points
1 comment · 1 min read · LW link

An embedding decoder model, trained with a different objective on a different dataset, can decode another model’s embeddings surprisingly accurately

Logan Zoellner · 3 Sep 2023 11:34 UTC
20 points
1 comment · 1 min read · LW link

The utility of humans within a Super Artificial Intelligence realm.

Marc Monroy · 11 Oct 2023 17:30 UTC
1 point
0 comments · 7 min read · LW link

[Linkpost] Generalization in diffusion models arises from geometry-adaptive harmonic representation

Bogdan Ionut Cirstea · 11 Oct 2023 17:48 UTC
4 points
3 comments · 1 min read · LW link

Universal dimensions of visual representation

Bogdan Ionut Cirstea · 28 Aug 2024 10:38 UTC
8 points
0 comments · 1 min read · LW link
(arxiv.org)

Abstractions are not Natural

Alfred Harwood · 4 Nov 2024 11:10 UTC
25 points
21 comments · 11 min read · LW link

[Question] [DISC] Are Values Robust?

DragonGod · 21 Dec 2022 1:00 UTC
12 points
9 comments · 2 min read · LW link