Contra Common Knowledge

abramdemski · 4 Jan 2023 22:50 UTC
52 points
31 comments · 16 min read · LW link

Additional space complexity isn’t always a useful metric

Brendan Long · 4 Jan 2023 21:53 UTC
4 points
3 comments · 3 min read · LW link
(www.brendanlong.com)

List of links for getting into AI safety

zef · 4 Jan 2023 19:45 UTC
6 points
0 comments · 1 min read · LW link

Opening Facebook Links Externally

jefftk · 4 Jan 2023 19:00 UTC
12 points
3 comments · 1 min read · LW link
(www.jefftk.com)

Conversational canyons

Henrik Karlsson · 4 Jan 2023 18:55 UTC
59 points
4 comments · 7 min read · LW link
(escapingflatland.substack.com)

Progress links and tweets, 2023-01-04

jasoncrawford · 4 Jan 2023 18:23 UTC
15 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

200 COP in MI: Analysing Training Dynamics

Neel Nanda · 4 Jan 2023 16:08 UTC
16 points
0 comments · 14 min read · LW link

What’s up with ChatGPT and the Turing Test?

4 Jan 2023 15:37 UTC
13 points
19 comments · 3 min read · LW link

2022 was the year AGI arrived (Just don’t call it that)

Logan Zoellner · 4 Jan 2023 15:19 UTC
102 points
60 comments · 3 min read · LW link

From Simon’s ant to machine learning, a parable

Bill Benzon · 4 Jan 2023 14:37 UTC
6 points
5 comments · 2 min read · LW link

Basic Facts about Language Model Internals

4 Jan 2023 13:01 UTC
130 points
19 comments · 9 min read · LW link

Ritual as the only tool for overwriting values and goals

mrcbarbier · 4 Jan 2023 11:11 UTC
40 points
24 comments · 32 min read · LW link

Normalcy bias and Base rate neglect: Bias in Evaluating AGI X-Risks

Remmelt · 4 Jan 2023 3:16 UTC
−16 points
0 comments · 1 min read · LW link

Causal representation learning as a technique to prevent goal misgeneralization

PabloAMC · 4 Jan 2023 0:07 UTC
19 points
0 comments · 8 min read · LW link

What makes a probability question “well-defined”? (Part II: Bertrand’s Paradox)

Noah Topper · 3 Jan 2023 22:39 UTC
7 points
3 comments · 9 min read · LW link
(naivebayes.substack.com)

“AI” is an indexical

TW123 · 3 Jan 2023 22:00 UTC
10 points
0 comments · 6 min read · LW link
(aiwatchtower.substack.com)

An ML interpretation of Shard Theory

beren · 3 Jan 2023 20:30 UTC
39 points
5 comments · 4 min read · LW link

Talking to God

abramdemski · 3 Jan 2023 20:14 UTC
30 points
7 comments · 2 min read · LW link

My Advice for Incoming SERI MATS Scholars

Johannes C. Mayer · 3 Jan 2023 19:25 UTC
58 points
6 comments · 4 min read · LW link

Touch reality as soon as possible (when doing machine learning research)

LawrenceC · 3 Jan 2023 19:11 UTC
112 points
8 comments · 8 min read · LW link

Kolb’s: an approach to consciously get better at anything

jacquesthibs · 3 Jan 2023 18:16 UTC
12 points
1 comment · 6 min read · LW link

[Question] {M|Im|Am}oral Mazes—any large-scale counterexamples?

Dagon · 3 Jan 2023 16:43 UTC
24 points
4 comments · 1 min read · LW link

Effectively self-studying over the Internet

libai · 3 Jan 2023 16:23 UTC
4 points
0 comments · 4 min read · LW link

Set-like mathematics in type theory

Thomas Kehrenberg · 3 Jan 2023 14:33 UTC
4 points
1 comment · 13 min read · LW link

Monthly Roundup #2

Zvi · 3 Jan 2023 12:50 UTC
23 points
3 comments · 23 min read · LW link
(thezvi.wordpress.com)

Whisper’s Wild Implications

Ollie J · 3 Jan 2023 12:17 UTC
19 points
6 comments · 5 min read · LW link

How to eat potato chips while typing

KatjaGrace · 3 Jan 2023 11:50 UTC
45 points
12 comments · 1 min read · LW link
(worldspiritsockpuppet.com)

[Question] I have thousands of copies of HPMOR in Russian. How to use them with the most impact?

Mikhail Samin · 3 Jan 2023 10:21 UTC
26 points
3 comments · 1 min read · LW link

Is recursive self-alignment possible?

No77e · 3 Jan 2023 9:15 UTC
5 points
5 comments · 1 min read · LW link

On the naturalistic study of the linguistic behavior of artificial intelligence

Bill Benzon · 3 Jan 2023 9:06 UTC
1 point
0 comments · 4 min read · LW link

SF Severe Weather Warning

stavros · 3 Jan 2023 6:04 UTC
3 points
3 comments · 1 min read · LW link
(news.ycombinator.com)

Status quo bias; System justification: Bias in Evaluating AGI X-Risks

3 Jan 2023 2:50 UTC
−11 points
0 comments · 1 min read · LW link

200 COP in MI: Exploring Polysemanticity and Superposition

Neel Nanda · 3 Jan 2023 1:52 UTC
34 points
6 comments · 16 min read · LW link

The need for speed in web frameworks?

Adam Zerner · 3 Jan 2023 0:06 UTC
19 points
2 comments · 8 min read · LW link

[Simulators seminar sequence] #1 Background & shared assumptions

2 Jan 2023 23:48 UTC
50 points
4 comments · 3 min read · LW link

Linear Algebra Done Right, Axler

David Udell · 2 Jan 2023 22:54 UTC
56 points
6 comments · 9 min read · LW link

MacArthur BART (Filk)

Gordon Seidoh Worley · 2 Jan 2023 22:50 UTC
10 points
1 comment · 1 min read · LW link

Knottiness

abramdemski · 2 Jan 2023 22:13 UTC
43 points
4 comments · 2 min read · LW link

[Question] Default Sort for Shortforms is Very Bad; How Do I Change It?

DragonGod · 2 Jan 2023 21:50 UTC
15 points
0 comments · 1 min read · LW link

MAKE IT BETTER (a poetic demonstration of the banality of GPT-3)

rogersbacon · 2 Jan 2023 20:47 UTC
7 points
2 comments · 5 min read · LW link

Review of “Make People Better”

Metacelsus · 2 Jan 2023 20:30 UTC
10 points
0 comments · 3 min read · LW link
(denovo.substack.com)

Preparing for Less Privacy

jefftk · 2 Jan 2023 20:30 UTC
23 points
1 comment · 2 min read · LW link
(www.jefftk.com)

Large language models can provide “normative assumptions” for learning human preferences

Stuart_Armstrong · 2 Jan 2023 19:39 UTC
29 points
12 comments · 3 min read · LW link

On the Importance of Open Sourcing Reward Models

elandgre · 2 Jan 2023 19:01 UTC
18 points
5 comments · 6 min read · LW link

Prediction Markets for Science

Vaniver · 2 Jan 2023 17:55 UTC
27 points
7 comments · 5 min read · LW link

Why don’t Rationalists use bidets?

Lakin · 2 Jan 2023 17:42 UTC
31 points
33 comments · 2 min read · LW link

Soft optimization makes the value target bigger

Jeremy Gillen · 2 Jan 2023 16:06 UTC
117 points
20 comments · 12 min read · LW link

Results from the AI testing hackathon

Esben Kran · 2 Jan 2023 15:46 UTC
13 points
0 comments · 1 min read · LW link

Induction heads—illustrated

CallumMcDougall · 2 Jan 2023 15:35 UTC
111 points
9 comments · 3 min read · LW link

Opportunity Cost Blackmail

adamShimi · 2 Jan 2023 13:48 UTC
70 points
11 comments · 2 min read · LW link
(epistemologicalvigilance.substack.com)