Concrete Steps to Get Started in Transformer Mechanistic Interpretability

Neel Nanda, Dec 25, 2022, 10:21 PM
57 points
7 comments, 12 min read, LW link
(www.neelnanda.io)

It’s time to worry about online privacy again

Malmesbury, Dec 25, 2022, 9:05 PM
67 points
23 comments, 6 min read, LW link

[Hebbian Natural Abstractions] Mathematical Foundations

Dec 25, 2022, 8:58 PM
15 points
2 comments, 6 min read, LW link
(www.snellessen.com)

[Question] Oracle AGI—How can it escape, other than security issues? (Steganography?)

RationalSieve, Dec 25, 2022, 8:14 PM
3 points
6 comments, 1 min read, LW link

YCombinator fraud rates

Xodarap, Dec 25, 2022, 7:21 PM
56 points
3 comments, LW link

How evolutionary lineages of LLMs can plan their own future and act on these plans

Roman Leventov, Dec 25, 2022, 6:11 PM
39 points
16 comments, 8 min read, LW link

Accurate Models of AI Risk Are Hyperexistential Exfohazards

Thane Ruthenis, Dec 25, 2022, 4:50 PM
33 points
38 comments, 9 min read, LW link

ChatGPT is our Wright Brothers moment

Ron J, Dec 25, 2022, 4:26 PM
10 points
9 comments, 1 min read, LW link

The Meditation on Winter

Raemon, Dec 25, 2022, 4:12 PM
59 points
3 comments, 3 min read, LW link

I’ve updated towards AI boxing being surprisingly easy

Noosphere89, Dec 25, 2022, 3:40 PM
8 points
20 comments, 2 min read, LW link

Take 14: Corrigibility isn’t that great.

Charlie Steiner, Dec 25, 2022, 1:04 PM
15 points
3 comments, 3 min read, LW link

Simplified Level Up

jefftk, Dec 25, 2022, 1:00 PM
12 points
16 comments, 2 min read, LW link
(www.jefftk.com)

Hyperfinite graphs ~ manifolds

Alok Singh, Dec 25, 2022, 12:24 PM
11 points
5 comments, 2 min read, LW link

Inconsistent math is great

Alok Singh, Dec 25, 2022, 3:20 AM
1 point
2 comments, 1 min read, LW link

A hundredth of a bit of extra entropy

Adam Scherlis, Dec 24, 2022, 9:12 PM
84 points
4 comments, 3 min read, LW link

Shared reality: a key driver of human behavior

kdbscott, Dec 24, 2022, 7:35 PM
126 points
25 comments, 4 min read, LW link

Contra Steiner on Too Many Natural Abstractions

DragonGod, Dec 24, 2022, 5:42 PM
10 points
6 comments, 1 min read, LW link

Three reasons to cooperate

paulfchristiano, Dec 24, 2022, 5:40 PM
86 points
14 comments, 10 min read, LW link
(sideways-view.com)

Practical AI risk I: Watching large compute

Gustavo Ramires, Dec 24, 2022, 1:25 PM
3 points
0 comments, 1 min read, LW link

Non-Elevated Air Purifiers

jefftk, Dec 24, 2022, 12:40 PM
10 points
2 comments, 1 min read, LW link
(www.jefftk.com)

The Case for Chip-Backed Dollars

AnthonyRepetto, Dec 24, 2022, 10:28 AM
0 points
1 comment, 4 min read, LW link

List #3: Why not to assume on prior that AGI-alignment workarounds are available

Remmelt, Dec 24, 2022, 9:54 AM
4 points
1 comment, 3 min read, LW link

List #2: Why coordinating to align as humans to not develop AGI is a lot easier than, well… coordinating as humans with AGI coordinating to be aligned with humans

Remmelt, Dec 24, 2022, 9:53 AM
1 point
0 comments, 3 min read, LW link

List #1: Why stopping the development of AGI is hard but doable

Remmelt, Dec 24, 2022, 9:52 AM
6 points
11 comments, 5 min read, LW link

The case against AI alignment

andrew sauer, Dec 24, 2022, 6:57 AM
126 points
110 comments, 5 min read, LW link

Content and Takeaways from SERI MATS Training Program with John Wentworth

RohanS, Dec 24, 2022, 4:17 AM
28 points
3 comments, 12 min read, LW link

Löb’s Lemma: an easier approach to Löb’s Theorem

Andrew_Critch, Dec 24, 2022, 2:02 AM
30 points
16 comments, 3 min read, LW link

Durkon, an open-source tool for Inherently Interpretable Modelling

abstractapplic, Dec 24, 2022, 1:49 AM
37 points
0 comments, 4 min read, LW link

Issues with uneven AI resource distribution

User_Luke, Dec 24, 2022, 1:18 AM
3 points
9 comments, 5 min read, LW link
(temporal.substack.com)

Loose Threads on Intelligence

Shoshannah Tekofsky, Dec 24, 2022, 12:38 AM
11 points
3 comments, 8 min read, LW link

[Question] If you factor out next token prediction, what are the remaining salient features of human cognition?

Shmi, Dec 24, 2022, 12:38 AM
9 points
7 comments, 1 min read, LW link

[Question] Why is “Argument Mapping” Not More Common in EA/Rationality (And What Objections Should I Address in a Post on the Topic?)

HarrisonDurland, Dec 23, 2022, 9:58 PM
10 points
5 comments, 1 min read, LW link

The Fear [Fiction]

Yitz, Dec 23, 2022, 9:21 PM
7 points
0 comments, 1 min read, LW link

To err is neural: select logs with ChatGPT

VipulNaik, Dec 23, 2022, 8:26 PM
22 points
2 comments, 38 min read, LW link

AISER—AIS Europe Retreat

Carolin, Dec 23, 2022, 7:03 PM
5 points
0 comments, 1 min read, LW link

Two Truths and a Prediction Market

Screwtape, Dec 23, 2022, 6:52 PM
22 points
2 comments, 6 min read, LW link

ChatGPT understands, but largely does not generate Spanglish (and other code-mixed) text

Milan W, Dec 23, 2022, 5:40 PM
15 points
5 comments, 4 min read, LW link

On sincerity

Joe Carlsmith, Dec 23, 2022, 5:13 PM
75 points
6 comments, 42 min read, LW link

Epigenetics of the mammalian germline

Metacelsus, Dec 23, 2022, 3:21 PM
37 points
0 comments, 7 min read, LW link
(denovo.substack.com)

Boston Solstice Songs

jefftk, Dec 23, 2022, 1:00 PM
9 points
0 comments, 1 min read, LW link
(www.jefftk.com)

Are there any reliable CAPTCHAs? Competition for CAPTCHA ideas that AIs can’t solve.

MrThink, Dec 23, 2022, 12:52 PM
7 points
37 comments, 1 min read, LW link

“Search” is dead. What is the new paradigm?

Shmi, Dec 23, 2022, 10:33 AM
15 points
9 comments, 1 min read, LW link

Article Review: Discovering Latent Knowledge (Burns, Ye, et al)

Robert_AIZI, Dec 22, 2022, 6:16 PM
13 points
4 comments, 6 min read, LW link
(aizi.substack.com)

Let’s think about slowing down AI

KatjaGrace, Dec 22, 2022, 5:40 PM
551 points
182 comments, 38 min read, LW link, 3 reviews
(aiimpacts.org)

Some Notes on the mathematics of Toy Autoencoding Problems

carboniferous_umbraculum, Dec 22, 2022, 5:21 PM
18 points
1 comment, 12 min read, LW link

December 2022 updates and fundraising

AI Impacts, Dec 22, 2022, 5:20 PM
39 points
1 comment, 3 min read, LW link
(aiimpacts.org)

Covid 12/22/22: Reevaluating Past Options

Zvi, Dec 22, 2022, 4:50 PM
30 points
2 comments, 9 min read, LW link
(thezvi.wordpress.com)

China Covid #4

Zvi, Dec 22, 2022, 4:30 PM
50 points
2 comments, 11 min read, LW link
(thezvi.wordpress.com)

Racing through a minefield: the AI deployment problem

HoldenKarnofsky, Dec 22, 2022, 4:10 PM
38 points
2 comments, 13 min read, LW link
(www.cold-takes.com)

Lead in Chocolate?

jefftk, Dec 22, 2022, 4:10 PM
41 points
6 comments, 2 min read, LW link
(www.jefftk.com)