Some miscellaneous thoughts on ChatGPT, stories, and mechanical interpretability

Bill Benzon 4 Feb 2023 19:35 UTC

2 points

0 comments 3 min read LW link

O(“AGI Safety”)>O(“Stop Tyrants”)

AnthonyRepetto 4 Feb 2023 18:38 UTC

−4 points

11 comments 1 min read LW link

Monthly Doom Argument Threads? Doom Argument Wiki?

LVSN 4 Feb 2023 16:59 UTC

3 points

0 comments 1 min read LW link

The Future of Structured Self Improvement

Evenflair 4 Feb 2023 16:02 UTC

27 points

4 comments 1 min read LW link

(guildoftherose.org)

Empathy as a natural consequence of learnt reward models

beren 4 Feb 2023 15:35 UTC

46 points

26 comments 13 min read LW link

Mech Interp Project Advising Call: Memorisation in GPT-2 Small

Neel Nanda 4 Feb 2023 14:17 UTC

7 points

0 comments 1 min read LW link

Do IQ tests measure intelligence? - A prediction market on my future beliefs about the topic

tailcalled 4 Feb 2023 11:19 UTC

1 point

10 comments 1 min read LW link

(manifold.markets)

AXRP Episode 19 - Mechanistic Interpretability with Neel Nanda

DanielFilan 4 Feb 2023 3:00 UTC

45 points

0 comments 117 min read LW link

The 2/3 rule for multi-factor authentication

RomanHauksson 4 Feb 2023 2:57 UTC

4 points

0 comments 1 min read LW link

(roman.computer)

Path-Dependence in ChatGPT’s Political Outputs

lsusr 4 Feb 2023 2:02 UTC

28 points

4 comments 4 min read LW link

Fucking Goddamn Basics of Rationalist Discourse

LoganStrohl 4 Feb 2023 1:47 UTC

321 points

103 comments 1 min read LW link 3 reviews

Small Talk is Good, Actually

Gordon Seidoh Worley 4 Feb 2023 0:38 UTC

51 points

9 comments 3 min read LW link

Update on Book Review Dominant Assurance Contract

Arjun Panickssery 3 Feb 2023 23:16 UTC

9 points

0 comments 1 min read LW link

[Question] 2+2=π√2+n

Logan Zoellner 3 Feb 2023 22:27 UTC

16 points

15 comments 1 min read LW link

[Question] If I encounter a capabilities paper that kinda spooks me, what should I do with it?

the gears to ascension 3 Feb 2023 21:37 UTC

28 points

8 comments 1 min read LW link

[Question] What Are The Preconditions/Prerequisites for Asymptotic Analysis?

DragonGod 3 Feb 2023 21:26 UTC

8 points

2 comments 1 min read LW link

[Linkpost] Google invested $300M in Anthropic in late 2022

Akash 3 Feb 2023 19:13 UTC

73 points

14 comments 1 min read LW link

(www.ft.com)

Many AI governance proposals have a tradeoff between usefulness and feasibility

Akash and Carson Ezell

3 Feb 2023 18:49 UTC

22 points

2 comments 2 min read LW link

Reply to Duncan Sabien on Strawmanning

Zack_M_Davis 3 Feb 2023 17:57 UTC

42 points

11 comments 4 min read LW link

Semi-rare plain language words that are great to remember

LVSN 3 Feb 2023 16:33 UTC

4 points

7 comments 1 min read LW link

[Question] What qualities does an AGI need to have to realize the risk of false vacuum, without hardcoding physics theories into it?

RationalSieve 3 Feb 2023 16:00 UTC

1 point

4 comments 1 min read LW link

Housing and Transit Roundup #3

Zvi 3 Feb 2023 15:10 UTC

21 points

6 comments 16 min read LW link

(thezvi.wordpress.com)

Taboo P(doom)

NathanBarnard 3 Feb 2023 10:37 UTC

14 points

10 comments 1 min read LW link

ChatGPT: Tantalizing afterthoughts in search of story trajectories [induction heads]

Bill Benzon 3 Feb 2023 10:35 UTC

4 points

0 comments 20 min read LW link

Jordan Peterson: Guru/Villain

Bryan Frances 3 Feb 2023 9:02 UTC

−14 points

6 comments 9 min read LW link

[Question] What is the risk of asking a counterfactual oracle a question that already had its answer erased?

Chris_Leong 3 Feb 2023 3:13 UTC

7 points

0 comments 1 min read LW link

I don’t think MIRI “gave up”

Raemon 3 Feb 2023 0:26 UTC

106 points

64 comments 4 min read LW link

What fact that you know is true but most people aren’t ready to accept it?

lorepieri 3 Feb 2023 0:06 UTC

47 points

210 comments 1 min read LW link

[Question] Monotonous Work

Gideon Bauer 2 Feb 2023 21:35 UTC

1 point

0 comments 1 min read LW link

Is AI risk assessment too anthropocentric?

Craig Mattson 2 Feb 2023 21:34 UTC

3 points

6 comments 1 min read LW link

Halifax Monthly Meetup: Introduction to Effective Altruism

Ideopunk 2 Feb 2023 21:10 UTC

10 points

0 comments 1 min read LW link

Conditioning Predictive Models: Outer alignment via careful conditioning

evhub, Adam Jermyn, Johannes Treutlein, Rubi J. Hudson and kcwoolverton

2 Feb 2023 20:28 UTC

72 points

15 comments 57 min read LW link

Conditioning Predictive Models: Large language models as predictors

evhub, Adam Jermyn, Johannes Treutlein, Rubi J. Hudson and kcwoolverton

2 Feb 2023 20:28 UTC

88 points

4 comments 13 min read LW link

Normative vs Descriptive Models of Agency

mattmacdermott 2 Feb 2023 20:28 UTC

26 points

5 comments 4 min read LW link

Andrew Huberman on How to Optimize Sleep

Leon Lang 2 Feb 2023 20:17 UTC

37 points

6 comments 6 min read LW link

[Question] How can I help inflammation-based nerve damage be temporary?

Optimization Process 2 Feb 2023 19:20 UTC

17 points

4 comments 1 min read LW link

More findings on maximal data dimension

Marius Hobbhahn 2 Feb 2023 18:33 UTC

27 points

1 comment 11 min read LW link

Heritability, Behaviorism, and Within-Lifetime RL

Steven Byrnes 2 Feb 2023 16:34 UTC

39 points

3 comments 4 min read LW link

Covid 2/2/23: The Emergency Ends on 5/11

Zvi 2 Feb 2023 14:00 UTC

22 points

6 comments 7 min read LW link

(thezvi.wordpress.com)

You are probably not a good alignment researcher, and other blatant lies

junk heap homotopy 2 Feb 2023 13:55 UTC

83 points

16 comments 2 min read LW link

Don’t Judge a Tool by its Average Output

silentbob 2 Feb 2023 13:42 UTC

11 points

2 comments 4 min read LW link

Epoch Impact Report 2022

Jsevillamol 2 Feb 2023 13:09 UTC

16 points

0 comments 1 min read LW link

You Don’t Exist, Duncan

Duncan Sabien (Deactivated) 2 Feb 2023 8:37 UTC

247 points

107 comments 9 min read LW link

Temporally Layered Architecture for Adaptive, Distributed and Continuous Control

Roman Leventov 2 Feb 2023 6:29 UTC

6 points

4 comments 1 min read LW link

(arxiv.org)

Research agenda: Formalizing abstractions of computations

Erik Jenner 2 Feb 2023 4:29 UTC

92 points

10 comments 31 min read LW link

Progress links and tweets, 2023-02-01

jasoncrawford 2 Feb 2023 2:25 UTC

10 points

0 comments 1 min read LW link

(rootsofprogress.org)

Retrospective on the AI Safety Field Building Hub

Vael Gates 2 Feb 2023 2:06 UTC

30 points

0 comments 1 min read LW link

How to export Android Chrome tabs to an HTML file in Linux (as of February 2023)

Adam Scherlis 2 Feb 2023 2:03 UTC

7 points

3 comments 2 min read LW link

(adam.scherlis.com)

Hacked Account Spam

jefftk 2 Feb 2023 1:50 UTC

13 points

5 comments 1 min read LW link

(www.jefftk.com)

A simple technique to reduce negative rumination

cranberry_bear 2 Feb 2023 1:33 UTC

9 points

0 comments 1 min read LW link