Linking Alt Accounts

jefftk

6 Oct 2023 17:00 UTC

70 points

33 comments · 1 min read · LW link

(www.jefftk.com)

Super-Exponential versus Exponential Growth in Compute Price-Performance

moridinamael

6 Oct 2023 16:23 UTC

37 points

25 comments · 2 min read · LW link

A personal explanation of ELK concept and task.

Zeyu Qin

6 Oct 2023 3:55 UTC

1 point

0 comments · 1 min read · LW link

The Long-Term Future Fund is looking for a full-time fund chair

Linch, calebp99 and abergal

5 Oct 2023 22:18 UTC

52 points

0 comments · 7 min read · LW link

(forum.effectivealtruism.org)

Provably Safe AI

PeterMcCluskey

5 Oct 2023 22:18 UTC

33 points

15 comments · 4 min read · LW link

(bayesianinvestor.com)

Stampy’s AI Safety Info soft launch

steven0461 and Robert Miles

5 Oct 2023 22:13 UTC

120 points

9 comments · 2 min read · LW link

Impacts of AI on the housing markets

PottedRosePetal

5 Oct 2023 21:24 UTC

8 points

0 comments · 5 min read · LW link

Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

Zac Hatfield-Dodds

5 Oct 2023 21:01 UTC

287 points

21 comments · 2 min read · LW link

(transformer-circuits.pub)

Ideation and Trajectory Modelling in Language Models

NickyP

5 Oct 2023 19:21 UTC

16 points

2 comments · 10 min read · LW link

A well-defined history in measurable factor spaces

Matthias G. Mayer

5 Oct 2023 18:36 UTC

22 points

0 comments · 2 min read · LW link

Evaluating the historical value misspecification argument

Matthew Barnett

5 Oct 2023 18:34 UTC

171 points

143 comments · 7 min read · LW link

Translations Should Invert

abramdemski

5 Oct 2023 17:44 UTC

48 points

19 comments · 3 min read · LW link

Censorship in LLMs is here to stay because it mirrors how our own intelligence is structured

mnvr

5 Oct 2023 17:37 UTC

3 points

0 comments · 1 min read · LW link

Twin Cities ACX Meetup October 2023

Timothy M.

5 Oct 2023 16:29 UTC

1 point

2 comments · 1 min read · LW link

This anime storyboard doesn’t exist: a graphic novel written and illustrated by GPT4

RomanS

5 Oct 2023 14:01 UTC

12 points

7 comments · 55 min read · LW link

AI #32: Lie Detector

Zvi

5 Oct 2023 13:50 UTC

45 points

19 comments · 44 min read · LW link

(thezvi.wordpress.com)

Can the House Legislate?

jefftk

5 Oct 2023 13:40 UTC

26 points

6 comments · 2 min read · LW link

(www.jefftk.com)

Making progress on the “what alignment target should be aimed at?” question, is urgent

ThomasCederborg

5 Oct 2023 12:55 UTC

2 points

0 comments · 18 min read · LW link

Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn

Zvi

5 Oct 2023 11:39 UTC

129 points

29 comments · 9 min read · LW link

How to Get Rationalist Feedback

Nicholas / Heather Kross

5 Oct 2023 2:03 UTC

13 points

0 comments · 2 min read · LW link

On my AI Fable, and the importance of de re, de dicto, and de se reference for AI alignment

PhilGoetz

5 Oct 2023 0:50 UTC

9 points

5 comments · 1 min read · LW link

Underspecified Probabilities: A Thought Experiment

lunatic_at_large

4 Oct 2023 22:25 UTC

8 points

4 comments · 2 min read · LW link

Fraternal Birth Order Effect and the Maternal Immune Hypothesis

Bucky

4 Oct 2023 21:18 UTC

20 points

1 comment · 2 min read · LW link

How to solve deception and still fail.

Charlie Steiner

4 Oct 2023 19:56 UTC

40 points

7 comments · 6 min read · LW link

PortAudio M1 Latency

jefftk

4 Oct 2023 19:10 UTC

8 points

5 comments · 1 min read · LW link

(www.jefftk.com)

Open Philanthropy is hiring for multiple roles across our Global Catastrophic Risks teams

aarongertler

4 Oct 2023 18:04 UTC

6 points

0 comments · 3 min read · LW link

(forum.effectivealtruism.org)

Safeguarding Humanity: Ensuring AI Remains a Servant, Not a Master

kgldeshapriya

4 Oct 2023 17:52 UTC

−20 points

2 comments · 2 min read · LW link

The 5 Pillars of Happiness

Gabi QUENE

4 Oct 2023 17:50 UTC

−24 points

5 comments · 5 min read · LW link

[Question] Using Reinforcement Learning to try to control the heating of a building (district heating)

Tony Karlsson

4 Oct 2023 17:47 UTC

3 points

5 comments · 1 min read · LW link

rationalistic probability(litterally just throwing shit out there)

NotaSprayer ASprayer

4 Oct 2023 17:46 UTC

−30 points

8 comments · 2 min read · LW link

AISN #23: New OpenAI Models, News from Anthropic, and Representation Engineering

aogara and Dan H

4 Oct 2023 17:37 UTC

15 points

2 comments · 5 min read · LW link

(newsletter.safe.ai)

I don’t find the lie detection results that surprising (by an author of the paper)

JanB

4 Oct 2023 17:10 UTC

97 points

8 comments · 3 min read · LW link

[Question] What evidence is there of LLM’s containing world models?

Chris_Leong

4 Oct 2023 14:33 UTC

17 points

17 comments · 1 min read · LW link

Entanglement and intuition about words and meaning

Bill Benzon

4 Oct 2023 14:16 UTC

4 points

0 comments · 2 min read · LW link

Why a Mars colony would lead to a first strike situation

Remmelt

4 Oct 2023 11:29 UTC

−59 points

8 comments · 1 min read · LW link

(mflb.com)

[Question] What are some examples of AIs instantiating the ‘nearest unblocked strategy problem’?

EJT

4 Oct 2023 11:05 UTC

6 points

4 comments · 1 min read · LW link

Graphical tensor notation for interpretability

Jordan Taylor

4 Oct 2023 8:04 UTC

137 points

11 comments · 19 min read · LW link

[Link] Bay Area Winter Solstice 2023

tcheasdfjkl and TheSkeward

4 Oct 2023 2:19 UTC

18 points

3 comments · 1 min read · LW link

(fb.me)

[Question] Who determines whether an alignment proposal is the definitive alignment solution?

MiguelDev

3 Oct 2023 22:39 UTC

−1 points

6 comments · 1 min read · LW link

AXRP Episode 25 - Cooperative AI with Caspar Oesterheld

DanielFilan

3 Oct 2023 21:50 UTC

43 points

0 comments · 92 min read · LW link

When to Get the Booster?

jefftk

3 Oct 2023 21:00 UTC

50 points

15 comments · 2 min read · LW link

(www.jefftk.com)

OpenAI-Microsoft partnership

Zach Stein-Perlman

3 Oct 2023 20:01 UTC

51 points

19 comments · 1 min read · LW link

[Question] Current AI safety techniques?

Zach Stein-Perlman

3 Oct 2023 19:30 UTC

30 points

2 comments · 2 min read · LW link

Testing and Automation for Intelligent Systems.

Sai Kiran Kammari

3 Oct 2023 17:51 UTC

−13 points

0 comments · 1 min read · LW link

(resource-cms.springernature.com)

Metaculus Announces Forecasting Tournament to Evaluate Focused Research Organizations, in Partnership With the Federation of American Scientists

ChristianWilliams

3 Oct 2023 16:44 UTC

13 points

0 comments · 1 min read · LW link

(www.metaculus.com)

What would it mean to understand how a large language model (LLM) works? Some quick notes.

Bill Benzon

3 Oct 2023 15:11 UTC

20 points

4 comments · 8 min read · LW link

[Question] Potential alignment targets for a sovereign superintelligent AI

Paul Colognese

3 Oct 2023 15:09 UTC

29 points

4 comments · 1 min read · LW link

Monthly Roundup #11: October 2023

Zvi

3 Oct 2023 14:10 UTC

42 points

12 comments · 35 min read · LW link

(thezvi.wordpress.com)

Why We Use Money? - A Walrasian View

Savio Coelho

3 Oct 2023 12:02 UTC

4 points

3 comments · 8 min read · LW link

Mech Interp Challenge: October—Deciphering the Sorted List Model

CallumMcDougall

3 Oct 2023 10:57 UTC

23 points

0 comments · 3 min read · LW link