24 Feb 2024 23:09 UTC

17 points

0 comments · 1 min read · LW link

Cooperating with aliens and AGIs: An ECL explainer

Chi Nguyen, _will_ and Akash

24 Feb 2024 22:58 UTC

51 points

8 comments · 1 min read · LW link

Choosing My Quest (Part 2 of “The Sense Of Physical Necessity”)

LoganStrohl

24 Feb 2024 21:31 UTC

40 points

7 comments · 12 min read · LW link

Rationality Research Report: Towards 10x OODA Looping?

Raemon

24 Feb 2024 21:06 UTC

113 points

21 comments · 15 min read · LW link

Let’s ask some of the largest LLMs for tips and ideas on how to take over the world

Super AGI

24 Feb 2024 20:35 UTC

1 point

0 comments · 7 min read · LW link

Exercise: Planmaking, Surprise Anticipation, and “Baba is You”

Raemon

24 Feb 2024 20:33 UTC

48 points

19 comments · 6 min read · LW link

In search of God.

Spiritus Dei

24 Feb 2024 18:59 UTC

−19 points

3 comments · 7 min read · LW link

Impossibility of Anthropocentric-Alignment

False Name

24 Feb 2024 18:31 UTC

−8 points

2 comments · 39 min read · LW link

The Inner Alignment Problem

Jakub Halmeš

24 Feb 2024 17:55 UTC

1 point

1 comment · 3 min read · LW link

(jakubhalmes.substack.com)

We Need Major, But Not Radical, FDA Reform

Maxwell Tabarrok

24 Feb 2024 16:54 UTC

42 points

12 comments · 7 min read · LW link

(www.maximum-progress.com)

After Overmorrow: Scattered Musings on the Immediate Post-AGI World

Yuli_Ban

24 Feb 2024 15:49 UTC

−3 points

0 comments · 26 min read · LW link

[Question] CDT vs. EDT on Deterrence

notfnofn

24 Feb 2024 15:41 UTC

1 point

9 comments · 1 min read · LW link

Balancing Games

jefftk

24 Feb 2024 14:40 UTC

61 points

18 comments · 1 min read · LW link

(www.jefftk.com)

How well do truth probes generalise?

mishajw

24 Feb 2024 14:12 UTC

87 points

11 comments · 9 min read · LW link

Rawls’s Veil of Ignorance Doesn’t Make Any Sense

Arjun Panickssery

24 Feb 2024 13:18 UTC

10 points

9 comments · 1 min read · LW link

[Question] Can someone explain to me what went wrong with ChatGPT?

Valentin Baltadzhiev

24 Feb 2024 11:50 UTC

9 points

1 comment · 1 min read · LW link

The Sense Of Physical Necessity: A Naturalism Demo (Introduction)

LoganStrohl

24 Feb 2024 2:56 UTC

59 points

1 comment · 6 min read · LW link

Instrumental deception and manipulation in LLMs—a case study

Olli Järviniemi

24 Feb 2024 2:07 UTC

39 points

13 comments · 12 min read · LW link

A starting point for making sense of task structure (in machine learning)

Kaarel, RP and jake_mendel

24 Feb 2024 1:51 UTC

45 points

2 comments · 12 min read · LW link

Why you, personally, should want a larger human population

jasoncrawford

23 Feb 2024 19:48 UTC

32 points

32 comments · 5 min read · LW link

(rootsofprogress.org)

Deliberative Cognitive Algorithms as Scaffolding

Cole Wyeth

23 Feb 2024 17:15 UTC

19 points

4 comments · 3 min read · LW link

The Shutdown Problem: Incomplete Preferences as a Solution

EJT

23 Feb 2024 16:01 UTC

52 points

29 comments · 42 min read · LW link

In set theory, everything is a set

Jacob G-W

23 Feb 2024 14:35 UTC

11 points

9 comments · 2 min read · LW link

The role of philosophical thinking in understanding large language models: Calibrating and closing the gap between first-person experience and underlying mechanisms

Bill Benzon

23 Feb 2024 12:19 UTC

4 points

0 comments · 10 min read · LW link

Deep and obvious points in the gap between your thoughts and your pictures of thought

KatjaGrace

23 Feb 2024 7:30 UTC

42 points

6 comments · 1 min read · LW link

(worldspiritsockpuppet.com)

Parasocial relationship logic

KatjaGrace

23 Feb 2024 7:30 UTC

20 points

1 comment · 1 min read · LW link

(worldspiritsockpuppet.com)

Shaming with and without naming

KatjaGrace

23 Feb 2024 7:30 UTC

15 points

5 comments · 2 min read · LW link

(worldspiritsockpuppet.com)

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.

Chi Nguyen

23 Feb 2024 6:10 UTC

54 points

18 comments · 1 min read · LW link

[Question] Does increasing the power of a multimodal LLM get you an agentic AI?

yanni kyriacos

23 Feb 2024 4:14 UTC

3 points

3 comments · 1 min read · LW link

The natural boundaries between people

Chipmonk

23 Feb 2024 1:09 UTC

23 points

2 comments · 8 min read · LW link

(chipmonk.substack.com)

Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”

Ricki Heicklen

22 Feb 2024 23:56 UTC

186 points

5 comments · 4 min read · LW link

(bayesshammai.substack.com)

AI #52: Oops

Zvi

22 Feb 2024 21:50 UTC

50 points

9 comments · 29 min read · LW link

(thezvi.wordpress.com)

Embed your second brain in your first brain

dkl9

22 Feb 2024 21:46 UTC

10 points

3 comments · 1 min read · LW link

(dkl9.net)

The Gemini Incident

Zvi

22 Feb 2024 21:00 UTC

80 points

19 comments · 18 min read · LW link

(thezvi.wordpress.com)

Some Thoughts On Using Auctions For Land Valuation

harsimony

22 Feb 2024 19:54 UTC

0 points

9 comments · 9 min read · LW link

(progressandpoverty.substack.com)

The Binding of Isaac & Transparent Newcomb’s Problem

suvjectibity

22 Feb 2024 18:56 UTC

−11 points

0 comments · 10 min read · LW link

Language Models Don’t Learn the Physical Manifestation of Language

Bruce W. Lee and Jaehyuk Lim

22 Feb 2024 18:52 UTC

39 points

23 comments · 1 min read · LW link

(arxiv.org)

Sora What

Zvi

22 Feb 2024 18:10 UTC

47 points

3 comments · 9 min read · LW link

(thezvi.wordpress.com)

Do sparse autoencoders find “true features”?

Demian Till

22 Feb 2024 18:06 UTC

73 points

33 comments · 11 min read · LW link

Everything Wrong with Roko’s Claims about an Engineered Pandemic

WitheringWeights

22 Feb 2024 15:59 UTC

92 points

10 comments · 16 min read · LW link

The One and a Half Gemini

Zvi

22 Feb 2024 13:10 UTC

73 points

4 comments · 8 min read · LW link

(thezvi.wordpress.com)

[Question] How do I make predictions about the future to make sense of what to do with my life?

Raj Thimmiah

22 Feb 2024 11:22 UTC

8 points

1 comment · 1 min read · LW link

How are voluntary commitments on vulnerability reporting going?

Adam Jones

22 Feb 2024 8:43 UTC

23 points

1 comment · 1 min read · LW link

(adamjones.me)

Notes on Internal Objectives in Toy Models of Agents

Paul Colognese

22 Feb 2024 8:02 UTC

16 points

0 comments · 8 min read · LW link

The Byronic Hero Always Loses

Cole Wyeth

22 Feb 2024 1:31 UTC

31 points

4 comments · 2 min read · LW link

Job Listing: Managing Editor / Writer

Gretta Duleba

21 Feb 2024 23:41 UTC

43 points

2 comments · 1 min read · LW link

The Pareto Best and the Curse of Doom

Screwtape

21 Feb 2024 23:10 UTC

113 points

21 comments · 9 min read · LW link

AISN #31: A New AI Policy Bill in California Plus, Precedents for AI Governance and The EU AI Office

aogara and Dan H

21 Feb 2024 21:58 UTC

17 points

0 comments · 6 min read · LW link

(newsletter.safe.ai)

Analogies between scaling labs and misaligned superintelligent AI

scasper

21 Feb 2024 19:29 UTC

75 points

5 comments · 4 min read · LW link

Extinction Risks from AI: Invisible to Science?

VojtaKovarik, Chris van Merwijk and Ida Mattsson

21 Feb 2024 18:07 UTC

24 points

7 comments · 1 min read · LW link

(arxiv.org)