Timaeus is hiring!

12 Jul 2024 23:42 UTC
67 points
6 comments · 2 min read · LW link

Consider attending the AI Security Forum ’24, a 1-day pre-DEFCON event

Charlie Rogers-Smith · 12 Jul 2024 23:01 UTC
21 points
0 comments · 1 min read · LW link

Memorising molecular structures

dkl9 · 12 Jul 2024 22:40 UTC
6 points
0 comments · 2 min read · LW link
(dkl9.net)

Robin Hanson AI X-Risk Debate — Highlights and Analysis

Liron · 12 Jul 2024 21:31 UTC
46 points
7 comments · 45 min read · LW link
(www.youtube.com)

Designing Artificial Wisdom: The Wise Workflow Research Organization

Jordan Arel · 12 Jul 2024 19:18 UTC
2 points
0 comments · 8 min read · LW link

Whiteboard Pen Magazines are Useful

Johannes C. Mayer · 12 Jul 2024 17:15 UTC
40 points
8 comments · 1 min read · LW link

Alignment: “Do what I would have wanted you to do”

Oleg Trott · 12 Jul 2024 16:47 UTC
11 points
48 comments · 1 min read · LW link

Virtue taxation

Dentosal · 12 Jul 2024 14:56 UTC
9 points
1 comment · 2 min read · LW link

Most smart and skilled people are outside of the EA/rationalist community: an analysis

titotal · 12 Jul 2024 12:13 UTC
107 points
36 comments · 1 min read · LW link
(open.substack.com)

2024 Freedom Communities Events

Tudor Iliescu · 12 Jul 2024 8:04 UTC
−6 points
1 comment · 1 min read · LW link

Faithful vs Interpretable Sparse Autoencoder Evals

Louka Ewington-Pitsos · 12 Jul 2024 5:37 UTC
2 points
0 comments · 12 min read · LW link

Moving away from physical continuity

ProgramCrafter · 12 Jul 2024 5:05 UTC
2 points
1 comment · 1 min read · LW link

Transformer Circuit Faithfulness Metrics Are Not Robust

12 Jul 2024 3:47 UTC
104 points
5 comments · 7 min read · LW link
(arxiv.org)

On Artificial Wisdom

Jordan Arel · 12 Jul 2024 0:20 UTC
3 points
0 comments · 14 min read · LW link

Yoshua Bengio: Reasoning through arguments against taking AI safety seriously

Judd Rosenblatt · 11 Jul 2024 23:53 UTC
70 points
3 comments · 1 min read · LW link
(yoshuabengio.org)

Podcast: “How the Smart Money teaches trading with Ricki Heicklen” (Patrick McKenzie interviewing)

rossry · 11 Jul 2024 22:49 UTC
20 points
2 comments · 1 min read · LW link
(www.complexsystemspodcast.com)

Superbabies: Putting The Pieces Together

sarahconstantin · 11 Jul 2024 20:40 UTC
215 points
37 comments · 10 min read · LW link
(sarahconstantin.substack.com)

Sherlockian Abduction Master List

Cole Wyeth · 11 Jul 2024 20:27 UTC
50 points
63 comments · 33 min read · LW link

Thoughts to niplav on lie-detection, truthfwl mechanisms, and wealth-inequality

11 Jul 2024 18:55 UTC
7 points
8 comments · 11 min read · LW link

Games for AI Control

11 Jul 2024 18:40 UTC
43 points
0 comments · 5 min read · LW link

Video Intro to Guaranteed Safe AI

11 Jul 2024 17:53 UTC
27 points
0 comments · 1 min read · LW link
(youtu.be)

Effective Empathy

Thac0 · 11 Jul 2024 15:14 UTC
4 points
1 comment · 1 min read · LW link

AI #72: Denying the Future

Zvi · 11 Jul 2024 15:00 UTC
45 points
8 comments · 41 min read · LW link
(thezvi.wordpress.com)

The Best Bits From Build, Baby, Build

Maxwell Tabarrok · 11 Jul 2024 14:09 UTC
13 points
0 comments · 4 min read · LW link
(www.maximum-progress.com)

[Question] What Other Lines of Work are Safe from AI Automation?

RogerDearnaley · 11 Jul 2024 10:01 UTC
29 points
35 comments · 5 min read · LW link

Decomposing Agency — capabilities without desires

11 Jul 2024 9:38 UTC
146 points
32 comments · 12 min read · LW link
(strangecities.substack.com)

Reliable Sources: The Story of David Gerard

TracingWoodgrains · 10 Jul 2024 19:50 UTC
381 points
53 comments · 43 min read · LW link

Managing Emotional Potential Energy

adamShimi · 10 Jul 2024 18:20 UTC
23 points
4 comments · 4 min read · LW link
(epistemologicalfascinations.substack.com)

[EAForum xpost] A breakdown of OpenAI’s revenue

10 Jul 2024 18:09 UTC
57 points
5 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Solving Pascal’s Wager using dynamic programming

Paul Wilczewski · 10 Jul 2024 18:09 UTC
1 point
0 comments · 5 min read · LW link

Fluent, Cruxy Predictions

Raemon · 10 Jul 2024 18:00 UTC
85 points
14 comments · 14 min read · LW link

Antitrust as Controlled Creative Destruction

Martin Sustrik · 10 Jul 2024 16:40 UTC
14 points
2 comments · 2 min read · LW link
(250bpm.substack.com)

New page: Integrity

Zach Stein-Perlman · 10 Jul 2024 15:00 UTC
91 points
3 comments · 1 min read · LW link

AirBnB Baking

jefftk · 10 Jul 2024 12:50 UTC
7 points
1 comment · 1 min read · LW link
(www.jefftk.com)

DIY RLHF: A simple implementation for hands on experience

10 Jul 2024 12:07 UTC
28 points
0 comments · 6 min read · LW link

Usefulness grounds truth

invertedpassion · 10 Jul 2024 7:58 UTC
0 points
0 comments · 4 min read · LW link

On passing Complete and Honest Ideological Turing Tests (CHITTs)

Aryeh Englander · 10 Jul 2024 4:01 UTC
11 points
2 comments · 1 min read · LW link

[Question] Pondering how good or bad things will be in the AGI future

Sherrinford · 9 Jul 2024 22:46 UTC
11 points
9 comments · 2 min read · LW link

Causal Graphs of GPT-2-Small’s Residual Stream

David Udell · 9 Jul 2024 22:06 UTC
53 points
7 comments · 7 min read · LW link

[Question] If AI starts to end the world, is suicide a good idea?

IlluminateReality · 9 Jul 2024 21:53 UTC
0 points
8 comments · 1 min read · LW link

Rationalist Purity Test

Gunnar_Zarncke · 9 Jul 2024 20:30 UTC
−9 points
5 comments · 1 min read · LW link
(ratpuritytest.com)

That which can be destroyed by the truth, should be assumed to should be destroyed by it

Thac0 · 9 Jul 2024 19:39 UTC
5 points
0 comments · 3 min read · LW link

AISN #38: Supreme Court Decision Could Limit Federal Ability to Regulate AI Plus, “Circuit Breakers” for AI systems, and updates on China’s AI industry

9 Jul 2024 19:28 UTC
5 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

Summer Tour Stops

jefftk · 9 Jul 2024 19:10 UTC
10 points
0 comments · 3 min read · LW link
(www.jefftk.com)

Fix simple mistakes in ARC-AGI, etc.

Oleg Trott · 9 Jul 2024 17:46 UTC
9 points
9 comments · 1 min read · LW link

Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers

Jeffrey Heninger · 9 Jul 2024 16:50 UTC
42 points
2 comments · 2 min read · LW link
(blog.aiimpacts.org)

UC Berkeley course on LLMs and ML Safety

Dan H · 9 Jul 2024 15:40 UTC
36 points
1 comment · 1 min read · LW link
(rdi.berkeley.edu)

What and Why: Developmental Interpretability of Reinforcement Learning

Garrett Baker · 9 Jul 2024 14:09 UTC
67 points
4 comments · 6 min read · LW link

Medical Roundup #3

Zvi · 9 Jul 2024 13:10 UTC
39 points
4 comments · 19 min read · LW link
(thezvi.wordpress.com)

Consent across power differentials

Ramana Kumar · 9 Jul 2024 11:42 UTC
50 points
12 comments · 3 min read · LW link