Attention-Feature Tables in Gemma 2 Residual Streams

J Bostock
6 Aug 2024 22:56 UTC
2 points
0 comments
14 min read
LW link

[Question] What are the strategic implications if aliens and Earth civilizations produce similar utilities?

Maxime Riché
6 Aug 2024 21:16 UTC
4 points
1 comment
1 min read
LW link

WTH is Cerebrolysin, actually?

6 Aug 2024 20:40 UTC
175 points
23 comments
17 min read
LW link

The Pragmatic Side of Cryptographically Boxing AI

Bart Jaworski
6 Aug 2024 17:46 UTC
6 points
0 comments
9 min read
LW link

Inference-Only Debate Experiments Using Math Problems

6 Aug 2024 17:44 UTC
31 points
0 comments
2 min read
LW link

[Question] Is an AI religion justified?

p4rziv4l
6 Aug 2024 15:42 UTC
−35 points
11 comments
1 min read
LW link

Startup Roundup #2

Zvi
6 Aug 2024 13:30 UTC
45 points
0 comments
32 min read
LW link
(thezvi.wordpress.com)

Mechanistic Anomaly Detection Research Update

6 Aug 2024 10:33 UTC
11 points
0 comments
1 min read
LW link
(blog.eleuther.ai)

Reasoning is not search—a chess example

p.b.
6 Aug 2024 9:29 UTC
4 points
3 comments
2 min read
LW link

Broadly human level, cognitively complete AGI

p.b.
6 Aug 2024 9:26 UTC
7 points
0 comments
1 min read
LW link

Does Evolutionary Theory Imply Genetic Tribalism?

Zero Contradictions
6 Aug 2024 5:43 UTC
0 points
1 comment
1 min read
LW link
(thewaywardaxolotl.blogspot.com)

How I Learned To Stop Trusting Prediction Markets and Love the Arbitrage

orthonormal
6 Aug 2024 2:32 UTC
197 points
30 comments
3 min read
LW link

John Schulman leaves OpenAI for Anthropic

Sodium
6 Aug 2024 1:23 UTC
57 points
0 comments
1 min read
LW link

Self-explaining SAE features

5 Aug 2024 22:20 UTC
60 points
13 comments
10 min read
LW link

Value fragility and AI takeover

Joe Carlsmith
5 Aug 2024 21:28 UTC
76 points
5 comments
30 min read
LW link

Excursions into Sparse Autoencoders: What is monosemanticity?

Jakub Smékal
5 Aug 2024 19:22 UTC
2 points
0 comments
10 min read
LW link

Madrid—ACX Meetups Everywhere Fall 2024

Pablo Villalobos
5 Aug 2024 18:36 UTC
4 points
0 comments
1 min read
LW link

LLMs stifle creativity, eliminate opportunities for serendipitous discovery and disrupt intergenerational transfer of wisdom

Ghdz
5 Aug 2024 18:27 UTC
6 points
2 comments
7 min read
LW link

Circular Reasoning

abramdemski
5 Aug 2024 18:10 UTC
91 points
37 comments
8 min read
LW link

Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours

Seth Herd
5 Aug 2024 15:38 UTC
65 points
22 comments
5 min read
LW link

Four Phases of AGI

Gabe M
5 Aug 2024 13:15 UTC
11 points
3 comments
13 min read
LW link

AI Safety at the Frontier: Paper Highlights, July ’24

gasteigerjo
5 Aug 2024 13:00 UTC
8 points
0 comments
7 min read
LW link
(aisafetyfrontier.substack.com)

Game Theory and Society

Zero Contradictions
5 Aug 2024 4:27 UTC
4 points
0 comments
1 min read
LW link
(thewaywardaxolotl.blogspot.com)

Near-mode thinking on AI

Olli Järviniemi
4 Aug 2024 20:47 UTC
128 points
8 comments
5 min read
LW link

Watermarks: Signing, Branding, and Boobytrapping

Shankar Sivarajan
4 Aug 2024 20:41 UTC
2 points
0 comments
1 min read
LW link

Modelling Social Exchange: A Systematised Method to Judge Friendship Quality

Wynn Walker
4 Aug 2024 18:49 UTC
6 points
0 comments
5 min read
LW link

We’re not as 3-Dimensional as We Think

silentbob
4 Aug 2024 14:39 UTC
37 points
16 comments
5 min read
LW link

You don’t know how bad most things are nor precisely how they’re bad.

Solenoid_Entity
4 Aug 2024 14:12 UTC
317 points
48 comments
5 min read
LW link

Can We Predict Persuasiveness Better Than Anthropic?

Lennart Finke
4 Aug 2024 14:05 UTC
22 points
5 comments
4 min read
LW link

[Question] What should we do about COVID in 2024?

ChristianKl
4 Aug 2024 10:57 UTC
20 points
2 comments
1 min read
LW link

Tokenized SAEs: Infusing per-token biases.

4 Aug 2024 9:17 UTC
19 points
20 comments
15 min read
LW link

Thoughts On Democracy

Zero Contradictions
4 Aug 2024 6:02 UTC
2 points
0 comments
1 min read
LW link
(zerocontradictions.net)

AI Alignment through Comparative Advantage

artemiocobb
4 Aug 2024 0:32 UTC
−2 points
4 comments
3 min read
LW link

Labelling, Variables, and In-Context Learning in Llama2

Joshua Penman
3 Aug 2024 19:36 UTC
6 points
0 comments
1 min read
LW link
(colab.research.google.com)

[Question] Dan Hendrycks and EA

jeffreycaruso
3 Aug 2024 13:33 UTC
−4 points
4 comments
1 min read
LW link

[Question] Why do Minimal Bayes Nets often correspond to Causal Models of Reality?

Dalcy
3 Aug 2024 12:39 UTC
27 points
1 comment
1 min read
LW link

Why did ChatGPT say that? Prompt engineering and more, with PIZZA.

Jessica Rumbelow
3 Aug 2024 12:07 UTC
40 points
2 comments
4 min read
LW link

Cooperation and Alignment in Delegation Games: You Need Both!

3 Aug 2024 10:16 UTC
8 points
0 comments
14 min read
LW link
(www.oliversourbut.net)

SRE’s review of Democracy

Martin Sustrik
3 Aug 2024 7:20 UTC
48 points
2 comments
3 min read
LW link
(250bpm.substack.com)

The Case Against Libertarianism

Zero Contradictions
3 Aug 2024 5:05 UTC
−4 points
1 comment
1 min read
LW link
(zerocontradictions.net)

We Don’t Just Let People Die—So What Next?

James Stephen Brown
3 Aug 2024 1:04 UTC
11 points
8 comments
10 min read
LW link

The EA case for Trump

Judd Rosenblatt
3 Aug 2024 1:00 UTC
9 points
1 comment
1 min read
LW link
(www.secondbest.ca)

I didn’t think I’d take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is!

mako yass
2 Aug 2024 22:35 UTC
24 points
2 comments
5 min read
LW link

Evaluating Sparse Autoencoders with Board Game Models

2 Aug 2024 19:50 UTC
38 points
1 comment
9 min read
LW link

The Bitter Lesson for AI Safety Research

2 Aug 2024 18:39 UTC
57 points
5 comments
3 min read
LW link

Ethical Deception: Should AI Ever Lie?

Jason Reid
2 Aug 2024 17:53 UTC
5 points
2 comments
7 min read
LW link

[Question] Request for AI risk quotes, especially around speed, large impacts and black boxes

Nathan Young
2 Aug 2024 17:49 UTC
6 points
0 comments
1 min read
LW link

A Simple Toy Coherence Theorem

2 Aug 2024 17:47 UTC
74 points
22 comments
7 min read
LW link

All the Following are Distinct

Gianluca Calcagni
2 Aug 2024 16:35 UTC
16 points
3 comments
8 min read
LW link

The ‘strong’ feature hypothesis could be wrong

lewis smith
2 Aug 2024 14:33 UTC
221 points
17 comments
17 min read
LW link