NickGabs

Karma: 388

Steering Llama-2 with contrastive activation additions

Jan 2, 2024, 12:47 AM
125 points
29 comments
8 min read
LW link
(arxiv.org)

Science of Deep Learning more tractably addresses the Sharp Left Turn than Agent Foundations

NickGabs
Sep 19, 2023, 10:06 PM
23 points
2 comments
6 min read
LW link

An upcoming US Supreme Court case may impede AI governance efforts

NickGabs
Jul 16, 2023, 11:51 PM
57 points
17 comments
2 min read
LW link

Empirical Evidence Against “The Longest Training Run”

NickGabs
Jul 6, 2023, 6:32 PM
31 points
0 comments
14 min read
LW link

Proposal: labs should precommit to pausing if an AI argues for itself to be improved

NickGabs
Jun 2, 2023, 10:31 PM
3 points
3 comments
4 min read
LW link

AI Doom Is Not (Only) Disjunctive

NickGabs
Mar 30, 2023, 1:42 AM
12 points
0 comments
5 min read
LW link

We Need Holistic AI Macrostrategy

NickGabs
Jan 15, 2023, 2:13 AM
39 points
4 comments
8 min read
LW link

Takeoff speeds, the chimps analogy, and the Cultural Intelligence Hypothesis

NickGabs
Dec 2, 2022, 7:14 PM
16 points
2 comments
4 min read
LW link

Miscellaneous First-Pass Alignment Thoughts

NickGabs
Nov 21, 2022, 9:23 PM
12 points
4 comments
10 min read
LW link

Distillation of “How Likely Is Deceptive Alignment?”

NickGabs
Nov 18, 2022, 4:31 PM
24 points
4 comments
10 min read
LW link