Seth Herd

Karma: 3,087

I did computational cognitive neuroscience research from getting my PhD in 2006 until the end of 2022. I’ve worked on computational theories of vision, executive function, episodic memory, and decision-making. I’ve focused on the emergent interactions that are needed to explain complex thought. I was increasingly concerned with AGI applications of the research, and reluctant to publish my best ideas. I’m incredibly excited to now be working directly on alignment, currently with generous funding from the Astera Institute. More info and publication list here.

[Question] What’s a better term now that “AGI” is too vague?

Seth Herd · 28 May 2024 18:02 UTC
15 points
8 comments · 2 min read · LW link

Anthropic announces interpretability advances. How much does this advance alignment?

Seth Herd · 21 May 2024 22:30 UTC
49 points
4 comments · 3 min read · LW link
(www.anthropic.com)

Instruction-following AGI is easier and more likely than value aligned AGI

Seth Herd · 15 May 2024 19:38 UTC
42 points
23 comments · 12 min read · LW link

Goals selected from learned knowledge: an alternative to RL alignment

Seth Herd · 15 Jan 2024 21:52 UTC
40 points
17 comments · 7 min read · LW link

After Alignment — Dialogue between RogerDearnaley and Seth Herd

2 Dec 2023 6:03 UTC
15 points
2 comments · 25 min read · LW link

Corrigibility or DWIM is an attractive primary goal for AGI

Seth Herd · 25 Nov 2023 19:37 UTC
16 points
4 comments · 1 min read · LW link

Sapience, understanding, and “AGI”

Seth Herd · 24 Nov 2023 15:13 UTC
15 points
3 comments · 6 min read · LW link

Altman returns as OpenAI CEO with new board

Seth Herd · 22 Nov 2023 16:04 UTC
5 points
3 comments · 1 min read · LW link

OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns

Seth Herd · 20 Nov 2023 14:20 UTC
52 points
28 comments · 1 min read · LW link
(www.wired.com)

We have promising alignment plans with low taxes

Seth Herd · 10 Nov 2023 18:51 UTC
31 points
9 comments · 5 min read · LW link

Seth Herd’s Shortform

Seth Herd · 10 Nov 2023 6:52 UTC
6 points
17 comments · 1 min read · LW link

Shane Legg interview on alignment

Seth Herd · 28 Oct 2023 19:28 UTC
66 points
20 comments · 2 min read · LW link
(www.youtube.com)

The (partial) fallacy of dumb superintelligence

Seth Herd · 18 Oct 2023 21:25 UTC
27 points
5 comments · 4 min read · LW link

Steering subsystems: capabilities, agency, and alignment

Seth Herd · 29 Sep 2023 13:45 UTC
22 points
0 comments · 8 min read · LW link

AGI isn’t just a technology

Seth Herd · 1 Sep 2023 14:35 UTC
18 points
12 comments · 2 min read · LW link

Internal independent review for language model agent alignment

Seth Herd · 7 Jul 2023 6:54 UTC
53 points
26 comments · 11 min read · LW link

Simpler explanations of AGI risk

Seth Herd · 14 May 2023 1:29 UTC
8 points
9 comments · 3 min read · LW link

A simple presentation of AI risk arguments

Seth Herd · 26 Apr 2023 2:19 UTC
16 points
0 comments · 2 min read · LW link

Capabilities and alignment of LLM cognitive architectures

Seth Herd · 18 Apr 2023 16:29 UTC
83 points
18 comments · 20 min read · LW link

Agentized LLMs will change the alignment landscape

Seth Herd · 9 Apr 2023 2:29 UTC
153 points
95 comments · 3 min read · LW link