ryan_greenblatt comments on TurnTrout’s shortform feed

ryan_greenblatt 17 Dec 2023 2:58 UTC
10 points
2
[Somewhat off-topic]

Eventually, one person ventured a reply, spoken in a rather more tentative tone than they’d been using to pronounce that SGD would internalize coherent goals into language models. They named “Running a factory competently.”

I like thinking about the task “speeding up the best researchers by 30x” (to simplify, let’s only include research in purely digital, software-only domains).

To be clear, I am by no means confident that this can’t be done safely or non-agentically. It seems totally plausible to me that this can be accomplished without agency except for agency due to the natural language outputs of an LLM agent. (Perhaps I’m at 15% that this will in practice be done without any non-trivial agency that isn’t visible in natural language.)

(As such, this isn’t a good answer to the question of “I’d like to know what’s the least impressive task which cannot be done by a ‘non-agentic’ system, that you are very confident cannot be done safely and non-agentically in the next two years.” I think there probably isn’t any interesting answer to this question for me, since “very confident” is a strong condition.)

I like thinking about this task because if we were able to speed up generic research in purely digital domains to this extent, safety research done with this speedup would clearly obsolete prior safety research pretty quickly.

(It also seems likely that we could singularity quite quickly from this point if we wanted to, so it’s not clear we’ll have a ton of time at this capability level.)