I am not sure what you mean by “30xing the rate of quality-weighted research output given 1⁄4 of the compute”. Is this compared to present systems?
I mean 30xing the rate of all current quality-weighted safety research output (including the output of humans, which is basically all of such output at the moment).
I usually define transformative AI against this sort of benchmark.
I mostly don’t have any great ideas for how to use these systems for alignment or control progress.
FWIW, I feel like I do have decent ideas for how to use these systems for alignment progress that are plausibly scalable to much more powerful systems.
I also have ideas for using these systems in a variety of other ways that help a bit, e.g., advancing the current control measures applied to these systems.
I’m also maybe somewhat more optimistic than you about pausing the development of AI more advanced than these already very powerful systems (for, e.g., 10 years), especially if there is clear evidence of serious misalignment in such systems.
Ah, to be clear: inasmuch as I do have hope, it does route through this kind of pause. I am generally pessimistic about that happening, but it is where a lot of my effort goes these days.
And in those worlds, I do agree that a lot of progress will probably be made with substantial assistance from these early systems. I do expect it to take a good while until we figure out how to do that, and so I don’t see much hope for that kind of work happening in worlds where humanity doesn’t substantially pause or slow down cutting-edge system development.