I do not know of such a way. I find it unlikely that OpenAI’s next training run will result in a model that could end humanity, but I can provide no guarantees about that.
You seem to be assuming that all models above a certain threshold of capabilities will either exert strong optimization pressure on the world in pursuit of goals, or will be useless. Put another way, you seem to be conflating capabilities with actually exerted world-optimization pressure.
While I agree that, given a wide enough deployment, a model will likely end up exercising its capabilities nearly to their fullest extent, I hold that it is in principle possible to construct a mind that desires to help, is able to do so, and yet deliberately refrains from applying too much pressure.