Regulation proposal: make satisficer training goals mandatory. Aim for a loss of 0.001, not a loss of 0. This should stop an AI in its tracks even if it goes rogue. By setting the satisficing thresholds thoughtfully, we could in theory tune the size of our warning shots.
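To make the mechanism concrete, here is a minimal sketch of what a satisficer training objective could look like in PyTorch. The threshold name (`LOSS_TARGET`) and the toy model and data are illustrative assumptions, not anything from the proposal itself; the point is only that clamping the loss at the target makes the gradient vanish once the target is reached, so optimization stops there instead of pushing toward zero.

```python
import torch
import torch.nn as nn

# Hypothetical satisficing threshold: stop optimizing once loss reaches this value.
LOSS_TARGET = 1e-3

model = nn.Linear(10, 1)  # toy stand-in for the model being trained
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()

# Toy data; in practice this would be the real training set.
x = torch.randn(256, 10)
y = torch.randn(256, 1)

for step in range(10_000):
    optimizer.zero_grad()
    raw_loss = criterion(model(x), y)

    # Satisficer objective: only penalize loss above the target.
    # Once raw_loss <= LOSS_TARGET, the clamped loss (and its gradient) is zero.
    satisficed_loss = torch.clamp(raw_loss - LOSS_TARGET, min=0.0)

    if satisficed_loss.item() == 0.0:
        print(f"Target reached at step {step}: loss {raw_loss.item():.4g}")
        break

    satisficed_loss.backward()
    optimizer.step()
```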
In the end, someone is going to build an ASI with a maximizer goal, leading to a takeover, barring regulation or alignment plus a pivotal act. Still, converting takeovers into warning shots is a very meaningful intervention: it prevents the takeover itself and opens a policy window of opportunity.