This seems especially unlikely to work given it only gives a probability. You know what you call someone whose superintelligent AI does what they want 95% of the time? Dead.
If you can get it to do what you want even 51% of the time, and make that 51% independent across samples (it isn’t, so in practice you’d want some margin, but 95% is actually a lot of margin!), you can get arbitrarily good compliance by creating AI committees and taking a majority vote.
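To make the committee arithmetic concrete, here is a minimal sketch of the majority-vote calculation. It assumes exactly what the comment flags as unrealistic: every committee member complies with the same probability `p`, independently of the others. The function name `majority_success_probability` and the committee sizes are made up for illustration. Under those assumptions the number of compliant members is binomial, and the committee succeeds when more than half comply:

```python
from math import exp, lgamma, log


def majority_success_probability(p: float, n: int) -> float:
    """Probability that a strict majority of n voters comply.

    Assumes each voter complies independently with the same probability p
    (an idealization; real samples from one model are correlated).
    Terms are computed in log space so large committees do not overflow.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number to avoid ties")

    total = 0.0
    # Sum the binomial probabilities for k = majority threshold .. n.
    for k in range(n // 2 + 1, n + 1):
        log_binom = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        total += exp(log_binom + k * log(p) + (n - k) * log(1.0 - p))
    return min(total, 1.0)


if __name__ == "__main__":
    # Hypothetical committee sizes, chosen only to show the trend.
    for p in (0.51, 0.95):
        for n in (1, 3, 101, 10001):
            print(f"p={p}, n={n}: {majority_success_probability(p, n):.6f}")
```

With `p = 0.95`, a committee of three already reaches 0.99275, and at `p = 0.51` the success probability climbs toward 1 as the committee grows, just much more slowly. All of this depends on the independence assumption; if the failures are correlated, adding members buys far less than the formula suggests.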
Yep, if you can make it a flat 51%, that’s a victory condition, but I would be shocked if that is how any of this works.