Thank you, Kaj. Those references are what I was looking for.
It looks like there might be a somewhat new idea here. Previous suggestions, as you mention, restrict output to a single bit, or require review by human experts. Using multiple AGI oracles to check each other is a good one, though I'd worry about acausal coordination between the AGIs, and I don't see that the safety is provable beyond checking that answers match.
This new variant gives the benefit of provable restrictions and the relative ease of implementing a narrow-AI proof system to check it. It’s certainly not the full solution to the FAI problem, but it’s a good addition to our lineup of partial or short-term solutions in the area of AI Boxing and Oracle AI.
I’ll get this feedback to the originator of this idea and see what can be made of it.