It occurs to me that a hybrid approach combining HBO and LBO is possible. One simple thing we can do is build an HBO-IDA and ask it to construct a “core for reasoning” for LBO, or just let the HBO-IDA agent directly act as the overseer for LBO-IDA. This potentially gets around a problem where LBO-IDA would work, except that unenhanced humans are not smart enough to build or act as the overseer in such a scheme. I’m guessing there are cleverer things we can do along these lines.
How does this address the security issues with HBO? Is the idea that only using the HBO system to construct a “core for reasoning” reduces the chances of failure by exposing it to fewer inputs / using it for less total time? I feel like I’m missing something...
I’d interpreted it as “using the HBO system to construct a ‘core for reasoning’ reduces the chances of failure by exposing it to fewer inputs / using it for less total time”, plus maybe other properties (e.g. maybe we could look at and verify an LBO overseer, even if we couldn’t construct it ourselves).
That makes sense; I hadn’t thought of the possibility that a security failure in the HBO tree might be acceptable in this context. OTOH, if there’s an input that corrupts the HBO tree, isn’t it possible that the corrupted tree could output a supposed “LBO overseer” that embeds the malicious input and corrupts us when we try to verify it? If the HBO tree is insecure, it seems like a manual process that verifies its output must be insecure as well.
One situation is: maybe an HBO tree of size 10^20 runs into a security failure with high probability, but an HBO tree of size 10^15 doesn’t and is sufficient to output a good LBO overseer.
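To make that scale argument concrete, here is a minimal sketch under an assumed toy model: each HBO query independently hits a corrupting input with some small probability p. The particular value of p and the tree sizes are illustrative assumptions, not figures from this discussion.

```python
import math

# Toy model for the scale argument above: each HBO query independently hits a
# corrupting input with probability p. The value of p and the tree sizes below
# are illustrative assumptions, not numbers from the discussion.
def failure_probability(p_per_query: float, num_queries: float) -> float:
    """P(at least one corrupted query) = 1 - (1 - p)^N, computed stably."""
    return -math.expm1(num_queries * math.log1p(-p_per_query))

p = 1e-18  # assumed per-query corruption probability
for n in (1e15, 1e20):
    print(f"tree of {n:.0e} queries: failure probability ≈ {failure_probability(p, n):.2g}")
# tree of 1e+15 queries: failure probability ≈ 0.001
# tree of 1e+20 queries: failure probability ≈ 1
```

Under these assumed numbers, the 10^15-query tree fails with probability around 10^-3 while the 10^20-query tree fails almost certainly, which is the sense in which restricting the HBO system to the smaller task of outputting an LBO overseer could keep the total risk acceptable.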