These are both cases of counterexample-guided techniques. The basic idea is to solve “exists x: forall y: P(x, y)” statements according to the following algorithm:
Choose some initial x, and initialize a set Y = {}.
Solve “exists y: not P(x, y)”. If unsolvable, you’re done. If not, take the discovered y and put it in Y.
Solve “exists x: forall y in Y: P(x, y)” and set the solution as your new x.
Go to step 2.
The reason this is so nice is because you’ve taken a claim with two quantifiers and written an algorithm that must only ever solve claims with one quantifier. (For step 3, you inline the “forall y in Y” part, because Y is a small finite set.)
The methodology laid out in this post is a counterexample-guided approach to solve the claim “exists alignment proposal: forall plausible worlds: The alignment proposal is safe in the world”
Examples from programming languages include CEGIS (counterexample guided inductive synthesis) and CEGAR (counterexample guided abstraction refinement).
Red-penning is a general problem-solving method that’s kinda similar to this research methodology.
These are both cases of counterexample-guided techniques. The basic idea is to solve “exists x: forall y: P(x, y)” statements according to the following algorithm:
Choose some initial x, and initialize a set Y = {}.
Solve “exists y: not P(x, y)”. If unsolvable, you’re done. If not, take the discovered y and put it in Y.
Solve “exists x: forall y in Y: P(x, y)” and set the solution as your new x.
Go to step 2.
The reason this is so nice is because you’ve taken a claim with two quantifiers and written an algorithm that must only ever solve claims with one quantifier. (For step 3, you inline the “forall y in Y” part, because Y is a small finite set.)
The methodology laid out in this post is a counterexample-guided approach to solve the claim “exists alignment proposal: forall plausible worlds: The alignment proposal is safe in the world”
Examples from programming languages include CEGIS (counterexample guided inductive synthesis) and CEGAR (counterexample guided abstraction refinement).