Sounds pretty damn hard, but maybe the researchers at Google who worked on autoformalization (and similar papers) might be able to help this happen at significant scale. If you, like me, find videos far easier to get through than text, here's a talk on the topic that was given recently:
Intuitively, asking the LLM to translate from English to Coq is a cousin of "ask the LLM to summarize the Coq into English", viewed with a security mindset. What assurances can we get about the translation process? This seems dicey and hard. I'm imagining there's an auditing-game design out there that would give us some hope, though.
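To make the shape of the problem concrete, here's a toy sketch (my own illustration, not from any real autoformalization pipeline) of what the translation target could look like: the English claim "reversing a list twice gives back the original list" rendered as a Coq theorem. The checker vouches that the proof matches the statement; nothing vouches that the statement matches the English.

```coq
(* Toy illustration only: what an English claim might look like once
   translated into Coq. The theorem name is made up. *)
Require Import List.

(* English claim: "reversing a list twice gives back the original list." *)
Theorem rev_twice : forall (A : Type) (l : list A), rev (rev l) = l.
Proof.
  intros A l; induction l as [| x xs IH]; simpl.
  - reflexivity.
  - rewrite rev_app_distr, IH. reflexivity.
Qed.

(* Coq checks this statement for all lists. Whether the statement is a
   faithful translation of the English is exactly the part an auditor
   still has to read. *)
```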
Yeah, it only helps if you read the generated formal code; it's almost useless otherwise.
Most codegen techniques assume a margin of error until proven otherwise; that includes extraction from Coq to OCaml or Haskell, and lots of the program synthesis techniques I've heard about. Extraction is probably the furthest from SGD you can get on "the codegen spectrum".
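For anyone who hasn't seen extraction, here's a minimal sketch (toy function, made-up file name): you prove what you want about a definition inside Coq, then ask Coq to emit OCaml for it.

```coq
(* Minimal sketch of Coq-to-OCaml extraction; the function and file name
   are illustrative, not from anything discussed above. *)
Require Extraction.
Extraction Language OCaml.

(* A small function plus a machine-checked fact about it. *)
Definition safe_pred (n : nat) : nat :=
  match n with
  | O => O
  | S m => m
  end.

Lemma safe_pred_succ : forall n, safe_pred (S n) = n.
Proof. intros n; reflexivity. Qed.

(* Writes an OCaml module containing safe_pred (and the nat type it uses);
   no SGD anywhere in the loop, which is the point about "the codegen
   spectrum". *)
Extraction "safe_pred.ml" safe_pred.
```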
Fair point.
I'm mostly interested in formal checking of statements not generated by strictly formal reasoning, though. The benefit of automatic formalization is being sure a machine checked the interference pattern of implications of a claim I already knew I wanted to make, i.e., a stronger version of fuzz testing.
Thanks for the link!
But actually, what I had in mind is something simpler that would not necessarily need such tools to be feasible: basically taking the main argument of each approach, as expressed in natural language, without worrying too much about all the baggage at finer levels of abstraction. But I guess this is not quite what the article is about.