Sounds pretty damn hard, but maybe the researchers at Google who worked on autoformalization (and similar papers) might be able to help this happen at significant scale. If you, like me, find videos far easier to get through than text, here's a talk on the topic that was given recently:
Intuitively, asking the LLM to translate from English to Coq is a cousin of "ask the LLM to summarize the Coq into English", viewed with a security mindset. What assurances can we get about the translation process? This seems dicey and hard. I'm imagining there's an auditing-game design out there that would give us some hope, though.
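To make the shape of the problem concrete, here's a toy sketch (my own illustration, not from any real autoformalization pipeline) of what the translation target could look like: the English claim "reversing a list twice gives back the original list" rendered as a Coq theorem. The checker vouches that the proof matches the statement; nothing vouches that the statement matches the English.

```coq
(* Toy illustration only: what an English claim might look like once
   translated into Coq. The theorem name is made up. *)
Require Import List.

(* English claim: "reversing a list twice gives back the original list." *)
Theorem rev_twice : forall (A : Type) (l : list A), rev (rev l) = l.
Proof.
  intros A l; induction l as [| x xs IH]; simpl.
  - reflexivity.
  - rewrite rev_app_distr, IH. reflexivity.
Qed.

(* Coq checks this statement for all lists. Whether the statement is a
   faithful translation of the English is exactly the part an auditor
   still has to read. *)
```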
Yeah, it only helps if you read the generated formal code; it's almost useless otherwise.
Most codegen techniques assume a margin of error until proven otherwise; that includes extraction from Coq to OCaml or Haskell, and lots of the program synthesis techniques I've heard about. Extraction is probably the furthest from SGD you can get on "the codegen spectrum".
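For anyone who hasn't seen extraction, here's a minimal sketch (toy function, made-up file name): you prove what you want about a definition inside Coq, then ask Coq to emit OCaml for it.

```coq
(* Minimal sketch of Coq-to-OCaml extraction; the function and file name
   are illustrative, not from anything discussed above. *)
Require Extraction.
Extraction Language OCaml.

(* A small function plus a machine-checked fact about it. *)
Definition safe_pred (n : nat) : nat :=
  match n with
  | O => O
  | S m => m
  end.

Lemma safe_pred_succ : forall n, safe_pred (S n) = n.
Proof. intros n; reflexivity. Qed.

(* Writes an OCaml module containing safe_pred (and the nat type it uses);
   no SGD anywhere in the loop, which is the point about "the codegen
   spectrum". *)
Extraction "safe_pred.ml" safe_pred.
```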
Fair point.
I'm mostly interested in formal checking of statements not generated by strictly formal reasoning, though. The benefit of automatic formalization is being sure a machine checked the interference pattern of implications of a claim I already knew I wanted to make, i.e., a stronger version of fuzz testing.
Thanks for the link!
But actually, what I had in mind is something simpler that would not necessarily need such tools to be feasible: basically taking the main argument of each approach, as expressed in natural language, without worrying too much about all the baggage at finer levels of abstraction. But I guess this is not quite what the article is about.