You cannot teach GPT with texts generated by GPT, because unlike chess and Go, you do not have exact rules to tell you which generated outputs are the new winning moves and which are nonsense.
You can ask GPT which outputs are nonsense (in various ways), with no access to ground truth, and that actually works to improve responses. This sort of approach was even used to fine-tune GPT-4 (see the 4-step algorithm in section 3.1 of the System Card part of the GPT-4 report).
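Roughly, the System Card loop has the model critique and rewrite its own responses to build comparison data. Here is a minimal sketch of how that could look; the `model.complete` interface, the prompt wording, and the retry limit are my own assumptions, not taken from the report:

```python
# Sketch of the 4-step self-critique loop described in section 3.1 of the
# GPT-4 System Card. `model.complete`, the prompt wording, and the retry
# limit are illustrative assumptions, not the actual implementation.

def generate_comparison_pair(model, prompt, max_retries=5):
    """Have the model critique and rewrite its own response, yielding a
    (worse, better) comparison pair for reward-model / RL fine-tuning."""
    # Step 1: get an initial response to the prompt.
    response = model.complete(prompt)

    # Step 2: ask the model itself to list hallucinations in that response.
    hallucinations = model.complete(
        f"Prompt: {prompt}\nResponse: {response}\n"
        "List every hallucination in the response, or reply with nothing."
    )
    if not hallucinations.strip():
        return None  # nothing to fix, so no training signal from this prompt

    for _ in range(max_retries):
        # Step 3: ask the model to rewrite the response without the hallucinations.
        rewritten = model.complete(
            f"Prompt: {prompt}\nResponse: {response}\n"
            f"Hallucinations: {hallucinations}\n"
            "Rewrite the response so that it contains none of these hallucinations."
        )
        # Step 4: re-check the rewrite; keep the pair only if it comes back clean.
        remaining = model.complete(
            f"Prompt: {prompt}\nResponse: {rewritten}\n"
            "List every hallucination in the response, or reply with nothing."
        )
        if not remaining.strip():
            # (original, rewritten) becomes a comparison pair favoring the rewrite.
            return response, rewritten
        response, hallucinations = rewritten, remaining

    return None  # gave up after max_retries attempts
```

Note that no ground truth enters anywhere: the model is both the generator and the judge.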
I checked out that section, but what you are saying doesn't follow for me. The section describes fine-tuning compute and optimizing scalability; how does that relate to self-improvement? It's possible I'm looking in the wrong section: what I was reading was about algorithms that efficiently predict how ChatGPT would scale. Also, I didn't see anything about a 4-step algorithm. Anyway, could you explain what you mean, or where I can find the right section?
You might be looking at section 3.1 of the main report, on page 2 (of the revision 3 PDF). I'm talking about page 64, which is part of section 3.1 of the System Card, not of the main report, but still within the same PDF document. (Does the page-anchored link I used not display the correct page on your system?)
Yes, thanks. The page anchor doesn't work for me, probably because of the device I am using; I just get page 1.
That is super interesting that it is able to find inconsistencies and fix them; I didn't know they defined them as hallucinations. What would expanding the capabilities of this sort of self-improvement look like? It seems to require a general understanding of what rational conversation looks like. It is an interesting situation where the model knows what is bad and is able to fix it, but wasn't doing that anyway.
This is probably only going to become important once model-generated data is used for pre-training (or for fine-tuning that's functionally the same thing as continuing a pre-training run), and this process is iterated for many epochs, like with the MCTS systems that play chess and Go (see the sketch at the end of this comment). And you can probably just Alpaca any pre-trained model you can get your hands on to start the ball rolling.
The amplifications in the papers are more ambitious this year than last, but probably still not quite on that level. One way this could change quickly is if the plugins become a programming language, but regardless I dread visible progress by the end of the year. And once the amplification-distillation cycle gets closed, autonomous training of advanced skills becomes possible.
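For what it's worth, here is a toy sketch of what a closed amplification-distillation loop could look like. Everything in it is illustrative: the particular amplification step (sample several drafts, have the model merge them) and the `model.complete` / `model.finetune` interface are assumptions of mine, not any published recipe:

```python
# Toy sketch of an iterated amplification-distillation loop, in the spirit of
# the self-play systems for chess and Go. All names and interfaces here are
# assumptions for illustration, not any lab's actual training pipeline.

def amplify(model, prompt, n_drafts=4):
    """Spend extra inference compute to get a better-than-default answer:
    sample several drafts, then have the model merge them and discard nonsense."""
    drafts = [model.complete(prompt) for _ in range(n_drafts)]
    merge_prompt = (
        f"Prompt: {prompt}\n"
        + "\n".join(f"Draft {i}: {d}" for i, d in enumerate(drafts))
        + "\nWrite the single best answer, fixing any nonsense in the drafts."
    )
    return model.complete(merge_prompt)

def distill(model, dataset):
    """Fine-tune (or keep pre-training) the model on the amplified data, so the
    cheap forward pass absorbs what the expensive amplification loop produced."""
    return model.finetune(dataset)

def self_improvement_loop(model, prompts, rounds=10):
    """Close the cycle: the distilled model from one round generates the
    training data for the next, analogous to MCTS self-play."""
    for _ in range(rounds):
        dataset = [(p, amplify(model, p)) for p in prompts]
        model = distill(model, dataset)
    return model
```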