Vladimir_Nesov comments on All AGI Safety questions welcome (especially basic ones) [April 2023]

Vladimir_Nesov 8 Apr 2023 18:32 UTC
3 points
0
You might be looking at section 3.1 of the main report on page 2 (of the revision 3 PDF). I’m talking about page 64, which is part of section 3.1 of the System Card, not of the main report, but still within the same PDF document. (Does the page-anchored link I used not display the correct page on your system?)
- Xor 8 Apr 2023 18:46 UTC
  1 point
  0
  Yes, thanks. The page anchor doesn’t work for me, probably because of the device I am using; I just get page 1.
  
  It is super interesting that it is able to find inconsistencies and fix them; I didn’t know that they defined them as hallucinations. What would expanding the capabilities of this sort of self-improvement look like? It seems necessary to have a general understanding of what rational conversation looks like. It is an interesting situation: it knows what is bad and is able to fix it, but wasn’t doing that anyway.
  - Vladimir_Nesov 8 Apr 2023 19:54 UTC
    5 points
    1
    This will probably only become important once model-generated data is used for pre-training (or for fine-tuning that’s functionally the same thing as continuing a pre-training run) and the process is iterated for many epochs, as with the MCTS-based systems that play chess and Go. And you can probably just alpaca any pre-trained model you can get your hands on to start the ball rolling.
    
    The amplifications in the papers are more ambitious this year than last, but probably still not quite on that level. One way this could change quickly is if the plugins become a programming language, but regardless, I dread visible progress by the end of the year. And once the amplification-distillation cycle gets closed, autonomous training of advanced skills becomes possible.
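
As a rough illustration of the amplification-distillation cycle mentioned in the last comment, here is a toy sketch. It is not taken from any of the papers under discussion. It assumes a tiny "policy" over ten integer actions where a larger action is simply better, uses best-of-n sampling as a stand-in for amplification (computed in closed form so the run is deterministic), and uses a plain mixing step as a stand-in for distillation. All names (`amplify`, `distill`, `run_cycle`, `expected_action`) and parameter values are hypothetical.

```python
def amplify(policy, n):
    """Exact distribution of the best of n independent samples from `policy`.

    Assumption: actions are integers and a larger action is better,
    so P(best <= a) = P(sample <= a) ** n.
    """
    amplified, cdf, prev = {}, 0.0, 0.0
    for a in sorted(policy):
        cdf += policy[a]
        cur = cdf ** n
        amplified[a] = cur - prev
        prev = cur
    return amplified


def distill(policy, amplified, lr):
    """Move the policy a step of size `lr` toward the amplified distribution."""
    mixed = {a: (1 - lr) * policy[a] + lr * amplified[a] for a in policy}
    total = sum(mixed.values())  # renormalize so rounding error cannot build up
    return {a: p / total for a, p in mixed.items()}


def run_cycle(n_actions=10, n=4, lr=0.2, rounds=50):
    """Start from a uniform policy and repeat amplify -> distill."""
    policy = {a: 1.0 / n_actions for a in range(n_actions)}
    for _ in range(rounds):
        policy = distill(policy, amplify(policy, n), lr)
    return policy


def expected_action(policy):
    """Average action under the policy (higher means a better policy here)."""
    return sum(a * p for a, p in policy.items())


if __name__ == "__main__":
    final = run_cycle()
    print(round(expected_action(final), 3), round(final[9], 4))
```

In this toy setting the only "teacher" is the policy itself plus extra sampling, yet repeating the loop moves nearly all probability mass onto the best action. Real systems replace both steps with far more expensive machinery (search or tool use for amplification, gradient training for distillation), so this shows only the shape of the loop.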