Ebenezer Dukakis comments on Current AIs Provide Nearly No Data Relevant to AGI Alignment

Ebenezer Dukakis 21 Dec 2023 16:09 UTC
3 points
0

...I mean if you want to do the equivalent of a modern large training run you’ll need trillions of tokens of expert-generated text. So that’s a million experts generating a million tokens each? So, basically a million experts working full-time for years? So something like a hundred billion dollars minimum just to pay them all, plus probably more for the bureaucratic infrastructure needed to ensure they aren’t slacking off or cheating or trying to poison your dataset?

Where are these numbers coming from? They seem way too high. My suggestion is to do a modern large training run in the standard way (next-token prediction), and then fine-tune on experts playing the role of a helpful/honest/harmless chatbot doing CoT. Basically replace RLHF with finetuning on expert chatbot roleplay. Maybe I’m betraying my ignorance here and this idea doesn’t make sense for some reason?

I was editing my comment a fair amount, perhaps you read an old version of it?

And, in terms of demonstrating feasibility, you don’t need to pay any experts to demonstrate the feasibility of this idea. Just take a bunch of ChatGPT responses that are known to be high quality, make a dataset out of them, and use them in the training pipeline I propose, as though they were written by human experts. Then evaluate the quality of the resulting model. If it’s nearly as good as the original ChatGPT, I think you should be good to go.
- Daniel Kokotajlo 17 Jan 2024 18:06 UTC
  2 points
  0
  Parent
  I said “if you want to do the equivalent of a modern large training run.” If your intervention is just a smaller fine-tuning run on top of a standard LLM, then that’ll be proportionately cheaper. And that might be good enough. But maybe we won’t be able to get to AGI that way.
  
  Worth a shot though.