More generally, John Miller and colleagues have found training performance is an excellent predictor of test performance, even when the test set looks fairly different from the training set, across a wide variety of tasks and architectures.
Seems like Figure 1 from Miller et al. is a plot of test performance vs. "out of distribution" test performance. One might expect plots of training performance vs. "out of distribution" test performance to have more spread.
I doubt there would be much difference, and I think the alignment-relevant comparison is between in-distribution but out-of-sample performance and out-of-distribution performance. We can easily do i.i.d. splits of our data; that's not a problem. You might think it's a problem to directly test the model in scenarios where it could legitimately execute a takeover if it wanted to.
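For concreteness, a minimal sketch of such an i.i.d. split (the article list and the 10% test fraction below are just placeholders, not anything from Miller et al.):

```python
import random

def iid_split(examples, test_fraction=0.1, seed=0):
    """Shuffle the examples and hold out a uniformly random fraction as a test set."""
    rng = random.Random(seed)
    shuffled = list(examples)  # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

# Hypothetical usage: `articles` is any list of training examples.
articles = [f"article {i}" for i in range(1000)]
train, test = iid_split(articles, test_fraction=0.1)
```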
Taking i.i.d. samples can actually be hard. Suppose you train an LLM on news articles, and each important real-world event has 10 basically identical articles written about it. Then a random split of the articles will leave the network being tested mostly on the same newsworthy events that were in the training data.
It will pass that test even if it's hopeless at predicting new events and can only generate new articles about the events it has already seen.
When data duplication is extensive, making a meaningful train/test split is hard.
If the data were perfect copy-and-paste duplicates, they could be filtered out, but often things are rephrased a bit.
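One workaround, sketched below under the assumption that a cheap text-similarity measure is good enough to catch rephrasings: group articles whose word n-gram overlap is high (treating them as the same underlying event), then split train/test at the group level rather than the article level. The shingle size and the 0.5 threshold here are illustrative, not tuned values.

```python
def word_ngrams(text, n=3):
    """Set of word n-grams ('shingles') used as a cheap fingerprint of an article."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    """Jaccard overlap of two shingle sets; 1.0 means identical n-gram sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def group_near_duplicates(articles, threshold=0.5):
    """Greedily put each article into the first existing group it overlaps with.

    Splitting train/test by group instead of by article keeps rephrased copies
    of the same event on one side of the split.
    """
    groups, fingerprints = [], []
    for text in articles:
        shingles = word_ngrams(text)
        for i, fp in enumerate(fingerprints):
            if jaccard(shingles, fp) >= threshold:
                groups[i].append(text)
                fingerprints[i] = fp | shingles  # widen the group's fingerprint
                break
        else:
            groups.append([text])
            fingerprints.append(shingles)
    return groups
```

Feeding whole groups, rather than individual articles, into a random split keeps every rephrasing of an event on the same side. The greedy pairwise comparison here is O(n²) and only meant to show the idea; at web scale one would use something like MinHash/LSH deduplication instead.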
Fair enough for the alignment comparison; I was just hoping you could maybe correct the quoted paragraph to say "performance on the hold-out data" or something similar.
(The reason to expect more spread would be that training performance can't detect overfitting, but performance on the hold-out data can. I'm guessing some of the nets trained in Miller et al. did indeed overfit, specifically the ones with lower performance.)
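(For concreteness, a toy sketch of that point, with `score` standing in for whatever metric is being plotted: the training score alone can look fine for an overfit model, and only the gap against hold-out data reveals the overfitting.)

```python
def train_holdout_gap(model, train_set, holdout_set, score):
    """Return (train score, hold-out score, gap between them).

    An overfit net keeps a high training score while its hold-out score drops,
    so a large positive gap is exactly the signal that a training-performance
    axis cannot show but a hold-out axis can.
    """
    train_score = score(model, train_set)
    holdout_score = score(model, holdout_set)
    return train_score, holdout_score, train_score - holdout_score
```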
I would actually like this to be done sometime in the future, but I'm okay with focusing on other things for now.
(Specifically, the training performance vs. out-of-distribution test performance experiment, especially on more realistic neural nets.)
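If someone does run it, a minimal sketch of that experiment (the `models` collection, the three evaluation sets, and the `evaluate` function are all hypothetical placeholders): score each trained model on its training set, an i.i.d. hold-out set, and an out-of-distribution set, then check whether training performance tracks OOD performance more loosely than hold-out performance does.

```python
import statistics  # statistics.correlation requires Python 3.10+

def spread_comparison(models, train_set, holdout_set, ood_set, evaluate):
    """Score each model on train / hold-out / OOD data and compare correlations.

    `evaluate(model, dataset)` is assumed to return a scalar performance number.
    Returns (corr(train, ood), corr(holdout, ood)); if the first is clearly
    lower, the train-vs-OOD plot has the extra spread conjectured above.
    """
    rows = [(evaluate(m, train_set), evaluate(m, holdout_set), evaluate(m, ood_set))
            for m in models]
    train_scores, holdout_scores, ood_scores = zip(*rows)
    return (statistics.correlation(train_scores, ood_scores),
            statistics.correlation(holdout_scores, ood_scores))
```

Scatter plots of the same three columns would give the train-vs-OOD counterpart of the Figure 1 plot discussed above.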