(1) The structure of jobs is shaped to accommodate human unreliability by making mistakes less fatal.
Mm, so there’s a selection effect on the human end: the only jobs/pursuits that exist are those humans happen to be able to do reliably, and there’s a discrepancy between the things humans and AIs are reliable at, so we end up observing AIs being more unreliable, even though this isn’t representative of the average difference between human and AI reliability across all possible tasks?
I don’t know that I buy this. Humans seem pretty decent at becoming reliable at ~anything, and I don’t think we’ve observed AIs being more-reliable-than-humans at anything? (Besides trivial and overly abstract tasks such as “next-token prediction”.)
(2) seems more plausible to me.
My claim was more along the lines of: if an unaided human can’t do a job safely or reliably, as was almost certainly the case 150-200 years ago, if not earlier, we make the job safer with tools so that human error matters far less, and AIs currently haven’t had tools that increase their reliability in the same way.
Remember, it took a long time for factories to be made safe, and I’d expect a similar trajectory for driving, so while I don’t think (1) is everything, I do think it accounts for a non-trivial portion of the reliability difference.
More here:
https://www.lesswrong.com/posts/DQKgYhEYP86PLW7tZ/how-factories-were-made-safe
I think (2) does play an important part here, and that the recent work on letting AIs notice and correct their own mistakes (calibration training, backspace tokens for error correction) is going to pay dividends once it makes its way from the research frontier into actually deployed frontier models.
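For concreteness, here’s a rough sketch of what the backspace-token idea could look like at decoding time. The token names and the decoding loop are my own toy illustration, not the mechanism from any of the papers linked below:

```python
# Toy sketch (assumed interface, not any paper's actual method): decoding with
# a "backspace" token. The model may emit a special token that deletes its
# previous output token, letting it correct small mistakes mid-generation
# instead of committing to them.

BACKSPACE = "<bksp>"
EOS = "<eos>"

def decode_with_backspace(next_token, prompt_tokens, max_steps=50):
    """Greedy decoding loop where emitting BACKSPACE pops the last output token."""
    output = []
    for _ in range(max_steps):
        tok = next_token(prompt_tokens + output)  # model conditions on its own corrections
        if tok == EOS:
            break
        if tok == BACKSPACE:
            if output:  # deleting from an empty output is a no-op
                output.pop()
            continue
        output.append(tok)
    return output

def make_toy_model():
    # Scripted stand-in for a trained model, just to exercise the loop:
    # it emits a wrong answer, "notices", backspaces, and fixes it.
    script = iter(["5", BACKSPACE, "4", EOS])
    return lambda context: next(script)

print(decode_with_backspace(make_toy_model(), ["2", "+", "2", "="]))  # ['4']
```

The important bit is just that the correction happens inside the generation loop, so the model never has to carry an uncorrected error forward.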
Relevant links:
LLMs cannot find reasoning errors, but can correct them!
Physics of LLMs: learning from mistakes
Explanation of Accuracy vs Calibration vs Robustness
A Survey of Calibration Process for Black-Box LLMs
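And on the calibration side, here’s a small, self-contained illustration of the kind of quantity calibration training tries to drive down, expected calibration error; again my own toy example, not tied to any of the papers above:

```python
# Expected calibration error (ECE): bucket predictions by stated confidence and
# compare average confidence to actual accuracy within each bucket.

def expected_calibration_error(confidences, correct, n_bins=10):
    """confidences: predicted probabilities in [0, 1]; correct: booleans."""
    assert len(confidences) == len(correct)
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        idx = [i for i, c in enumerate(confidences)
               if lo < c <= hi or (b == 0 and c == 0.0)]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        ece += (len(idx) / n) * abs(avg_conf - accuracy)
    return ece

# A model that says "90% sure" but is right only half the time is miscalibrated:
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, False, True, False]))  # 0.4
```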