This reminds me of the magic black box described by Scott Alexander:
Imagine a black box which, when you pressed a button, would generate a scientific hypothesis. 50% of its hypotheses are false; 50% are true hypotheses as game-changing and elegant as relativity. Even despite the error rate, it’s easy to see this box would quickly surpass space capsules, da Vinci paintings, and printer ink cartridges to become the most valuable object in the world. Scientific progress on demand, and all you have to do is test some stuff to see if it’s true? I don’t want to devalue experimentalists. They do great work. But it’s appropriate that Einstein is more famous than Eddington. If you took away Eddington, someone else would have tested relativity; the bottleneck is in Einsteins. Einstein-in-a-box at the cost of requiring two Eddingtons per insight is a heck of a deal.
What if the box had only a 10% success rate? A 1% success rate? My guess is: still most valuable object in the world. Even an 0.1% success rate seems pretty good, considering (what if we ask the box for cancer cures, then test them all on lab rats and volunteers?) You have to go pretty low before the box stops being great.
But this scenario seems kind of unfair to me. We are definitely not at the point where LLMs can provide truly novel, groundbreaking scientific insights on their own. Meanwhile, nobody would use an LLM calculator over a classical calculator if the former gets math wrong 10% of the time.