My money is mostly on "It just takes a really long time to convert innovation into a profitable, popular product."
A related puzzle piece, IMO: Several years ago, all my friends used f.lux to reduce the amount that computer screens screwed up their circadian rhythms. It had to be manually installed. I was confused/annoyed that Apple didn't do this automatically.
A couple years later, Apple did start doing it automatically (and more recently started shifting everything to dark mode at night).
Meanwhile: A couple years ago, we released shortform on LessWrong. There’s a fairly obvious feature missing, which is showing a user’s shortform on their User Profile. That feature is still missing a couple years later. It would take maybe a day to build, and a week to get reviewed and merged into production. There are other obvious missing features we haven’t gotten around to. The reason we haven’t gotten around to it is something like “well, there’s a lot of competing engineering work to do instead, and there’s a bunch of small priorities that make it hard to just set aside a day for doing it”.
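(To gesture at why I think this is roughly a day of work: the feature is basically "fetch this user's shortform posts and render them as a section on the profile page". Here's a minimal sketch in TypeScript/React as an illustrative stack; the component and hook names are made up for the example, not the actual LessWrong code.)

```tsx
// Hypothetical sketch only: UserShortformFeed and useShortformPosts are
// placeholder names, not the real LessWrong codebase API.
import React from "react";

interface ShortformPost {
  id: string;
  postedAt: string;
  excerpt: string;
}

// Stand-in for whatever data-fetching layer already exists; the real version
// would query the user's comments on their shortform container post.
function useShortformPosts(_userId: string): ShortformPost[] {
  return []; // placeholder so the sketch is self-contained
}

export function UserShortformFeed({ userId }: { userId: string }) {
  const posts = useShortformPosts(userId);
  if (posts.length === 0) return null;
  return (
    <section>
      <h2>Shortform</h2>
      {posts.map((post) => (
        <article key={post.id}>
          <time dateTime={post.postedAt}>{post.postedAt}</time>
          <p>{post.excerpt}</p>
        </article>
      ))}
    </section>
  );
}
```

The hard part isn't code like this; it's the review/merge overhead and the queue of competing work.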
I think Habryka believes this just isn't the most important thing missing from LW, and that keeping an eye on bigger bottlenecks/opportunities is more important. I think Jimrandomh thinks it's important to make this sort of small feature improvement, but also that there are a bunch of other small feature improvements that need doing (as well as big feature improvements that take up a lot of cognitive attention).
There's also a bit of organizational dysfunction, and/or "the cost of information flow and decisionmaking flow is legitimately 'real'".
Something about all this is immensely dissatisfying to me, but it seems like a brute fact about how hard things are. LW is a small team. Apple is a much larger organization that probably pays much higher decisionmaking overhead costs.
I think the bridge from "GPT is really impressive" to "GPT successfully summarizes research reports for you" is a much harder problem than adding f.lux to macOS or adding shortform to a User Profile. Also, the teams capable of doing it are mostly working on the next cool research thing. Also, InstructGPT totally does exist, but each major productization is a lot of gnarly engineering work (and again, the people with the depth of understanding to do it are largely busy).
Note that this is also where some of my "somewhat longer AGI timelines" beliefs come from (i.e., 7 years seems more like the minimum to me, whereas I know a couple people who list that as more like a median).
It seems to me that most of the pieces of AGI exist already, but that actually getting from here to AGI will require 2-3 steps, and each step will probably turn out to require some annoying engineering work.
I wonder if there’s also some basic business domain expertise that generalizes here but hasn’t been developed yet. “How to use software to replace humans with spreadsheets” is a piece of domain expertise the SaaS business community has developed to the point where it gets pretty reliably executed. I don’t know that we have widespread knowledge of how to reliably turn models into services/products.
Riffing on the idea that “productionizing a cool research result into a tool/product/feature that a substantial number of users find better than their next best alternative is actually a lot of work”: it’s a lot less work in larger organizations with existing users numbering in the millions (or billions). But, as noted, larger orgs have their own overhead.
I think this predicts that most of the useful products built around deep learning that come out of larger orgs will have certain characteristics, like "is a feature that integrates with/enhances an existing product with lots of users" rather than "is a totally new product that was spun up incubator-style within the organization". This plays to the strengths of those orgs—having both datasets and users, playing better with the existing org structure and processes, being more incentive-aligned with the people who "make things happen", etc.
A couple examples of what I’m thinking of:
substantial improvements in speech recognition—productionized as voice assistant technology. It's now good enough that it's sometimes easier to use one than to do something by hand, like setting a timer/alarm/reminder/etc. while your hands are occupied with something else.
substantial improvements in image recognition—productionized as image search. I can search for “documents” in Google Photos, and it’ll pull up everything that looks like a document. I can more narrowly search “passport” and it’ll pull up pictures I took of my passport. I can search for “license plate” and it’ll pull up a picture I took of my license plate. I just tried searching for “animal” and it pulled up:
An animated gif of a dog with large glasses on it
Statues of men on horseback, as well as some sculptures of eagles
A bunch of fish in tanks
For structural reasons I’d expect “totally novel, standalone products” to come out of startups rather than larger organizations, but because they’re startups they lack many of the “hard things are easy” buttons that some larger orgs have.
This is what Elicit is working on, roughly.
I’d have gone with—it can take a long time for a society to adapt to a new technology.