Good question. As an overly-specific but illustrative example, let’s say that the things David and I have been working on the past month (broadly in the vein of retargeting) go unexpectedly well. What then?
The output is definitely not “1-5 concrete suggestions that you want AGI developers to implement”. The output is “David and John release a product to control an image generator/language model via manipulating its internals directly, this works way better than prompting, consumers love it and the major labs desperately play catch-up”.
More general background principle: alignment is a bottleneck to economic value for neural nets. If we’re able to do alignment qualitatively better, then that’s going to have a very big market. We will not be trying to convince the major labs to adopt our ideas; the major labs will be offering us money to let them adopt those ideas (i.e. acquisition offers), and also trying to reverse-engineer whatever we’re doing. If we release the methods publicly, they’ll be widely adopted within weeks. Or we’ll just grab a big market share ourselves; that works too.
Our main job will be to get that first product to market, in order to legibly prove that the methods work in practice.