Over the last 3 months, I’ve spent some time thinking about mech interp as a for-profit service. I’ve pitched to one VC firm, interviewed for a few incubators/accelerators including Y Combinator, sent out some pitch documents, gone on co-founder dates with a few potential cofounders, and chatted with potential users and some AI founders.
There are a few issues:
First, as you mention, I’m not sure if mech interp is yet ready to understand models. I recently interpreted a 1-layer model trained on a binary classification function (https://www.lesswrong.com/posts/vGCWzxP8ccAfqsrS3/thoughts-about-the-mechanistic-interpretability-challenge-2) and am currently working on understanding a 1-layer language model (TinyStories-1Layer-21M). TinyStories is (much?) harder than the binary classification network (which took 24 focused days of solo research). This isn’t to say I or someone else won’t have an idea how 1-layer models work a few months from now. Once this happens, we might want to interpret multi-layer models before being ready to interpret models that are running in production.
Second, outsiders can observe that mech interp might not be far enough along to build a product around. The feedback I received from the VC firm and YC was that my ideas weren’t far enough along.
Third, I personally have not yet been able to find someone I’m excited to cofound with. Some people have different visions in terms of safety (some just don’t care at all). Others share my vision, but we don’t match for other reasons.
Fourth, I’m not certain that I’ve yet found that ideal first customer. Some people seem to think interpretability is nice to have, but frequently with language models, if you get a bad output, you can just run it again (keeping a human in the loop). To be clear, I haven’t given up on finding that ideal customer; it could be something like government, or that customer might not exist until AI models do something really bad.
Fifth, I’m unsure if I actually want to run a company. I love doing interp research and think I am quite good at it (among other things, I have a software background and a PhD in Robotics, and I solve puzzles). I consider myself a 10x+ engineer. At least right now, it seems like I can add more value by doing independent research rather than running a company.
For me, the first issue is the main one. Once interp is farther along, I’m open to putting more time into thinking about the other issues. If anyone reading this is potentially interested in chatting, feel free to DM me.