Was on Vivek Hebbar’s team at MIRI, now working with Adrià Garriga-Alonso on various empirical alignment projects.
I’m looking for projects in interpretability, activation engineering, and control/oversight; DM me if you’re interested in working with me.
I have signed no contracts or agreements whose existence I cannot mention.
Jane at FakeLab has a background in interpretability but is currently wrangling data / writing internal tooling / doing some product thing because the company needs her to; otherwise FakeLab would have no product and would be unable to continue operating, including its safety research. Steve has comparative advantage at Jane’s current job.
It seems net bad because the good effect of slowing down OpenAI is smaller than the bad effect of GM racing? But OpenAI probably is slowed down: they were already trying to build AGI, and now they have less money and possibly less talent. Thinking about the net effect is complicated and I don’t have time to do it here. The situation with joining a lab rather than founding one may also be different.