learn math or hardware
Project proposal: EpochAI for compute oversight
Detailed MVP description: a website with an interactive map showing the locations of high-risk datacenters globally, with relevant information appearing when you click on a datacenter’s icon. Examples of relevant information: the organizations and frontier labs that have access to this compute, the effective FLOPS of the datacenter, and how long it would take to train a SOTA model there.
High-risk datacenters are datacenters capable of training current or next-generation SOTA AI systems.
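To make the “how long would it take to train a SOTA model here” field concrete, here’s a minimal sketch of the per-datacenter estimate the map could surface. The Datacenter structure, the ~1e25 FLOP budget for a GPT-4-scale run, and the 40% utilization figure are all illustrative assumptions, not claims about any real facility.

```python
# Hypothetical sketch of the per-datacenter estimate the map would surface.
# All numbers below (peak FLOP/s, utilization, training FLOP budget) are
# illustrative assumptions, not measured values.

from dataclasses import dataclass

@dataclass
class Datacenter:
    name: str
    lat: float
    lon: float
    operators: list[str]      # orgs / frontier labs with access to this compute
    peak_flops: float         # peak throughput in FLOP/s
    utilization: float = 0.4  # assumed model FLOP utilization during training

    @property
    def effective_flops(self) -> float:
        # Effective throughput actually usable for a training run.
        return self.peak_flops * self.utilization

    def days_to_train(self, training_flop: float) -> float:
        """Days needed to run a training job of `training_flop` total FLOP."""
        return training_flop / self.effective_flops / 86_400


# Illustrative entry: name, coordinates, and capacity are made up for the sketch.
example = Datacenter(
    name="Example Campus",
    lat=45.6,
    lon=-121.2,
    operators=["ExampleLab"],
    peak_flops=1e19,  # ~10 exaFLOP/s peak, assumed
)

# Assume a GPT-4-scale run costs on the order of 1e25 FLOP (rough public estimate).
print(f"{example.days_to_train(1e25):.0f} days")  # ~29 days at 40% utilization
```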
Why:
I’m unable to find a ‘single point of reference’ for information about the number and locations of datacenters that are high risk.
AFAICT Epoch focuses more on tracking SOTA model details instead of hardware related information.
This seems extremely useful for our community (and policymakers) to orient to compute regulation possibilities and their relative prioritization compared to other interventions.
Thoughts? I’ve been playing around with the idea of building it, but have been uncertain about how useful this would be, since I don’t have enough interaction with the AI alignment policy people here. Posting it here is an easy test to see whether it is worth greater investment or prioritization.
Note: I’m uncertain whether dual-use issues exist here. I expect that datacenter builders and frontier labs already have a very good model of the global compute distribution, so this would benefit regulatory efforts significantly more than it would help anyone allocate training compute more strategically.
Neuro-sama is a limited scaffolded agent that livestreams on Twitch, optimized for viewer engagement (so it speaks via TTS, it can play video games, etc.).
Schelling points in the AGI policy space
Well, at least a subset of the sequence focuses on this. I read the first two essays and was pessimistic enough about the titular approach that I moved on.
Here’s a relevant quote from the first essay in the sequence:
Furthermore, most of our focus will be on ensuring that your model is attempting to predict the right thing. That’s a very important thing almost regardless of your model’s actual capability level. As a simple example, in the same way that you probably shouldn’t trust a human who was doing their best to mimic what a malign superintelligence would do, you probably shouldn’t trust a human-level AI attempting to do that either, even if that AI (like the human) isn’t actually superintelligent.
Also, I don’t recommend reading the entire sequence, if that was an implicit question you were asking. It was more of a “Hey, if you are interested in seeing this scenario fleshed out with significantly greater rigor, you may want to take a look at this sequence!”
Evan Hubinger’s Conditioning Predictive Models sequence describes this scenario in detail.
There’s generally a cost to managing people and onboarding newcomers, and I expect that offering to volunteer for free is usually a negative signal, since it implies that there’s a lot more work than usual that would need to be done to onboard this particular newcomer.
Have you experienced otherwise? I’d love to hear some specifics as to why you feel this way.
I think we’ll have bigger problems than just solving the alignment problem, if we have a global thermonuclear war that is impactful enough to not only break the compute supply and improvement trends, but also destabilize the economy and geopolitical situation enough that frontier labs aren’t able to continue experimenting to find algorithmic improvements.
Agent foundations research seems robust to such supply chain issues, but I’d argue that gigantic parts of the (non-academic, non-DeepMind-specific) conceptual alignment research ecosystem are extremely dependent on a stable and relatively resource-abundant civilization: LW, EA organizations, EA funding, individual researchers having the slack to do research, the ability to communicate with each other and build on each other’s research, etc. Taking a group of researchers and isolating them in some nuclear-war-resistant country is unlikely to lead to an increase in marginal research progress in that scenario.
Thiel has historically expressed disbelief about AI doom, and has been more focused on trying to prevent civilizational decline. From my perspective, it is more likely that he’d fund an organization founded by people with accelerationist credentials than one founded by someone who was part of a failed coup attempt that, to him, would look like it was motivated by a sincere belief that the alignment problem is extremely difficult.
I’d love to read an elaboration of your perspective on this, with concrete examples, which avoids focusing on the usual things you disagree about (pivotal acts vs. pivotal processes, social facets of the game are important for us to track, etc.) and mainly focuses on your thoughts on epistemology and rationality and how they deviate from what you consider the LW norm.
I started reading your meta-rationality sequence, but it ended after just two posts without going into details.
David Chapman’s website seems like the standard reference for what the post-rationalists call “metarationality”. (I haven’t read much of it, but the little I read made me somewhat unenthusiastic about continuing).
Note that the current power differential between evals labs and frontier labs is such that I don’t expect evals labs have the slack to simply state that a frontier model failed their evals.
You’d need regulation with serious teeth and competent ‘bloodhound’ regulators watching the space like a hawk, for such a possibility to occur.
I just encountered polyvagal theory and I share your enthusiasm for how useful it is for modeling other people and oneself.
Note that I’m waiting for the entire sequence to be published before I read it (past the first post), so here’s a heads up that I’m looking forward to seeing more of this sequence!
I think Twitter systematically underpromotes tweets with links external to the Twitter platform, so reposting isn’t a viable strategy.
Thanks for the link. I believe I read it a while ago, but it is useful to reread it from my current perspective.
trying to ensure that AIs will be philosophically competent
I think such scenarios are plausible: I know some people argue that certain decision theory problems cannot be safely delegated to AI systems, but if we as humans can work on these problems safely, I expect that we could probably build systems that are about as safe (by crippling their ability to establish subjunctive dependence) but are also significantly more competent at philosophical progress than we are.
Leopold’s interview with Dwarkesh is a very useful source of what’s going on in his mind.
What happened to his concerns over safety, I wonder?
He doesn’t believe in a ‘sharp left turn’, which means he doesn’t consider general intelligence to be a discontinuous (latent) capability spike such that alignment becomes significantly more difficult after it occurs. To him, alignment is simply a somewhat harder empirical techniques problem, like capabilities work is. I assume he imagines behavior similar to current RLHF-ed models even as frontier labs double or quadruple the OOMs of optimization power applied to the creation of SOTA models.
He models (incrementalist) alignment research as “dual use”, and therefore effectively treats capabilities and alignment as the same measure.
He also expects humans to continue to exist once certain communities of humans achieve ASI, and imagines that the future will be ‘wild’. This is a very rare and strange model to have.
He is quite hawkish: he is incredibly focused on China not stealing AGI capabilities, and believes that private labs will be too incompetent to defend against Chinese infiltration. He prefers that the US government take over AGI development so that it can race effectively against China.
His model for take-off relies quite heavily on “trust the trendline”, estimating linear intelligence increases with more OOMs of optimization power (linear with respect to human intelligence growth from childhood to adulthood). It’s not the best way to extrapolate what will happen, but it is a sensible, concrete model he can use to talk to normal people and sound confident rather than vague, a key skill if you are an investor, and an especially key skill for someone trying to make it in the SF scene. (Note that he clearly states in the interview that he’s describing his modal model for how things will go, and that he has uncertainty over how things will occur, but wants to be concrete about his modal expectation.)
He has claimed that running a VC firm means he can essentially run it as a “think tank” too, focused on better modeling (and perhaps influencing) the AGI ecosystem. Given his desire for a hyper-militarization of AGI research, it makes sense that he’d try to steer things in this direction using the money and influence he will have and build as the founder of an investment firm.
So in summary, he isn’t concerned about safety because he prices it in as about as difficult as (or slightly more difficult than) capabilities work. This puts him in an ideal epistemic position to run a VC firm for AGI labs: his optimism is what persuades investors to give him money, since they expect him to try to return them a profit.
Oh, by that I meant something like “yeah I really think it is not a good idea to focus on an AI arms race”. See also Slack matters more than any other outcome.
If Company A is 12 months from building Cthulhu, we fucked up upstream. Also, I don’t understand why you’d want to play the AI arms race—you have better options. They expect an AI arms race. Use other tactics. Get into their OODA loop.
Unsee the frontier lab.
If you like The Dream Machine, you’ll also like Organizing Genius.