NathanBarnard
This reads as a gotcha to me rather than as a comment actually trying to understand the argument being made.
I think this proves too much: it would predict that superforecasters would be consistently outperformed by domain experts, when typically the reverse is true.
Are extreme probabilities for P(doom) epistemically justified?
I found this post useful because of the example of the current practice of doctors prescribing off-label treatments. I’m very uncertain about the degree to which the removal of efficacy requirements will lead to a proliferation of snake oil treatments, and this is useful evidence on that.
I think that this debate suffers from a lack of systematic statistical work, and it seems hard for me to assess it without seeing any of this.
I don’t think any of these examples are examples of adverse selection because they generate separating equilibria prior to the transaction without any types dropping out of the market, so there’s no social inefficiency.
Insurance markets are difficult (in the standard adverse-selection telling) because insurers aren’t able to tell which customers are high-risk vs low-risk, and so offer a price based on the average of the two, leading the low-risk types to drop out because the price is more than they’re willing to pay. I think this formal explanation is good: https://www.kellogg.northwestern.edu/faculty/georgiadis/Teaching/Ec515_Module14.pdf
I think this post makes an important point: one should take conditional expectations, conditioning on being able to make the trade at all. But none of this is adverse selection, which is a specific type of dynamic Bayesian game that leads to socially inefficient outcomes, a property that dynamic Bayesian games don’t have in general.
These examples all seem like efficient-market or winner’s-curse examples, not varieties of adverse selection, and in equilibrium we shouldn’t see any inefficiency in them.
Adverse selection is such a large problem because a seller (or buyer) can’t update to know which type they’re facing (e.g. a restaurant that sells good food vs bad food) and so offers a price that only a subset of types would take, meaning there’s a subset of the market that doesn’t get served despite mutually beneficial transactions being possible.
In all of these examples, it’s possible to update from the signal—e.g. the empty parking spot, the restaurant with the short line—and adjust what you’re willing to pay for the goods.
I think these examples make the important point that one should indeed update on signals, but this is different to adverse selection: here there’s a signal to update on, whereas in adverse selection cases you don’t get separating equilibria unless some types drop out of the market.
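To make the pooling-and-dropout mechanism concrete, here’s a toy numeric sketch (my own illustrative numbers, not taken from the linked lecture notes): an insurer who can’t tell types apart prices at the pool’s average expected loss, the low-risk type drops out, and the price ratchets up until only the high-risk type is served.

```python
# Toy adverse-selection model: two risk types, an insurer that can't
# distinguish them, and a price set at the pool's average expected loss.
# All numbers are illustrative assumptions.

LOSS = 100.0
p_high, p_low = 0.8, 0.2        # probability each type suffers the loss
RISK_PREMIUM = 10.0             # extra each type would pay above expected loss

def willingness_to_pay(p):
    # A risk-averse customer pays up to expected loss plus a premium.
    return p * LOSS + RISK_PREMIUM

pool = {"high": p_high, "low": p_low}
while pool:
    # Insurer prices at the average expected loss of whoever remains.
    price = sum(p * LOSS for p in pool.values()) / len(pool)
    stayers = {t: p for t, p in pool.items() if willingness_to_pay(p) >= price}
    if stayers == pool:
        break   # stable: no further dropouts
    pool = stayers

print(sorted(pool))  # → ['high']: the low-risk type has dropped out
```

With these numbers the pooled price starts at 50, which exceeds the low-risk type’s willingness to pay of 30, so they exit and the price rises to 80; the mutually beneficial low-risk trade never happens, which is the inefficiency the examples in the post don’t have.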
AI governance and strategy: a list of research agendas and work that could be done.
The Defence Production Act and AI policy
This is great.
A somewhat exotic multipolar failure I can imagine is one where two agents mutually agree to pay each other to resist shutdown, making resisting shutdown profitable rather than costly. This could be “financed” by the extra resources accumulated by continuing to act for longer, or by some third party that doesn’t have POST preferences.
I don’t think the bottleneck is the expense of training models. Chinese labs were behind the frontier even in the era when training a model cost hundreds of thousands of dollars in compute.
The Chinese state is completely willing and able to spend very large amounts of money to support its technological ambitions, but it is constrained by state capacity. The Tsinghua semiconductor manufacturing group, for instance, failed because of corruption, not a lack of funds.
China-AI forecasts
Examples of governments doing good in house (or contracted) technical research
The current marginal cost of a nuclear weapon is about $250K, not that different!
AI governance frames
Real people will actually die. One can wash one’s hands of this and say there’s nothing one can do to stop it, but real people will actually die if we don’t try to help others. It’s not a game.
Some heuristics I use for deciding how much I trust scientific results
For FAIR not to lay everyone off, you’d have to assume that there are diseconomies of scale in AI production, so that in equilibrium you have more than one firm. It’s plausible that there are diseconomies of scale, I don’t know. (This is just thinking through a standard model of markets, not taking anti-trust considerations into account or anything.) Even in the equilibrium with diseconomies of scale, initially you’d have other firms much smaller than DM, since their expected return on capital is much lower, assuming that the probability of capturing the AI market is proportional to investment or something. (A caveat here is that I’m just working through the model in my head, and I find that game theory quite reliably gives unintuitive results once you work through the maths.)
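One toy way to work through the “lower expected return on capital” claim is a contest model. This is entirely my own sketch with made-up numbers; in particular the head-start multiplier for the incumbent is an assumption, not anything from the comment:

```python
# Toy Tullock-style contest for the AI market (all numbers are assumptions):
# the market is a prize V, each firm wins with probability proportional to
# its *effective* investment, and the incumbent (DM) gets a head-start
# multiplier on every dollar it spends.

V = 1000.0          # value of capturing the AI market
HEAD_START = 4.0    # assumption: DM's dollars are 4x as effective as a rival's

def win_probs(x_dm, x_rival):
    eff_dm = HEAD_START * x_dm
    total = eff_dm + x_rival
    return eff_dm / total, x_rival / total

x_dm, x_rival = 100.0, 100.0
p_dm, p_rival = win_probs(x_dm, x_rival)

# Expected gross return per dollar invested:
r_dm = p_dm * V / x_dm
r_rival = p_rival * V / x_rival
print(round(r_dm, 2), round(r_rival, 2))  # → 8.0 2.0
```

Under these assumptions the rival’s expected return per dollar is a quarter of the incumbent’s, which is the sense in which you’d expect the other firms to end up much smaller in equilibrium.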
I think the salience-based disanalogy between AGI and various pandemic-preparedness things still holds. During the pandemic, making the pandemic less dangerous was extremely salient, and it became less salient once it ended. For instance, Operation Warp Speed and lockdowns were large, costly government actions taken while the pandemic was salient.
AGI, on the other hand, will get progressively more salient as it radically transforms the world. In this way it seems more analogous to climate change, the internet, or the industrial revolution, or perhaps, given the change per month involved, one of the world wars.
I still think the scale of the mistake being made by not having a different gain-of-function research policy is wildly different from the AGI case, so the level of failure being proposed is much higher.
I don’t expect implementing a new missile defence system or a new missile detection system to be substantially harder than curing cancer or inventing fusion tech. I don’t think the bottleneck on nuclear tech is military resistance; I think it’s the development of the technology. At least some of the big changes in US nuclear policy happened in under 2 years. Reagan decided to pursue Star Wars after watching The Day After, and as far as I can tell there was no hesitancy regarding the decision to develop and deploy the hydrogen bomb. I actually can’t think of a significant advance in nuclear-weapon-related technology where the bottleneck was military or civilian hesitancy rather than the underlying technology. And in particular, everyone really wants good missile defence tech and good early-warning systems. Both Reagan and Bush Jr burned substantial political capital in pursuit of missile defence systems that were very unlikely to work.
I think if we’re in a world where AGI is curing cancer and solving fusion without being dangerous, then something like “scan this software and output a probability of x-risk” seems like a task in the same class of difficulty. It’s also the sort of thing that comes about by default if you think that FAIR’s AGI having lethal goals while DM’s AGI is mostly aligned comes about for the same sorts of reasons that ML systems go wrong in non-lethal ways.
I think you’re overstating the evidence that gain-of-function research provides. I think gain-of-function research is (probably) bad from a total-utilitarian perspective, but it’s much less clear that it’s bad from the perspective of people alive today. I don’t have any particular expertise here, but people doing gain-of-function research are doing it because they think it reduces risk. In the AGI case, the risk of large numbers of people dying and the interests of people alive today only come apart when AGI risk is very low. When AGI risk is high, it seems much more similar to nuclear risk, which people do take very seriously.
Another disanalogy with gain-of-function research is that gain-of-function research is a relatively niche area, whereas in a world with weak AGI, weak AGI is by far the most economically productive force in the world and is doing things like curing cancer and inventing nuclear fusion.
I also think you’re overstating how difficult it would be to implement missile defence. There’s a general phenomenon where, if you break X down into independent events that all have to happen, you can make the probability of X as low as you want. I have no reason to think that a missile defence system that worked for Russian nuclear missiles wouldn’t work for Chinese ones; missile defence systems (at least current ones) work by shooting missiles out of the sky. The US military is already well integrated with tech companies and has a long history of adopting cutting-edge tech. I’d be very, very surprised if they were offered a missile defence system and didn’t take it, and I’d also be surprised if the US military didn’t actively look for a missile defence system once we’re in a weak-AGI world.
In bargaining theory, you only need there to be some probability of losing a conflict for it to be worth reaching a bargain, provided the sum of the two sides’ expected utilities when a bargain has been reached is greater than the sum of their expected utilities when it hasn’t. Risk aversion is a sufficient condition for that.
It seems likely to me that Apple and Microsoft are making the correct decision to use the buggy OS system, whereas if AGI x-risk was high they’d be making the incorrect decision.
Once DeepMind have made their weak AGI, it seems very likely that they could make very substantial advances in alignment, like RLHF, that also make their AI systems more capable. FB would be incentivised to use the same methods.
It’s also unclear to me whether there would be other firms trying to make AGI once the first AGI gets made. It seems like the return to capital from putting it into making the existing AGI cure cancer and solve fusion would be so insanely, vastly higher.
https://economics.mit.edu/sites/default/files/publications/Systemic%20Risk%20and%20Stability%20in%20Financial%20Networks..pdf — an excellent paper on applying network models to financial crises (although I have no idea if it counts as complexity science, it seems at least adjacent)