If an AI system (or systems) goes wrong in the near term and causes harm to humans in a way that is consistent with or supportive of alignment being a big deal, what might it look like?
I’m asking because I’m curious about potential fire-alarm scenarios (including things which just help to make AI risks salient to the wider public), and also looking to operationalise a forecasting question which is currently drafted as
By 2032, will we see an event precipitated by AI that causes at least 100 deaths and/or at least $1B 2021 USD in economic damage?
to allow a clear and sensible resolution.
Boeing MCAS (https://en.wikipedia.org/wiki/Maneuvering_Characteristics_Augmentation_System) is blamed for more than 100 deaths. How much “AI” would a similar system need to include for a similar tragedy to count as “an event precipitated by AI”?
Great point. I’m not sure whether MCAS contained aspects similar enough to AI to resolve such a question. This source doesn’t think it counts as AI (though it doesn’t provide much of an argument for this), and I can’t find any reference to machine learning or AI on the MCAS page. That said, one could clearly use AI tools to develop an automated control system like this, and I don’t feel well positioned to judge whether it should count.
To clarify: I do not think MCAS specifically is an AI-based system. I was thinking of a hypothetical future system that does include a weak AI component, but where, similarly to ACAS, the issue is not so much a flaw in the AI itself as how it is used within a larger system.
In other words, I think your test needs to distinguish between a situation where a trustworthy AI was needed and the actual AI was unintentionally/unexpectedly untrustworthy, versus a situation where the AI perhaps performed reasonably well but the way it was used was problematic, causing a disaster anyway.
Such scenarios are at best smoke, not fire alarms.
The article convincingly makes the weaker claim that there’s no guarantee of a fire alarm, and provides several cases which support this. I don’t buy the claim (which the article also tries to make) that there is no possible fire alarm, and such a claim seems impossible to prove anyway.
In any case, whether it’s smoke or a fire alarm doesn’t really address the specific question I’m asking.
AI systems find ways to completely manipulate some class of humans, e.g. by making them addicted. Arguably, this is already happening on a wide scale, though to a milder degree: people becoming “addicted” to algorithmically generated feeds.
Maybe the question could be made concrete in terms of the average amount of time people spend on their devices?
That seems like a different question, and one only partially entangled with AI: more screen time doesn’t necessarily need to be caused by AI, and the harms are harder to evaluate (even the sign of the value of “more screen time” is probably disputed).