Do you have a success story for how humanity can avoid this outcome? For example what set of technical and/or social problems do you think need to be solved? (I skimmed some of your past posts and didn’t find an obvious place where you talked about this.)
I do not, but thanks for asking. To give a best efforts response nonetheless:
David Dalrymple's Open Agency Architecture is probably the best I've seen in terms of a comprehensive statement of what's needed technically, but it would need to be combined with global regulations limiting compute expenditures in various ways, including record-keeping and audits on compute usage. I wrote a little about the auditing aspect with some co-authors, here:
https://cset.georgetown.edu/article/compute-accounting-principles-can-help-reduce-ai-risks/
… and was pleased to see Jason Matheny advocating from RAND that compute expenditure thresholds should be used to trigger regulatory oversight, here:
https://www.rand.org/content/dam/rand/pubs/testimonies/CTA2700/CTA2723-1/RAND_CTA2723-1.pdf
My best guess at what’s needed is a comprehensive global regulatory framework or social norm encompassing all manner of compute expenditures, including compute expenditures from human brains and emulations but giving them special treatment. More specifically-but-less-probably, what’s needed is some kind of unification of information theory + computational complexity + thermodynamics that’s enough to specify quantitative thresholds allowing humans to be free-to-think-and-use-AI-yet-unable-to-destroy-civilization-as-a-whole, in a form that’s sufficiently broadly agreeable to be sufficiently broadly adopted to enable continual collective bargaining for the enforceable protection of human rights, freedoms, and existential safety.
That said, it’s a guess, and not an optimistic one, which is why I said “I do not, but thanks for asking.”
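To make the compute-accounting and threshold ideas above a bit more concrete, here is a minimal sketch in Python of what a threshold-triggered reporting check might look like. Everything in it is hypothetical: the 1e26-FLOP trigger, the record fields, and the cross-check are illustrative assumptions, not figures or procedures taken from the CSET article or the RAND testimony.

```python
from dataclasses import dataclass

# Hypothetical reporting threshold: training runs above this many FLOP would
# trigger regulatory oversight. The number is illustrative, not a figure taken
# from the CSET article or the RAND testimony linked above.
AUDIT_THRESHOLD_FLOP = 1e26


@dataclass
class TrainingRunReport:
    """A minimal, hypothetical compute-accounting record for one training run."""
    organization: str
    model_name: str
    total_flop: float             # total floating-point operations claimed
    accelerator_hours: float      # hardware-hours, for cross-checking the claim
    peak_flop_per_second: float   # rated throughput of the hardware used


def requires_oversight(report: TrainingRunReport) -> bool:
    """Flag runs whose reported compute expenditure crosses the audit threshold."""
    return report.total_flop >= AUDIT_THRESHOLD_FLOP


def flop_claim_is_consistent(report: TrainingRunReport, slack: float = 1.5) -> bool:
    """Cross-check the FLOP claim against hardware-hours and rated throughput.

    A real audit would rely on logs and attestation rather than one inequality;
    this only illustrates the kind of record-keeping arithmetic involved.
    """
    implied_upper_bound = report.accelerator_hours * 3600 * report.peak_flop_per_second
    return report.total_flop <= implied_upper_bound * slack


if __name__ == "__main__":
    # Made-up numbers, chosen only to exercise both checks.
    report = TrainingRunReport(
        organization="ExampleLab",
        model_name="frontier-model-v1",
        total_flop=2e26,
        accelerator_hours=100_000_000,
        peak_flop_per_second=1e15,
    )
    print("Triggers oversight:", requires_oversight(report))           # True
    print("FLOP claim consistent:", flop_claim_is_consistent(report))  # True
```

A real regime would rest on logged evidence and third-party attestation rather than self-reported numbers; the point of the sketch is only that "record-keeping and audits on compute usage" can bottom out in simple, checkable arithmetic.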
It confuses me that you say “good” and “bullish” about processes that you think will lead to ~80% probability of extinction. (Presumably you think democratic processes will continue to operate in most future timelines but fail to prevent extinction, right?) Is it just that the alternatives are even worse?
Yes, and specifically worse even in terms of probability of human extinction.
In a previous comment you talked about the importance of “the problem of solving the bargaining/cooperation/mutual-governance problem that AI-enhanced companies (and/or countries) will be facing”. I wonder if you’ve written more about this problem anywhere, and why you didn’t mention it again in the comment that I’m replying to.
My own thinking about 'the ~50% extinction probability I'm expecting from multi-polar interaction-level effects coming some years after we get individually "safe" AGI systems up and running' is that if we've got "safe" AGIs, we could ask them to solve the "bargaining/cooperation/mutual-governance problem" for us, but that would not work if they're bad at solving this kind of problem. Bargaining and cooperation seem to be in part philosophical problems, so this fits into my wanting to make sure that we'll build AIs that are philosophically competent.
ETA: My general feeling is that there will be too many philosophical problems like these during and after the AI transition, and it seems hopeless to try to anticipate them all and solve them individually ahead of time (or to solve them later using only human intelligence). Instead we might have a better chance of solving the "meta" problem. Of course, buying time with compute regulation seems great if feasible.
"Yes, and specifically worse even in terms of probability of human extinction."
Why? I'm also kind of confused about why you even mention this issue in this post. Are you thinking that you might potentially be in a position to impose your views? Or is this a kind of plea for others who might actually face such a choice to respect democratic processes?