localdeity comments on Interacting with a Boxed AI

localdeity 2 Apr 2022 3:04 UTC
2 points
best way to use a sharply limited number of bits of information from a probably-Friendly superhuman AI
I figure we’d want it to answer restricted (essentially multiple-choice) questions. Come to think of it, if necessary we could play games of 20 Questions with it.
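To make the 20 Questions idea concrete, here is a minimal sketch, assuming the candidate answers can be put in a fixed order and the AI only ever answers yes or no. The names `identify` and `answers_yes` are hypothetical, made up for illustration. Each answer costs exactly one bit, so 20 answers can single out one item among 2^20 (about a million) candidates.

```python
def identify(n_candidates, answers_yes, max_questions=20):
    """Find a secret index in range(n_candidates) using only yes/no answers.

    answers_yes(mid) stands in for the boxed AI: it must answer the question
    "is the index you have in mind >= mid?" with True or False (one bit).
    """
    lo, hi = 0, n_candidates  # the secret index lies in [lo, hi)
    asked = 0
    while hi - lo > 1:
        if asked == max_questions:
            raise ValueError("question budget exhausted")
        mid = (lo + hi) // 2
        if answers_yes(mid):
            lo = mid  # secret is in the upper half
        else:
            hi = mid  # secret is in the lower half
        asked += 1
    return lo, asked


# Toy stand-in for the AI: it "knows" a secret index among 2**20 candidates.
secret = 777_777
found, questions_used = identify(2**20, lambda mid: secret >= mid)
print(found, questions_used)
```

The budget check is the point of the design: the questioner decides in advance how many bits may leave the box, and the loop refuses to ask for more.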
If the number of bits we can get is restricted, then we’d want to get the most valuable-but-safe bits possible from it. Very valuable bits that we don’t have would fall into two categories: (a) those we know are valuable but haven’t acquired because acquiring them is very expensive, and (b) those we don’t know are valuable. Finding (b) involves more free-form questions that could be dangerous. But (a) seems relatively simple and easy to extract value from, for certain values of “a superhuman AI” that include “able to simulate relevant systems”. For example:
“Given these descriptions of the SARS-CoV-2 virus and the human immune system, are you able to make decent predictions on how well humans’ immune systems would respond to the virus and vaccines? Respond Y/N.”
“Y”
“Here are 10 proposed vaccine formulations our researchers came up with. If we use a two-dose regimen with the doses separated by up to 12 weeks, then what is the best combination of vaccine and dose interval for minimizing chance of death from a SARS-CoV-2 infection on a random day up to 1 year after the first dose? Your answer should be of the form “x y”, where x is the index of the vaccine (from 1 to 10) and y is the dose interval in weeks (1 to 12).”
“6 8”
You would follow this up with an actual clinical trial before giving the vaccine to everyone, of course, and maybe also trial the humans’ best guess alongside it if you’re suspicious of the machine. But it seems clear that, by these steps, you’d have a lot of likely upside if the machine is good, and a relatively small downside (wasted effort and a delay to the normal solution) if the machine is treacherous.
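As a rough check on how few bits the exchange above actually uses, one can count the allowed answers. This is a small sketch that assumes every allowed answer is treated as equally likely beforehand, so the information in one answer is at most log2 of the number of options. The helper name `answer_bits` is made up for illustration.

```python
import math


def answer_bits(*option_counts):
    """Upper bound on the bits in one answer: log2 of the number of allowed replies."""
    return math.log2(math.prod(option_counts))


yes_no = answer_bits(2)        # the "Y"/"N" reply: 1 bit
vaccine = answer_bits(10, 12)  # 10 vaccines x 12 intervals = 120 possible replies
print(f"Y/N: {yes_no:.2f} bits, 'x y': {vaccine:.2f} bits, total: {yes_no + vaccine:.2f} bits")
```

So the whole dialogue fits in under 8 bits (1 for the Y/N, about 6.9 for the “x y” answer), which is the sense in which the question format keeps the channel narrow.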
Generally, if the machine can tell you the results of expensive experiments, and you can verify them much more cheaply than doing the entire set of experiments yourself, then that’s a good use case. (This is essentially the real-world analogue of NP, or “nondeterministic polynomial time”, problems from computing: a solution may be hard to find, but a proposed solution is cheap to check.)
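A toy illustration of that asymmetry, using subset sum as a stand-in for “expensive experiments” (the choice of problem and the function names are mine, not anything specific to the vaccine example): checking a claimed answer takes one pass over it, while finding an answer by brute force may require trying every subset.

```python
from itertools import combinations


def verify(numbers, target, chosen_indices):
    """Cheap check: are the indices valid and distinct, and do they sum to target?"""
    return (
        len(set(chosen_indices)) == len(chosen_indices)
        and all(0 <= i < len(numbers) for i in chosen_indices)
        and sum(numbers[i] for i in chosen_indices) == target
    )


def search(numbers, target):
    """Expensive search: try subsets of increasing size (exponential in the worst case)."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in indices) == target:
                return indices
    return None


numbers = [3, 34, 4, 12, 5, 2]
claimed = (2, 4)  # pretend this short answer came out of the box
print(verify(numbers, 9, claimed))
```

The untrusted party does the `search`; we only ever run `verify`, so a wrong or malicious answer costs us one cheap check rather than being accepted on faith.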