as I had understood Stuart’s article, the point was not to address decision theory, which is a mathematical subject, but instead that he hypothesized a scenario in which “the AI” was used to forecast possible future events, with humans in the loop doing the actual evaluation based on simulations realized in high detail, to the point that the future-world simulation would be as thorough as a film might be today, at which point it could appeal to people on a gut level and bypass their rational faculties
It’s true that Stuart wrote about Oracle AI in his Siren worlds post, but I thought that was mostly just to explain the idea of what a Siren world is. Later on in the post he talks about how Paul Christiano’s take on indirect normativity has a similar problem. Basically, the problem can occur if an AI tries to model a human as accurately as possible, then uses that model directly as its utility function and tries to find a feasible future world that maximizes it.
It seems plausible that even if the AI couldn’t produce a high-resolution simulation of a Siren world W, it could still infer (using various approximations and heuristics) that with high probability its utility function assigns a high score to W, and choose to realize W on that basis. It also seems plausible that an AI would eventually have enough computing power to produce high-resolution simulations of Siren worlds, e.g., after it has colonized the galaxy, so the problem could happen at that point if not before.
but also have a bunch of other extra-scary features above and beyond other scenarios of people being irrational, just because.
What extra-scary features are you referring to? (Possibly I skipped over the parts you found objectionable since I was already familiar with the basic issue and didn’t read Stuart’s post super carefully.)