I think you get it mostly right, and then you just draw a different conclusion.
The part where you agree is:
We do not have a scientific understanding of how to tell a superintelligent machine to [solve problem X, without doing something horrible as a side effect], because we cannot describe mathematically what “something horrible” actually means to us...
And the conclusion that AI safety people make is:
...and that is a problem, because in the coming years machines smarter than humans are likely to arrive, and they may do things with horrible side effects that their human operators will not predict.
While your conclusion seems to be:
...therefore people should be ashamed for talking about this topic.
So, if you want to be a proper Popperian, you probably need to sit and wait until actual superintelligent machines are made and actually start doing horrible things, and then (assuming that you survive) you can collect and analyze examples of the horrible things happening, propose falsifiable hypotheses on how to avoid these specific horrible things happening again, do the proper experiments, measure the p-values, and publish in respected scientific journals. This is how respectable people would approach the problem.
The alternative is to do the parts that you can do now… and handwave the rest, hoping that someone else will later fill in the missing pieces. For example, you can collect examples of surprising things that current (not superintelligent) machines do when solving problems. And the handwavy part is: “...and now imagine this, but extrapolated to a superintelligence”.
Or you can make a guess about which mathematical problems may turn out to be relevant for AI safety (although you cannot be sure you guessed right), and then work on those mathematical problems rigorously. In which case the situation is like: “yeah, this math problem is solved okay from the scientific perspective, it’s just its relevance for AI safety that is dubious”.
I am not familiar with AI safety research in depth, so I cannot provide more information about it. But my impression is that it is roughly the combination of what I just described: examples of potential problems (with non-superintelligent machines), plus mathematical details which may or may not turn out to be relevant.
The problem with “pop Popperianism” is that it describes what to do when you already have a scientific hypothesis fully formed. It does not concern itself with how to get to that point. Yes, the field of AI safety is currently mostly trying to get to that point. That is the inevitable first step.
We do not have a scientific understanding of how to tell a superintelligent machine to “solve problem X, without doing something horrible as a side effect”, because we cannot describe mathematically what “something horrible” actually means to us...
Where is this quote from? I don’t see it in the article or in the author’s other contributions.
Sorry, I used the quote marks just as… brackets, kind of?
(Is that too non-standard a usage? What is the proper way to put a clearly separated group of words into a sentence without making it seem like a quotation? Sometimes connecting-the-words-with-hyphens does the job, but it looks weird once the text gets longer.)
EDIT: Okay, replaced by actual brackets. Sorry for all the confusion I caused.
I expect most readers of your original comment indeed misinterpreted those quotes to be literal when they’re anything but. Maybe edit the original comment and add a bunch of “(to paraphrase)”s or “as I understand you”s?
I think in this case brackets is pretty good. I agree with Martin that it’s good to avoid using quote marks when it might be mistaken for a literal quote.
FWIW, I have a tendency to do quote-grouping for ideas sometimes too, but it’s pretty tough to read unless your reader has a good understanding of what you’re doing. Although it’s both ugly and unclear, I prefer to use square brackets, because then people at least know that I’m doing something weird, even though it still kinda looks like I’m [doing some weird paraphrasing thing].
I couldn’t click upvote hard enough. I’m always having this mental argument with hypothetical steelmanned opponents about stuff and AI Safety is sometimes one of the subjects. Now I’ve got a great piece of text to forward to these imaginary people I’m arguing with!
“Pseudoscience” is the kind of word that is both too broad and loaded with too many negative connotations. It covers both (say) intelligent design, with its desired results built in, and AI safety, which is striving towards… something. The word doesn’t seem useful for deciding which of the two you should take seriously.
I feel like I’ve read a post before about distinguishing between insert-some-pseudoscience-poppycock-here, and a “pseudoscience” like AI safety. Or, someone should write that post!
We do not have a scientific understanding of how to tell a superintelligent machine to “solve problem X, without doing something horrible as a side effect”, because we cannot describe mathematically what “something horrible” actually means to us...
Similar to how utility theory (from von Neumann and so on) is excellent science/mathematics despite our not being able to state what utility is. AI Alignment hopes to tell us how to align AI, not the target to aim for. Choosing the target is also a necessary task, but it’s not the focus of the field.
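To make that analogy concrete (my own gloss, not something stated in the thread): the von Neumann–Morgenstern representation theorem says that if an agent’s preferences over lotteries satisfy completeness, transitivity, continuity, and independence, then there exists a utility function u, unique up to positive affine transformation, such that

\[ p \succsim q \iff \sum_x p(x)\,u(x) \;\ge\; \sum_x q(x)\,u(x). \]

The theorem pins down how coherent preferences must relate to expected utility without ever saying what u “is” for any particular agent, which is the sense in which utility theory is good mathematics despite the undefined target.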
It is not a quote but a paraphrase of what the OP might agree with about AI safety.