As I read through the Agenda, I can hear Anna Salamon telling me something along the lines of: if you think something is a rational course of action, the antecedents to that course must necessarily be rational, or you are wrong. She doesn't explain it quite like that, and I can't find the popular thread where she does, but whatever...
Now, reviewing the research agenda itself, there are some things that concern me about their way of doing problem solving. I'd appreciate anyone's input, challenges, clarifications and additions:
...We focus on research that cannot be safely delegated to machines.
Nice sound bite. No quarrel with this; I just wanted to point it out.
No AI problem (including the problem of error-tolerant agent design itself) can be safely delegated to a highly intelligent agent that has incentives to manipulate or deceive its programmers.
For the same reason, I won't entrust the design of friendly AI to strangers at MIRI alone ;)
It would be risky to delegate a crucial task before attaining a solid theoretical understanding of exactly what task is being delegated.
This is the critical assumption behind MIRI's approach. Is there any reason to believe it is the case?
It may be possible to use our understanding of ideal Bayesian inference to task a highly intelligent system with developing increasingly effective approximations of a Bayesian reasoner, but it would be far more difficult to delegate the task of "finding good ways to revise how confident you are about claims" to an intelligent system before gaining a solid understanding of probability theory. The theoretical understanding is useful to ensure that the right questions are being asked.
shouldn’t establishing this be the very first item in the research agenda, before jumping in to problems they assume are solveable. In fact, the abscence of evidence for them being solveable should be evidence of absence...no?
When constructing intelligent systems which learn and interact with all the complexities of reality, it is not sufficient to verify that the algorithm behaves well in test settings. Additional work is necessary to verify that the system will continue working as intended in application.
Has it been demonstrated anywhere that formalisms are optimal for exception handling?
Because the stakes are so high, testing combined with a gut-level intuition that the system will continue to work outside the test environment is insufficient, even if the testing is extensive.
Is this a legitimate forced choice between pure mathematics on one side and gut-level intuition plus testing on the other?
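To spell out for myself what the quoted worry looks like in miniature, here is a deliberately silly sketch (mine, not MIRI's; the function and test cases are invented). It shows only that extensive-but-narrow testing can miss a failure, not that formal methods are the remedy:

```python
# Toy sketch: a routine that passes a narrow test suite and still violates
# its intended contract on inputs the tests never covered.

def normalise(scores):
    """Intended contract: return the scores rescaled into the range [0, 1]."""
    top = max(scores)
    return [s / top for s in scores]

# "Extensive" testing over the inputs the developer happened to imagine:
for case in ([1, 2, 3], [5, 5, 5], [0.1, 0.9], [10, 20, 40, 80]):
    assert all(0.0 <= x <= 1.0 for x in normalise(case))

# In application, untested inputs break the contract anyway:
print(normalise([-2.0, -1.0]))    # [2.0, 1.0], which falls outside [0, 1]
# normalise([0, 0])               # would raise ZeroDivisionError
```

Whether the fix is more testing, better statistics, or a proof is exactly the forced-choice question I'm asking about.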
MIRI alleges that a formal understanding is necessary for robust AI control, then defines formality as follows:
What constitutes a formal understanding? It seems essential to us to have both (1) an understanding of precisely what problem the system is intended to solve; and (2) an understanding of precisely why this practical system is expected to solve that abstract problem. The latter must wait for the development of practical smarter-than-human systems, but the former is a theoretical research problem that we can already examine.
The goal of much of the research outlined in this agenda is to ensure, in the domain of superintelligence alignment, where the stakes are incredibly high, that theoretical understanding comes first.
So first: why aren't they disproving Rice's theorem? And okay, show me some data from a very well-designed experiment suggesting that theory should come first for the safe development of technology.
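For reference, since I bring up Rice's theorem and the agenda doesn't state it, here is the standard statement (my addition, not theirs). Informally: no algorithm can decide any non-trivial property of what programs compute.

```latex
% Rice's theorem (standard statement; added here for reference, not from the agenda).
% Let \varphi_e denote the partial computable function computed by program e,
% and let P be any set of partial computable functions (a "semantic property").
\[
  \emptyset \neq \{\, e \in \mathbb{N} \mid \varphi_e \in P \,\} \neq \mathbb{N}
  \;\Longrightarrow\;
  \{\, e \in \mathbb{N} \mid \varphi_e \in P \,\}\ \text{is undecidable.}
\]
```

That is the standard obstacle to deciding behavioural properties of arbitrary programs; how it interacts with proving properties of one specially constructed system is the part I'd want spelled out.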
Honestly, all the MIRI maths and formal-logic fetishism got me impressed and awestruck. But I feel like their methodological integrity isn't tight; I reckon they need some quality statisticians and experiment designers to step in. On the other hand, MIRI runs a very, very good ship. They market well, fundraise well, movement-build well, community-build well, they design well, they write okay now (though not in the past!), they even get shit done, and they bring together very, very good abstract reasoners. And they have been instrumental, through LessWrong, in turning my life around.
In good faith,
Clarity, still trying to be the in-house red team and failing slightly less at it one post at a time.
You should worry more about whether MIRI's way of doing problems is a good way of solving hard problems than about how hard the problems are.
Problem difficulty is a constant you cannot affect; social structure is a variable.
So first, why aren’t they disproving Rice’s theorem?
Lots of this going on in the big wide world. Consider looking in more places to deal with selection bias issues.
thanks for the lead :) I’ll get on to it.
I mostly agree, but: You can affect “problem difficulty” by selecting harder or easier problems. It would still be right not to be discouraged about MIRI’s prospects if (1) the hard problems they’re attacking are hard problems that absolutely need to be solved or (2) the hardness of the problems they’re attacking is a necessary consequence of the hardness (or something) of other problems that absolutely need to be solved. But it might turn out, e.g., that the best road to safe AI takes an entirely different path from the ones MIRI is exploring, in which case it would be a reasonable criticism to say that they’re expending a lot of effort attacking intractably hard problems rather than addressing the tractable problems that would actually help.
MIRI would say they don’t have the luxury of choosing easier problems. They think they are saving the world from an imminent crisis.
They might well do, but others (e.g., Clarity) might not be persuaded.
We’ll see :)