The reason Person A in Scenario 2 has the intuition that Person B is very wrong is that there are dozens, if not hundreds, of examples where people claimed no vulnerabilities and were proven wrong, usually spectacularly so, and often nearly immediately. Consider that the most robust software developed by the wealthiest and most highly motivated companies in the world, which employ vast teams of talented software engineers, still requires monthly patch schedules to fix a constant stream of vulnerabilities, and I think it's pretty easy to immediately discount anybody's claim of software perfection without requiring any further evidence.
All the evidence Person A needs to discount Person B's claims is the complete and utter absence of anyone ever having achieved such a thing in the history of software.
I've never heard of an equivalent example for AI. It just seems to me like Scenario 2 doesn't apply, or at least it cannot apply at this point in time. Maybe in 50 years we'll have a vast swath of utter failures to point to, and thus a valid intuition against someone's nine-nines confidence of success, but we don't have that now. Otherwise people would be pointing to examples in these arguments instead of expressing vague unease about problem spaces.
Well, no one has built an AGI yet, and if your plan is to wait until we have years of experience with unaligned AGIs before it’s OK to start worrying about the problem, that’s a bad plan.
Also, there are things which are not AGI but which are similar in various ways (software, deep neural nets, rocket navigation mechanisms, prisons, childrearing strategies, tiger-training strategies) which provide ample examples of unseen errors.
Also, like I said, there ARE plenty of POCs for AGI risk.