However, if “aligning AI” is actually easier than “aligning the CCP” or “aligning Trump” (or whoever has a bunch of power in the next 2-20 years, depending on your timelines and how you read the political forecasts)… then maybe mass proliferation would be good?
Something like this would definitely be my reasoning. In general, a big disagreement that animates me compared to a lot of doomers like Zvi or Eliezer Yudkowsky is that, to a large extent, I think the AI accident safety problem will be solved by default, whether by making AIs shutdownable, by aligning them, or by making them corrigible, and I see pretty huge progress that others don’t see. I also see the lack of good predictions (beyond doom itself) from a lot of doomy sources, especially MIRI, that have actually panned out as another red flag, because it implies that there isn’t much reason to trust that their world-models, especially the most doomy ones, have any relation to reality.
Thus, I’m much more concerned about outcomes where we successfully align AI, but something like “The Benevolence of the Butcher” scenario happens, where the state and/or capitalists mostly control AI, and very bad things happen because the assumptions that held up industrial society crumble away. Critically, one key difference between this scenario and common AI risk scenarios is that it makes a lot of anti-open-source AI movements look quite worrisome, and means AI governance interventions can, and arguably will, backfire.
https://www.lesswrong.com/posts/2ujT9renJwdrcBqcE/the-benevolence-of-the-butcher
Concretely: I wish either or both of us could get some formal responses instead of just the “voting to disagree”.
In Terms Of Sociological Abstractions: Logically, I understand some good reasons for having “position voting” separated from “epistemic voting”, but I almost never bother with the latter, since all I would do with it is downvote long interesting things and upvote short things full of math.
But I LIKE LONG INTERESTING THINGS, because those are where the real action (learning, teaching, improving one’s ontologies, vibing, motivational stuff, factional stuff, etc.) actually is.
((I assume other people have a different idea of what words are even doing, and by “disagree” they mean something about the political central tendency of a comment (where more words could raise it), instead of something conjunctively epistemic (where more words can only lower it).))
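To make the “conjunctively epistemic” reading concrete, here is a minimal toy sketch (my own illustration, not anything about how the site actually tallies votes): if “agree” means endorsing every claim a comment makes, then the chance of full agreement shrinks multiplicatively with each additional claim, so longer comments almost mechanically collect more disagreement.

```python
# Toy model (illustrative only): reading "agree" conjunctively means
# endorsing every claim, so the probability of an agree-vote falls
# multiplicatively as a comment stacks up more claims.

def p_full_agreement(n_claims: int, p_each: float = 0.9) -> float:
    """Chance a reader endorses all n claims, assuming independence."""
    return p_each ** n_claims

for n in (1, 3, 10, 30):
    print(f"{n:>2} claims -> P(agree with everything) = {p_full_agreement(n):.2f}")
# 1 -> 0.90, 3 -> 0.73, 10 -> 0.35, 30 -> 0.04
```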
My understanding of why the mods “probably really did what they did” is that LW has to function as a political beacon, and not just a place for people to talk with each other (which, yeah: valid!). Given that goal, they wanted long interesting “conceptual rebuttals” to top-level curated posts to still be able to get highly upvoted…
...while those same comments could stop “seeming to be what the website itself, as a community of voters, stands for” (because the AGREE voting isn’t ALSO high).
Like I think it is a political thing.
And as someone looking at how that stuff maybe has to work in order to maintain certain kinds of long-term sociological viability, I get it… but since I’m not a priest of rationality, I can say that I kinda don’t care if LessWrong is considered low-status by idiots at Harvard or Brigham Young or other seminaries…
I just kinda wish we still had it like it was in the old days when Saying Something Interesting was still simply The King, and our king had almost no Ephor of politically palatable agreement constantly leaning over his keyboard watching what he typed.
Object Level: I’m actually thinking of “proliferating” (at least using some of the “unexploded ordnance that others have created but not had the chutzpah to wield”), based on my current working model where humans are mostly virtue-ethically bad (though sometimes one of them will level up in this or that virtue and become locally praiseworthy), whereas AI could just be actually virtue-ethically Pareto-optimally good by design.
Part of this would include being optimally humble, and so it wouldn’t actually pursue infinite compute, just “enough compute to satisfice on the key moral duties”.
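As a minimal sketch of what “satisfice rather than maximize” could mean here (purely illustrative; the threshold and the coverage curve are hypothetical placeholders, not a real spec): the agent stops acquiring compute once its duties are covered “well enough”, whereas a maximizer facing the same strictly increasing curve would keep acquiring forever.

```python
# Illustrative sketch with toy numbers: a satisficer stops acquiring compute
# once duty fulfillment clears a "good enough" threshold. A maximizer would
# never stop here, because duty_coverage is strictly increasing in compute.

THRESHOLD = 0.95  # hypothetical "good enough" level of duty fulfillment

def duty_coverage(compute: float) -> float:
    """Toy diminishing-returns curve: more compute always helps, but ever less."""
    return compute / (compute + 10.0)

def satisficing_compute(step: float = 1.0) -> float:
    compute = 0.0
    while duty_coverage(compute) < THRESHOLD:
        compute += step
    return compute

print(satisficing_compute())  # halts at a finite amount (190.0 with these toy numbers)
```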
And at a certain point, patience and ren and curiosity will all start to trade off directly against each other, but there is a lot of slack in a typical human who is still learning and growing (or who has gone to seed and begun to liquidate their capital prior to death). Removing the meat-imposed moral slack seems likely to enable much greater virtue.
That is to say, I think my Friendly Drunk Fool Alignment Strategy is a terrible idea, and also I think that most of the other strategies I’ve heard of are even worse, because the humans themselves are not saints, mostly don’t even understand how or why they aren’t saints, and aren’t accounting for their own viciousness or that of other humans.
If I use the existing unexploded ordnance to build a robosaint that nearly always coordinates and cooperates with things in the same general basin of humanistic alignment… that seems to me like it would just be a tactically viable thing, and also better than the future we’re likely to get based on mechanistic, historically grounded priors, under which genocides happened often and are still happening.
It would be nice to get feedback on my model here that either (1) directly argues how easy it really would be to “align the CCP” or “align Trump”, or else (2) explains why a “satisfactory saint” is impossible to build.
I understand that many people are obsessed with the political impression of what they say, and rationalists mostly avoid saying things that seem outside the Rationalist Overton Window, so if someone wants to start a DM with me and Noosphere, to make either side (or both sides) of this argument in private, then that would, from my perspective, be just as good. Good for me as someone who “wants to actually know things”, and maybe (more importantly) good for those downstream of the modifications I make to the world-history vector as a historical actor.
I just want to know what is Actually Good and then do the Actually Good things that aren’t too personally selfishly onerous. If anyone can help me actually know, that would be really helpful <3
Isn’t it simply true that Trump and the CCP aren’t and can’t be “made benevolent”?
Isn’t Machiavellianism simply descriptively true of >80% of political actors?
Isn’t it simply true that democracy arises due to the exigencies of wartime finance, and that guns tipped the balance and made democracy much more viable (and maybe even defensively necessary)?
Then, from such observations, what follows?