I am not as convinced that there don’t exist pivotal acts that are importantly easier than directly burning all GPUs (after which I might or might not then burn most of the GPUs anyway). There’s no particular reason humans can’t perform dangerous cognition without AGI help and do some pivotal act on their own; our cognition is not exactly safe. But if I did have such an idea that I thought would work, I wouldn’t write about it, and it most certainly wouldn’t be in the Overton window. Thus, I do not consider the failure of our public discourse to generate such an act to be especially strong evidence that no such act exists.
Given how central the execution of a pivotal act seems to be to our current best attempt at an alignment strategy (see point 6 of EY’s post), I was confused to find very little discussion of possible approaches here in the forum. Does the quote above already fully explain this (because all promising approaches are too far outside the Overton window to discuss publicly)? Or has no one gotten around to initiating such a conversation? Or, quite possibly, have I overlooked extensive discussions in this direction?
It seems to me that a long document collecting the 20 most commonly proposed approaches to such a pivotal act, together with an analysis of their strengths and weaknesses and the possibility to comment, could be quite valuable for people who want to start thinking about such approaches. There is also always the chance of someone having a really great idea (or of person A having a flawed idea containing the seed of a great one, which inspires person B to propose a fix). Would other people also find this useful?
On the other hand, given the possible downsides of such public discourse (proposals outside the Overton window representing a PR problem, or some proposals only being feasible if they are not publicly announced), are there other strategies for reaping the benefits of many people with different backgrounds thinking about this problem? One thing that comes to mind: a non-public essay contest in which people submit a description of a possible pivotal act together with their own analysis of its feasibility. The submissions could be read by a panel of trusted experts (trusted both for the competence of their judgement and for their confidentiality). Harmless but insightful entries could then be released to the public, dangerous and/or non-insightful ones could be returned to their authors with a brief explanation of why they are deemed a bad idea, and promising ones could be brought to the attention of people with the resources to pursue them further.
“Representing a PR problem” is an interesting choice of words. I wonder why that would be. Surely all pivotal acts that safeguard humanity long into the far future are entirely rational in explanation. Can you offer a reason for why a pivotal act would be a PR problem, or why someone would not want to tell people their best idea for such an act and would use the phrase “outside the Overton window” instead?
Surely all pivotal acts that safeguard humanity long into the far future are entirely rational in explanation.
I agree that in hindsight such acts would appear entirely rational and justified, but to avoid being a PR problem they must appear justified (or at least acceptable) beforehand to a member of the general public, a law-enforcement official, or a politician.
Can you offer a reason for why a pivotal act would be a PR problem, or why someone would not want to tell people their best idea for such an act and would use the phrase “outside the Overton window” instead?
To give one example: the oft-cited pivotal act of “using nanotechnology to burn all GPUs” is not something you could put on your company website as the official goal. If the public seriously believed that a group of people was pursuing this goal and had any chance of coming even close to achieving it, they would strongly oppose the plan. To even see why it might be a justified action, one needs to understand (and accept) many highly non-intuitive assumptions about intelligence explosions, orthogonality, etc.
More generally, I think many possible pivotal acts will be adversarial to some degree, since they are literally about stopping people from doing or getting something they want (building an AGI, reaping the economic benefits of using an AGI, etc.). There might be strategies for such an act that stay inside the Overton window (creating a superhuman propaganda-bot that convinces everyone to stop), but any strategy involving anything resembling force (like burning the GPUs) will run counter to established laws and social norms.
So I can absolutely imagine that someone has an idea for a pivotal act which, if posted publicly, could be used in a PR campaign by opponents of AI alignment (“look what crazy and unethical ideas these people are discussing in their forums”). That is why I was asking what forms of discourse would best avoid this danger.