Sammy Martin comments on Pivotal Acts might Not be what You Think they are

Sammy Martin 6 Nov 2023 14:14 UTC
6 points
0
I don’t like the term pivotal act because it implies, without justification, that the risk elimination has to be a single action. Depending on the details of takeoff speed, that may or may not be a requirement, but if takeoff takes months or longer then almost certainly there will be many actions, taken by humans and AIs of varying capabilities, that together incrementally reduce total risk to low levels. I talk about this in terms of ‘positively transformative AI’, as that term doesn’t bias you towards thinking this has to be a single action, even a nonviolent one.
Seeing the risk reduction as a single unitary action, like seeing it as a violent overthrow of all the world’s governments, also makes the term seem more authoritarian, crazy, fantastical and off-putting to anyone involved in real-world politics, so I’d recommend that in our thinking we both make the change you suggest and stop thinking of it as necessarily one action.
- Johannes C. Mayer 7 Nov 2023 12:35 UTC
  1 point
  0
  I think you are correct for a particular notion of pivotal act, one that I think is different from Eliezer’s notion. It’s certainly different from my notion.
  
  I find it pretty strange to say that the problem is that a pivotal act is a single action. Everything can be framed in terms of a single action.
  
  For any sequence of actions, e.g. [X, Y, Z], I can define a new action ω := [X, Y, Z], which executes X, then Y, and then Z. You can do the same for plans. The difference between plans and action sequences is that plans can have things like conditionals, for example choosing the next sequence of actions based on the current state of the environment. You could also say that a plan is a function that tells you what to do. Most often this function takes in your model of the world.
  
  So really you can see anything you could ever do as a single plan that you execute. If there are multiple steps involved, you simply give a new name to all these steps, such that you now have only a single thing. That is how I am thinking about it. Under this definition, we can have a pivotal act that is composed of many small actions that are distributed across a large timespan.
  
  The usefulness of the concept of a pivotal act comes from the fact that a pivotal act needs to be something that saves us with a very high probability. It’s not important at all that it happens suddenly, or that it is a single action. So your criticism seems to miss the mark. You are attacking the concept of a pivotal act for having properties that it simply does not have.
  
  “Upload a human” is something that requires many steps dispersed throughout time. We just use the name “Upload a human” such that we don’t need to specify all of these individual steps in detail. That would be impossible right now anyway, as we don’t know exactly how to do it.
  
  So if you provide a plan that is composed of many actions distributed throughout time, that will save us with a very high probability, I would count this as a pivotal act.
  
  Note that being a pivotal act is a property of a plan in relation to the territory. There can be a plan P that saves us when executed, but I might fail to predict this, i.e. it is possible to misestimate is_pivotal_act(P). So one reason for having relatively simple, abstract plans like “Upload a human” is that these plans specify a world state with easily visible properties. In the “Upload a human” example we would have a superintelligent human. Then we can evaluate the is_pivotal_act property based on the fact that we would have created a superintelligent human. I am heavily simplifying here, but I think you get the idea.
  
  I think your “positively transformative AI” just does not capture what a pivotal act is about (I haven’t read the article, I am guessing based on the name). You could have positively transformative AI that makes things better and better, by a lot. And then somebody builds a misaligned AGI and everybody dies. One doesn’t exclude the other.
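
The composition argument in the reply above (ω := [X, Y, Z], and plans as functions of the current state) can be sketched roughly as follows. This is only an illustration under assumptions: the world state is modelled as a plain list of strings, and the names `State`, `step`, `sequence`, `execute` and `conditional_plan` are hypothetical helpers introduced here, not anything defined in the thread.

```python
from typing import Callable, List

# Assumption: the world state is just a log of what has happened so far.
State = List[str]
Action = Callable[[State], State]   # an action maps a state to a new state
Plan = Callable[[State], Action]    # a plan picks an action given the state


def step(name: str) -> Action:
    """Build a primitive action that appends its name to the state."""
    return lambda state: state + [name]


def sequence(*actions: Action) -> Action:
    """Bundle several actions into one action that runs them in order."""
    def combined(state: State) -> State:
        for act in actions:
            state = act(state)
        return state
    return combined


def execute(plan: Plan, state: State) -> State:
    """Run a plan: ask it for an action, then apply that action."""
    return plan(state)(state)


X, Y, Z = step("X"), step("Y"), step("Z")
omega = sequence(X, Y, Z)  # the single action ω := [X, Y, Z]


def conditional_plan(state: State) -> Action:
    # A plan may branch on the state: skip X if it has already happened.
    return sequence(Y, Z) if "X" in state else omega


result_omega = omega([])
result_plan_fresh = execute(conditional_plan, [])
result_plan_after_x = execute(conditional_plan, ["X"])
```

Here `omega` is called exactly like a primitive action even though it runs three steps, which is the sense in which a multi-step plan can be treated as one thing. Whether that reframing answers the original objection is a separate question that the sketch does not settle.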