johnswentworth comments on Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure

johnswentworth 3 Oct 2024 16:30 UTC
10 points
7
Terminological note: something which does not buy ample time is not a pivotal act. Eliezer introduced the term to mean a specific thing, which he summarized as:
(as of late 2021) researchers use ‘pivotal’ and ‘pivotal act’ to refer to good events that upset the current gameboard—events that decisively settle a win, or drastically increase the probability of a win.
That same page also talks a bunch about how sticking to that definition is important, because there will predictably be lots of pressure to water the term down.
What links here?
- ThomasCederborg's comment on Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure by ThomasCederborg (EA Forum; 11 Oct 2024 1:19 UTC; 1 point)
- Martin Randall 5 Oct 2024 0:39 UTC
  6 points
  3
  Parent
  Something which might not buy ample time can still be a pivotal act. From the Arbital page that you link to:
  
  Example 3: Suppose a behaviorist genie is restricted from modeling human minds in any great detail, but is still able to build and deploy molecular nanotechnology. Moreover, the AI is able to understand the instruction, “Build a device for scanning human brains and running them at high speed with minimum simulation error”, and is able to work out a way to do this without simulating whole human brains as test cases. The genie is then used to upload a set of, say, fifty human researchers, and run them at 10,000-to-1 speeds.
  
  This accomplishment would not of itself save the world or destroy it—the researchers inside the simulation would still need to solve the alignment problem, and might not succeed in doing so.
  
  But it would (positively) upset the gameboard and change the major determinants of winning, compared to the default scenario where the fifty researchers are in an equal-speed arms race with the rest of the world, and don’t have practically-unlimited time to check their work. The event where the genie was used to upload the researchers and run them at high speeds would be a critical event, a hinge where the optimum strategy was drastically different before versus after that pivotal act.
  
  The Limited AI (LAI) scenario in this post is equivalent to this example and therefore qualifies as a Pivotal Act under the Arbital Guarded Definition. Additionally, looking at your specific quote, the LAI would “drastically increase the probability of a win”.
  What links here?
  - ThomasCederborg's comment on Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure by ThomasCederborg (11 Oct 2024 1:14 UTC; 3 points)
- ThomasCederborg 4 Oct 2024 15:11 UTC
  4 points
  0
  Parent
  I will change the title.
  However: you also seem to be using the term Pivotal Act as a synonym for removing all time pressure from competing AI projects (which the AI in my post does). Example 3 of the arbital page that you link to also explicitly refers to an act that removes all time pressure from competing AI projects as a Pivotal Act. This usage is also present in various comments by you, Yudkowsky, and others (see links and quotes below). And there does not seem to exist any other established term for an AI that: (i): completely removes all time pressure from competing AI projects by uploading a design team and giving them infinite time to work, (ii): keeps the designers calm, rational, sane, etc indefinitely (with all definitional issues of those terms fully solved), and (iii): removes all risks from scenarios where someone fails to hit an alignment target. What other established term exists for such an AI? I think people would generally refer to such an AI as a Pivotal Act AI. And as demonstrated in the post: such an AI might not buy a lot of time.
  Maybe using the term Pivotal Act as a synonym for an act that removes all time pressure from competing AI projects is a mistake? (Maybe the scenario in my post should be seen as showing that this usage is a mistake?). But it does seem to be a very well established way of using the term. And I would like to have a title that tells readers what the post is about. I think the current title probably did tell you what the post is about, right? (that the type of AI actions that people tend to refer to as Pivotal Acts might not buy a lot of time in reality)
  In the post I define new terms. But if I use a novel term in the title before defining the this term, the title will not tell you what the post is about. So I would prefer to avoid doing that.
  But I can see why you might want to have Pivotal Act be a protected term for something that is actually guaranteed to buy a lot of time (which I think is what you would like to do?). And perhaps it is possible to maintain (or re-establish?) this usage. And I don’t want to interfere with your efforts to do this. So I will change the title.
  If we can’t find a better solution I will change the title to: Internal Time Pressure. It does not really tell you what the post will be about. But at least it is accurate and not terminologically problematic. And even though the term is not commonly known, Internal Time Pressure is actually the main topic of the post (Internal Time Pressure is the reason that the AI mentioned above, that does all the nice things mentioned, might not actually buy a lot of time).
  Regarding current usage of the term Pivotal Act:
  It seems to me like you and many others are actually using the term as a shorthand for an AI that removes time pressure from competing AI projects. I can take many examples of this usage just from the discussion that faul_sname links to in the other reply to your comment.
  In the second last paragraph of part 1 of the linked post, Andrew_Critch writes:
  Overall, building an AGI development team with the intention to carry out a “pivotal act” of the form “forcibly shut down all other A(G)I projects” is probably going to be a rough time, I predict.
  No one seems to be challenging that usage of Pivotal Act (even though many other parts of the post are challenged). And it is not just this paragraph. The tl;dr also treats a Pivotal Act as interchangeable with: shut down all other AGI projects, using safe AGI. There are other examples in the post.
  In this comment on the post, it seems to me that Scott Alexander is using a Pivotal Act AI as a direct synonym for an AI capable of destroying all competing AI projects.
  In this comment it seems to me like you are using Pivotal Act interchangeably with shutting down all competing AI projects. In this comment, it seems to me that you accept the premise that uploading a design team and running them very quickly would be a Pivotal Act (but you question the plan on other grounds). In this comment, it seems to me that you are equating successful AI regulation with a Pivotal Act (but you question the feasibility of regulation).
  In this comment, Yudkowsky seems to me to be accepting the premise that preventing all competing AI projects would count as a Pivotal Act. He says that the described strategy for preventing all competing AI projects is not feasible. But he also says that he will change the way he speaks about Pivotal Acts if the strategy actually does work (and this strategy is to shut down competing AI projects with EMPs. The proposed strategy does nothing else to buy time, other than shutting down competing AI projects). (It is not an unequivocal case of using Pivotal Act as a direct synonym for reliably shutting down all competing AI projects. But it really does seem to me like Yudkowsky is treating Pivotal Act as a synonym for: preventing all competing AI projects. Or at least that he is assuming that preventing all competing AI projects would constitute a Pivotal Act).
  Consider also example 3 in the arbital page that you link to. Removing time pressure from competing AI projects by uploading a design team is explicitly defined as an example of a Pivotal Act. And the LAI in my post does exactly this. And the LAI in my post also does a lot of other things that increase the probability of a win (such as keeping the designers sane and preventing them from missing an aimed for alignment target).
  This usage points to a possible title along the lines of: The AI Actions that are Commonly Referred to as Pivotal Acts, are not Actually Pivotal Acts (or: Shutting Down all Competing AI Projects is not Actually a Pivotal Act). This is longer and less informative about what the post is about (the post is about the need to start ATA work now, because there might not be a lot time to do ATA work later, even if we assume the successful implementation of a very ambitious AI, whose purpose was to buy time). But this title would not interfere with an effort to maintain (or re-establish?) the meaning of Pivotal Act as a synonym for an act that is guaranteed to buy lots of time (which I think is what you are trying to do?). What do you think about these titles?
  
  PS:
  (I think that technically the title probably does conform to the specific text bit that you quote. It depends on what the current probability of a win is. And how one defines: drastically increase the probability of a win. But given the probability that Yudkowsky currently assigns to a win, I expect that he would agree that the launch of the described LAI would count as drastically increasing the probability of a win. (In the described scenario, there are many plausible paths along which the augmented humans actually do reach the needed levels of ATA progress in time. They are however not guaranteed to do this. The point of the post is that doing ATA now increases the probability of this happening). The statement that the title conforms to the quoted text bit is however only technically true in an uninteresting sense. And the title conflicts with your efforts to guard the usage of the term. So I will change the title as soon as a new title has been settled on. If nothing else is agreed on, I will change the title to: Internal Time Pressure)
  What links here?
  - ThomasCederborg's comment on Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure by ThomasCederborg (11 Oct 2024 1:14 UTC; 3 points)
  - faul_sname 4 Oct 2024 21:13 UTC
    4 points
    3
    Parent
    
    Shutting Down all Competing AI Projects is not Actually a Pivotal Act
    
    This seems like an excellent title to me.
  - Martin Randall 5 Oct 2024 0:46 UTC
    2 points
    0
    Parent
    Please do not change the title. You have used the phase correctly from both a prescriptive and a descriptive approach to language. A title such as “Shutting Down all Competing AI Projects is not Actually a Pivotal Act” would be an incorrect usage and increase confusion.
- ThomasCederborg 11 Oct 2024 1:14 UTC
  3 points
  0
  Parent
  I changed the title from: ``A Pivotal Act AI might not buy a lot of time″ to: ``Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure″.
  As explained by Martin Randall, the statement: ``something which does not buy ample time is not a pivotal act″ is false (based on the Arbital Guarded Definition of Pivotal Act). Given your ``Agreed react″ to that comment, this issue seems to be settled. In the first section of the present comment, I explain why I still think that the old title was a mistake. The second section outlines a scenario that better illustrates that a Pivotal Act AI might not buy a lot of time.
  Why the old title was a mistake
  The old title implied that launching the LAI was a very positive event. With the new title, launching the LAI may or may not have been a positive event. This was the meaning that I intended.
  Launching the LAI drastically increased the probability of a win by shutting down all competing AI projects. It however also increased risks from scenarios where someone successfully hits a bad alignment target. This can lead to a massively worse than extinction outcome (for example along the lines of the outcome implied by PCEV). In other words: launching LAI may or may not have been a positive event. Thus, launching the LAI may or may not have been a Pivotal Act according to the Arbital Guarded Definition (which requires the event to be very positive).
  The old title does not seem to be incompatible with the actual text of the post. But it is incompatible with my intended meaning. I didn’t intend to specify whether or not LAI was a positive event. Because the argument about the need for Alignment Target Analysis (ATA) goes through regardless of whether or not launching LAI was a good idea. Regardless of whether or not launching LAI was a positive event, ATA work needs to start now to reduce risks. Because in both cases, ATA progress is needed to reduce risks. And in both cases, there is not a lot of time to do ATA later. (ATA is in fact more important in scenarios where launching the LAI was in fact a terrible mistake)
  As I show in my other reply: there is a well established convention of using the term Pivotal Act as a shorthand for shutting down all competing AI projects. As can be seen by looking at the scenario in the post: this might not buy a lot of time. That is how I was using the term when I picked the old title.
  A scenario that better illustrates why a Pivotal Act AI might not buy a lot of time
  This section outlines a scenario where an unambiguous Pivotal Act is instantly followed by a very severe time crunch. It is possible to see that a Pivotal Act AI might not buy a lot of time by looking at the scenario in the post. But the present section will outline a scenario that better illustrates this fact. (In other words: this section outlines a scenario for which the old title would actually be a good title.) In this new scenario, a Pivotal Act dramatically reduces the probability of extinction by shutting down all unauthorised AI projects. It also completely removes the possibility of anything worse than extinction. Right after the Pivotal Act, there is a frenzied race against the clock to make enough progress on ATA before time runs out. Failure results in a significant risk of extinction.
  Consider the case where Dave launches Dave’s AI (DAI). If DAI had not been launched, everyone would have almost certainly been killed by some other AI. DAI completely and permanently shuts down all competing AI projects. DAI also reliably prevents all scenarios where designers fail to hit the alignment target that they are aiming at. Due to Internal Time Pressure, a Sovereign AI must then be launched very quickly (discussions of Internal Time Pressure can be found here, and here, and here). There is very little time to decide what alignment target to aim at. (The point made in this section is not sensitive to who gave Dave permission to launch DAI. Or sensitive to who DAI will defer to for the choice of alignment target. But for the sake of concreteness, let’s say that the UN security council authorised DAI. And that DAI defers to a global electorate regarding the choice of alignment target).
  By the time Dave launches DAI, work on ATA has already progressed very far. There already exist many alignment targets that would in fact lead to an unambiguous win (somehow, describing these outcomes as a win is objectively correct). Only one of the many proposed alignment targets still has an unnoticed problem. And this problem is not nearly as severe as the problem with PCEV. People take the risks of unnoticed problems very seriously. But due to severe Internal Time Pressure, there is not much they can do with this knowledge. The only option is to use their limited time to analyse all alignment targets that are being considered. (many very optimistic assumptions are made regarding both DAI and the level of ATA progress. This is partly to make sure that readers will agree that the act of launching DAI should count as a Pivotal Act. And partly to show that ATA might still be needed, despite these very optimistic assumptions).
  The only alignment target that is not a clear win, is based on maximising the sum of re-normalised utility functions. The proposed AI includes a proposed way of mapping a human to a utility function. This always results in a perfect representation of what the human wants. (And there are no definitional issues with this mapping). These functions are then renormalised to have the same variance (as discussed here). Let’s write VarAI for this AI. VarAI maximises the sum of the renormalised functions. The aggregation method described above has a problem that is obvious in retrospect. If that problem is explained, then it is clear that VarAI is an unacceptable alignment target. However, in this scenario, no one has noticed this problem. The question is now whether or not anyone will notice the problem (before an alignment target needs to be settled on).
  Due to the problem with this aggregation method, VarAI would create a Soft Maximisation version of a Suffering Reducing AI (SMSRAI) as a successor AI (if VarAI is successfully implemented and pointed at the population in this thought experiment). So, if the proponents of VarAI happens to win the political fight, then the result would be SMSRAI. SMSRAI would work to reduce both average suffering and population size (without doing anything drastic). The quality of most peoples lives would increase very dramatically. Many people would choose to spend centuries living a very nice life. But after some conversations with SMSRAI, few would choose to live for millennia. No one would choose to live for many millennia. Humanity would very gradually go extinct in a few dozen generations. By the time it happens, no one would find this problematic in any way. Changes in attitudes are very gradual, and no one ever feels pressured or manipulated by SMSRAI (for example: if you ask SMSRAI to leave you alone, it backs off for a couple of centuries before politely asking again if you want to talk).
  If anyone finds the problem with the aggregation method in time, then VarAI will be dropped as a potential alignment target. (Once the problem is explained, it is retrospectively obvious why VarAI is completely unacceptable as an alignment target). If no one finds this problem in time however, it is possible that VarAI will win the political fight.
  In order to avoid accidental interference, Dave designed DAI to not discuss ATA. (As a general strategy for reducing risks from definitional issues, Dave actually made sure that DAI only does things that are absolutely necessary for the completion of a carefully chosen set of well defined tasks). So the only way to remove the possibility of extinction, is to reach the needed level of ATA understanding before it’s too late. (If the problem with this aggregation method is not explained to them, their only path to understanding is to try doing ATA).
  Despite severe Internal Time Pressure, the possibility of an unnoticed problem in an alignment target is taken very seriously. So a last judge safety add on is implemented. The last judge is extrapolated using a different extrapolation method than the one used in VarAI (in order to give the last judge a chance to veto a bad extrapolation method. The VarAI extrapolation definitions are in fact somehow objectively correct. But there is no way of verifying that fact). The result of the last judge extrapolation method turns out to be a very anxious mind. The result is a mind that is in general not comfortable with objecting to things (such as the extrapolation method that it is the output of, or the outcome implied by an alignment target). This mind is very reluctant to veto a scenario where no one is suffering, and where almost everyone are very happy with all aspects of how things turn out (SMSRAI very gradually, over many generations, ``helps people realise″ that the outcome is actually a good outcome. And people genuinely are having a very nice time, for a lot longer than most people expected). So the off switch is not triggered.
  If Dave had not launched DAI, all humans would very likely have been killed very soon by some other AI. So I think a lot of people would consider Launching DAI to be a Pivotal Act. (It completely upset the game board. It drastically increased the probability of a win. It was a very positive event according to a wide range of value systems). But if someone wants humanity to go on existing (or wants to personally live a super long life), then there is not a lot of time to find the problem with VarAI (because without sufficient ATA progress, there still exists a significant probability of extinction). So, launching DAI was a Pivotal Act. And launching DAI did not result in a lot of time to work on ATA. Which demonstrates that a Pivotal Act AI might not buy a lot of time.
  One can use this scenario as an argument in favour of starting ATA work now. It is one specific scenario that exemplifies a general class of scenarios: scenarios where starting ATA work now, would further reduce an already small risk of a moderately bad outcome. It is a valid argument. But it is not the argument that I was trying to make in my post. I was thinking of something a lot more dangerous. I was imagining a scenario where a bad alignment target is very likely to get successfully implemented unless ATA progresses to the needed levels of insight before it is too late. And I was imagining an alignment target that implied a massively worse than extinction outcome (for example along the lines of the outcome implied by PCEV). I think this is a stronger argument in favour of starting work on ATA now. And this interpretation was ruled out by the old title (which is why I changed the title).
  (a brief tangent: if someone expects everything to turn out well. But would like to work on ATA in order to further reduce a small probability of something going moderately bad. Then I would be very happy to collaborate with such a person in a future ATA project. Having very different perspectives in an ATA project sounds like a great idea. An ATA project is very different from a technical design project where a team is trying to get something implemented that will actually work. There is really no reason for people to have similar worldviews or even compatible ontologies. It is a race against time to find a conceptual breakthrough of an unknown type. It is a search for an unnoticed implicit assumption of an unknown type. So genuinely different perspectives sounds like a great idea)
  In summary: ``A Pivotal Act AI might not buy a lot of time″ is in fact a true statement. And it is possible to see this by looking at the scenario outlined in the post. But it was a mistake to use this statement as the title for this post. Because it implies things about the scenario that I did not intend to imply. So I changed the title and outlined a scenario that is better suited for illustrating that a Pivotal Act AI might not buy a lot of time.
  PS:
  I upvoted johnswentworth’s comment. My original title was a mistake. And the comment helped me realise my mistake. I hope that others will post similar comments on my posts in the future. The comment deserves upvotes. But I feel like I should ask about these agreement votes.
  The statement: ``something which does not buy ample time is not a pivotal act″ is clearly false. Martin Randall explained why the statement is false (helpfully pulling out the relevant quotes from the texts that johnswentworth cited). And then johnswentworth did an ``Agreed reaction″ on Martin Randall’s explanation of why the statement is false. After this however, johnswentworth’s comment (with the statement that had already been determined to be false) was agree voted to plus 7. That seemed odd to me. So I wanted to ask about it. (My posts sometimes question deeply entrenched assumptions. And johnswentworth’s comment sort of looks like criticism (at least if one only skims the post and the discussion). So maybe there is no great mystery here. But I still wanted to ask about this. Mostly in case someone has noticed an object level error in my post. But I am also open to terminology feedback)
  What links here?
  - ThomasCederborg's comment on Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure by ThomasCederborg (EA Forum; 11 Oct 2024 1:19 UTC; 1 point)
- faul_sname 3 Oct 2024 17:05 UTC
  3 points
  1
  Parent
  It does strike me that, to OP’s point, “would this act be pivotal” is a question whose answer may not be knowable in advance. See also previous discussion on pivotal act intentions vs pivotal acts (for the audience, I know you’ve already seen it and in fact responded to it).
  What links here?
  - ThomasCederborg's comment on Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure by ThomasCederborg (4 Oct 2024 15:11 UTC; 4 points)

johnswentworth comments on Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure

Regarding current usage of the term Pivotal Act:

Why the old title was a mistake

A scenario that better illustrates why a Pivotal Act AI might not buy a lot of time

PS: