I don’t think it’s such a stark choice. I think odds are the lead researcher takes the infinite power, and it turns out okay to great. Corrigibility seems like the safest outer alignment plan, and it’s got to be corrigible to some set of people in particular. I think giving one random person near-infinite power will work out way better than intuition suggests. I think it’s not power that corrupts, but rather the pursuit of power. I think unlimited power will lead an ordinary, non-sociopathic person to progressively focus more on their empathy for others. I think they’ll ultimately use that power to let others do whatever they want that doesn’t take away others’ freedom to do what they want. And that’s the best outer alignment result, in my opinion.
Alexander Wales, at the end of his series ‘Worth the Candle’, does a lovely job of laying out what a genuinely kind person given omnipotence could do to make the world a nice place for everyone. It’s a lovely vision, but relying on it in practice seems a lot less trustworthy to me than having a bureaucratic process with checks & balances in charge. I mean, I still think it’ll ultimately have to be some relatively small team in charge of a model corrigible to them, if we’re in a singleton scenario. I just have a lot more faith in ‘small team with bureaucratic oversight’ than in some individual tech bro selected semi-randomly from the set of researchers at big AI labs who might be presented with the opportunity to ‘get the jump’ on everyone else.
I’m curious why you trust a small group of government bros a lot more than one tech bro. I wouldn’t strongly prefer either, but I’d prefer Sam Altman or Demis Hassabis to a randomly chosen bureaucrat. I don’t totally trust those guys, but I think it’s pretty likely they’re not total sociopaths or idiots.
By the opportunity to get the jump on everyone else, do you mean beating other companies to AGI, or becoming the one guy your AGI takes orders from?
I meant stealing control of an AGI within the company before the rest of the company catches on. I don’t necessarily mean that I wouldn’t want Sam or Demis involved in the ruling council, just that I’d prefer if there was like… an assigned group of people to directly operate the model, and an oversight committee with rules for reporting to a larger public audience. Regulations and structure, rather than the whims of one person.