It must depend on levels of intelligence and agency, right? I wonder if there is a threshold for both of those in machines and people that we’d need to reach for there to even be abstract solutions to these problems? For sure with machines we’re talking about far past what exists currently (they are not very intelligent, and do not have much agency), and it seems that while humans have been working on it for a while, we’re not exactly there yet either.
Seems like the alignment would have to be from micro to macro as well, with constant communication and reassessment, to prevent subversion.
Or, what was a fine self-chunk [arbitrary time ago] may not be now. Once you have stacks of “intelligent agents” (mesa or meta or otherwise) I’d think the predictability goes down, which is part of what worries folks. But if we don’t look at safety as something that is “tacked on after” for either humans or programs, but rather something innate to the very processes, perhaps there’s not so much to worry about.
Well, the same alignment issue happens with organizations, as well as within an individual with different goals and desires. It turns out that the existing “solutions” to these abstractly similar problems look quite different because the details matter a lot. And I think AGI is actually more dissimilar to any of these than they are to each other.
Do we all have the same definition of what AGI is? Do you mean being able to, um, mimic the things a human can do, or are you talking full-on Strong AI, sentient computers, etc.?
Like, if we’re talking The Singularity, we call it that because all bets are off past the event horizon.
Most of the discussion here seems to be talking about weak AI, or the road we’re on from what we have now (not even worthy of actually calling “AI”, IMHO; ML at least is a less overloaded term) to true AI, or the edge of that horizon line, as it were.
When you said “the same alignment issue happens with organizations, as well as within an individual with different goals and desires” I was like “yes!” but then you went on to say AGI is dissimilar, and I was like “no?”.
AGI as we’re talking about it here is really about abstractions, it seems, so if we come up with math that works for us, to prevent humans from doing Bad Stuff, maybe those same checks and balances would work for our programs? At least we’d have an idea, right?
Or, maybe, we already have the idea, or at least the germination of one, as we somehow haven’t managed to destroy ourselves or the planet. Yet. 😝