I haven’t seen anything even close to a program that could say, prevent itself from being shut off— which is a popular thing to ruminate on of late (I read the paper that had the “press” maths =]).
What evidence is there that we are near (even within 50 years!) to achieving conscious programs, with their own will, and the power to affect it? People are seriously contemplating programs sophisticated enough to intentionally lie to us. Lying is a sentient concept if ever there was one!
Like, I’ve seen Ex Machina, and Terminator, and Electric Dreams, so I know what the fears are, and have been, for the last century+ (if we’re throwing androids with the will to power into the mix as well).
I think art has done a much better job of conveying the dangers than pretty much anything I’ve read that’s “serious”, so to speak.
What I’m getting at is what you’re talking about here, with robotic arms. We’ve had robots building our machines for what, a couple of generations / 60 years or so? 1961 is what I see for the first auto-worker— but why not go back to the looms? Our machine workers have gotten nothing but safer over the years. Doing what they’re meant to do is a key test of whether they’re working or not.
Machines “kill” humans all the time (don’t fall asleep in front of the mobile thresher), but I’d wager the deaths have gone way down over the years, per capita. People generally care if workers are getting killed— even accidentally. Even Amazon cares when a worker gets run over by an automaton. I hope, lol.
I know some people are falling in love with generated GPT characters— but people literally love their Tamagotchi. Seeing ourselves in the machines doesn’t make them sentient and to be feared.
I’m far, far more worried about someone genetically engineering Something Really Bad™ than I am of a program gaining sentience, becoming Evil, and subjugating/exterminating humanity. Humans scare me a lot more than AGI does. How do we protect ourselves from those near beasts?
What is a plausible strategy to prevent a super-intelligent sapient program from seizing power[1]?
I think to have a plausible solution, you need to have a plausible problem. Thus, jumping the gun.
(All this is assuming you’re talking about sentient programs, vs. say human riots and revolution due to automation, or power grid software failure/hacking, etc.— which I do see as potential problems, near term, and actually things that can/could be prevented)
[1] of course here we mean malevolently— or maybe not? Maybe even a “nice” AGI is something to be feared? Because we like having willpower or whatnot? I dunno, there’s stories like The Giver, and plenty of other examples of why utopia could actually suck, so…
What evidence is there that we are near (even within 50 years!) to achieving conscious programs, with their own will, and the power to affect it? People are seriously contemplating programs sophisticated enough to intentionally lie to us. Lying is a sentient concept if ever there was one!
ChatGPT lies right now. It does this because it has learned that humans prefer a confident answer with coherent but fake details over “I don’t know”.
Sure, it isn’t aware it’s lying; it’s just predicting which string of text to produce, and the one with bullshit in it gets a higher score than the correct answer or “I don’t know”.
This is a mostly fixable problem, but the architecture doesn’t allow for a system where we know it will never (or almost never) lie; we can only reduce the errors.
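A toy sketch of what I mean, with completely made-up candidates and scores (nothing here comes from a real model): generation is just picking whichever continuation scores highest, and truth never enters into the selection.

```python
# Toy illustration only -- the candidate strings and scores are invented,
# not taken from any real model. The point: generation picks whatever
# continuation scores highest; truthfulness isn't part of the selection.
candidates = {
    "a confident answer full of fabricated details": 0.62,
    "the correct answer": 0.31,
    "I don't know": 0.07,
}
best = max(candidates, key=candidates.get)
print(best)  # the confident fabrication wins simply because it scored highest
```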
As for the rest—there have been enormous advances in the capabilities of DL/transformer-based models in just the last few months. This is nothing like the controllers for previous robotic arms, and none of your prior experience or the history of robotics is relevant.
See: https://innermonologue.github.io/ and https://www.deepmind.com/blog/building-interactive-agents-in-video-game-worlds
These both use techniques that work pretty well and that, as I understand it, no production robotics system currently uses.
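For a rough idea of the shape of the inner-monologue approach (my paraphrase; every function name below is a hypothetical stand-in, not something from the linked projects): the language model proposes a step, the robot tries it, and whatever feedback comes back gets appended to the prompt before the next step is planned.

```python
# Sketch of the closed-loop "inner monologue" pattern, as I understand it.
# llm(), execute_skill(), and describe_feedback() are hypothetical stand-ins,
# not APIs from the linked projects.
def run_task(llm, execute_skill, describe_feedback, goal, max_steps=10):
    transcript = f"Goal: {goal}\n"
    for _ in range(max_steps):
        # Ask the language model for the next step, given everything so far.
        action = llm(transcript + "Next action:")
        if action.strip().lower() == "done":
            break
        # Attempt the step on the robot (or in simulation).
        result = execute_skill(action)
        # Feed the outcome back as plain language ("the gripper missed the
        # block", "success", ...) so the next plan can react to it.
        transcript += f"Action: {action}\nFeedback: {describe_feedback(result)}\n"
    return transcript
```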
Saying ChatGPT is “lying” is an anthropomorphism— unless you think it’s conscious?
The issue is instantly muddied when using terms like “lying” or “bullshitting”[1], which imply levels of intelligence simply not in existence yet. Not even with models that were produced literally today. Unless my prior experiences and the history of robotics have somehow been disconnected from the timeline I’m inhabiting. Not impossible. Who can say. Maybe someone who knows me, but even then… it’s questionable. :)
I get the idea that “Real Soon Now, we will have those levels!” but we don’t, and using that language to refer to what we do have, which is not that, makes the communication harder— or less specific/accurate if you will— which is, funnily enough, sorta what you are talking about! NLP control of robots is neat, and I get why we want the understanding to be real clear, but neither of the links you shared of the latest and greatest implies we need to worry about “lying” yet. Accuracy? Yes, 100%.
If by “truth” (as opposed to lies) you mean something more like “accuracy” or “confidence”, you can instruct ChatGPT to also give its confidence level when it replies. Some have found that helpful.
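Something along these lines, say with the openai Python client (the instruction wording is mine and purely illustrative, and the self-reported “confidence” is just that: self-reported, not a calibrated probability):

```python
# Minimal sketch with the openai Python client. The system-prompt wording is
# illustrative; the "confidence" that comes back is the model's own
# self-report, not a calibrated probability.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # or whichever chat model you have access to
    messages=[
        {"role": "system",
         "content": "Answer the question, then on a new line write "
                    "'Confidence: low / medium / high' about your own answer."},
        {"role": "user",
         "content": "What year was the first industrial robot put to work?"},
    ],
)
print(resp.choices[0].message.content)
```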
If you think “truth” is some binary thing, I’m not so sure that’s the case once you get into even the mildest of complexities[2]. “It depends” is really the only bulletproof answer.
For what it’s worth, when there are, let’s call them, binary truths, there is some recent-ish work[3] on having the response verified automatically by checking that the opposite of the answer is false, as it were.
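If I’m reading [3] right, the core trick is roughly: take a statement and its negation, look at the model’s internal activations for each, and fit a small probe so the two probabilities behave like complements of each other (and don’t both sit at 0.5). A very rough sketch of that loss, with made-up sizes and random tensors standing in for real activations:

```python
import torch

# Very rough sketch of the contrast-consistency idea from [3], as I read it.
# Real usage would extract hidden states from a language model for statement /
# negation pairs; random tensors stand in for them here, and sizes are made up.
probe = torch.nn.Sequential(torch.nn.Linear(768, 1), torch.nn.Sigmoid())
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)

def ccs_loss(h_pos, h_neg):
    p_pos, p_neg = probe(h_pos), probe(h_neg)
    consistency = (p_pos - (1 - p_neg)) ** 2   # "X" and "not X" should sum to ~1
    confidence = torch.min(p_pos, p_neg) ** 2  # ...and not both hover at 0.5
    return (consistency + confidence).mean()

h_pos = torch.randn(32, 768)  # stand-in activations for "X is true"
h_neg = torch.randn(32, 768)  # stand-in activations for "X is false"
for _ in range(200):
    opt.zero_grad()
    ccs_loss(h_pos, h_neg).backward()
    opt.step()
```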
If a model rarely has literally “no idea”, then what would you expect? What’s the threshold for “knowing” something? Tuning responses is one of the hard things to do, but as I mentioned before, you can peer into some of this “thought process”, if you will[4], literally by just asking it to include that information in the response (see the snippet below).
Which is bloody amazing! I’m not trying to downplay what we (the royal we) have already achieved. Mainly it would be good if we were all on the same page though, as it were, at least as much as is possible (some folks think True Agreement is actually impossible, but I think we can get close).
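And the “just ask it to include that” bit really is as plain as it sounds; the wording below is mine, and the “reasoning” you get back is text the model generated about its answer, not a trace of what’s actually happening inside it (hence the footnote about it not “thinking” in the human sense):

```python
# Illustration only: asking for the "thought process" is just an instruction
# in the prompt. What comes back is generated text about the answer, not a
# window into the model's internal computation.
question = ("A pallet holds 60 boxes and a truck carries 18 pallets. "
            "How many boxes fit on one truck?")
prompt = (
    f"{question}\n"
    "Show your reasoning step by step, then give the final answer on its own "
    "line, prefixed with 'Answer:'."
)
# Send `prompt` through whatever interface you like (the web UI is fine), then
# split the reply, e.g.: reasoning, answer = reply.rsplit("Answer:", 1)
```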
[1] The nature of “Truth” is one of the Hard Questions for humans— much less our programs.
[2] Don’t get me started on the limits of provability in formal axiomatic theories!
[3] Discovering Latent Knowledge in Language Models Without Supervision (Burns et al., 2022)
[4] But please don’t[5]. ChatGPT is not “thinking” in the human sense.
[5] won’t? that’s the opposite of will, right? grammar is hard (for me, if not some programs =])