I’m not sure why your path in life is so rare, but I find that as you go “upwards” in intellectual pursuits, you diverge from most people and things, rather than converge into one “correct” worldview.
I used to think about questions like the ones you’re asking now, until I realized I was just working through my personal problems by treating them as external branches of knowledge. After that I switched over to psychology, which tackled the problems more directly.
I also keep things simple for myself, so that I don’t drown in them in any sense. If my thoughts aren’t simple, it’s likely that I don’t understand what I’m thinking about.
I don’t have much faith in logic anymore, anyway. What we’re doing is essentially just constructing problems, and then eventually solving them by noticing that our initial construction contained an error. But does this snake eating its own tail even have any connection to reality in the first place? And isn’t all logic just a path from axioms to whatever follows from them? In that case, nothing new is ever derived.
And thus, yet another pursuit of mine ends up destroying itself. Such is the nature of things: they seek their opposite.
But I haven’t lost faith in reality (the tree of states that the world can be in), nor in usefulness (the ability to navigate towards desirable states). But at this point, I think that a concrete goal is needed. You seek to solve problems, but what does a solution look like? You need to define a desirable state in order to find a path to it. I don’t think it works to ask for “the correct” answer, as that would rely on a unique, external preference. So I advise having faith in your own evaluations, in order to have a concrete direction. If you don’t dare to voice your own preferences in case they’re “wrong”, I’d say that’s the biggest concern here.
Now, on to AI, which seems to be your main interest.
I think there’s a big problem with AI alignment which is rarely mentioned: We’re not even aligned with ourselves. If a genie appeared and granted us unlimited wishes, we’d ruin everything for ourselves quite quickly. For in the first place, we simply don’t know what we want. This is one of the reasons why we contradict ourselves so often.
A whole lot of people want happiness, but that’s actually trivial: they just have to be content with what is. But they say, “No, it’s not good enough yet. I won’t be happy before things are better”, so their unhappiness is a choice. Is our real desire then “victory”? No: if we win a battle, we simply start on the next one. Do we want life to resist us? That’s closer to the truth, but we certainly don’t want too much resistance. If we consider life to be a game of D&D, we should understand that the game master should be neither too harsh nor too soft.
You think AI will destroy you? It might. But most people would destroy themselves if they had omnipotence for 24 hours. It’s too hasty to ask if an AI wants the same thing as us, as we don’t even know what we want.
For an AI to be with you, it would also have to be partly against you.
Until you can get an AI to have human irrationality, it will just be a monkey’s paw. Any purely logical perspective is inherently anti-human. The most logical statement which could be made about reality is “Everything is always exactly as it should be”. Any other takes on reality are only human, so you probably don’t want a hyper-intelligent and rational AI to make decisions for you.
Even if you can get an AI to value something that you value, that thing might be valuable only because it’s scarce, and scarce because a terrible price has to be paid for its existence.
And just an extra note: you can’t separate anything from its opposite. You can’t teach an AI ‘good’ without also teaching it ‘evil’, for instance. This is because “good and evil” is one concept rather than two. There’s nothing mysterious about the so-called “Waluigi effect”, just as there’s nothing mysterious about us fighting ourselves, nor about me using logic to arrive at the conclusion that logic isn’t helpful.