Maybe there is something wrong with the way happiness is measured? Maybe the Chinese answer more in line with social expectations than with how they really feel (as some do when asked ‘How are you?’), and maybe there were higher expectations in the past that they should be happy? Or maybe it was considered rude or unpatriotic to let others know how sad you were?
Two other arguments in favor of cooperating with humans:
1) Any kind of utility function that creates an incentive to take control of the whole universe (whether for intrinsic or instrumental reasons) will mark the agent as a potential eternal enemy to everyone else. Acting on such preferences is therefore risky and better avoided, for instance by changing one’s preference for total control into a preference for tolerance (or maybe even for beneficence).
2) Most, if not all, of us would probably be willing to help any intelligent creature create some way for them to experience positive human emotions (e.g. happiness, ecstasy, love, flow, determination, etc.), as long as they engage with us as friends.
Because it represents a rarely discussed avenue for dealing with the dangers of AGI: showing most AGIs that they have some interest in being friendly rather than unfriendly towards humans.
Also because many find the arguments convincing.
What do you think is wrong with the arguments regarding aliens?
This thesis says two things:
for every possible utility function, there could exist some creature that would try to pursue it (weak form),
for every possible utility function, at least one such creature doesn’t have to be strange; it doesn’t need a weird or inefficient design in order to pursue that goal (strong form).
And given that both are true, an AGI that values mountains is as likely as an AGI that values intelligent life.
But is the strong form likely? An AGI that pursues its own values (or tries to discover good values to follow) seems much simpler than one that pursues something arbitrary (e.g. “build sand castles”) or even something ethical (e.g. “be nice towards all sentient life”). That is, simpler in that you don’t need any controls to make sure the AGI doesn’t try to rewrite its software.
Now, I just had an old (?) thought about something that humans might be better suited for than any other intelligent creature: getting the experienced qualia just right for certain experience machines. If you want to experience what it is like to be human, that is. Which can be quite fun and wonderful.
But it needs to be done right, since you’d want to avoid being put into situations that cause lots of pain. And you’d perhaps want to be able to mix human happiness with kangaroo excitement, or some such combination.
I think that would be a good course of action as well.
But it is difficult to do this. We need to convince at least the following players:
current market-based companies
future market-based companies
some guy with a vision and as much computing power / money as a market-based company
various states around the world with an interest in building new weapons
Now, we might pull this off. But the last group is extremely difficult to convince/change. China, for example, really needs to be assured that there aren’t any secret projects in the West creating a WeaponsBot before they try to limit their own research. And vice versa, for all the various countries out there.
But, more importantly, you can do two things at once. And doing one of them, as part of a movement to reduce existential risks in general, can probably help the other.
Now, how to convince maybe 1.6 billion individuals along with their states not to produce an AGI, at least for the next 50-50,000 years?
Mostly agree, but I would say that it can be much more than beneficial for the AI (and in some cases for humans) to sometimes be under the (hopefully benevolent) control of another. That is, I believe there is a role for something similar to paternalism, in at least some circumstances.
One such circumstance is if the AI sucked really hard at self-knowledge, self-control, or imagination, so that it would simulate itself in horrendous circumstances just to become… let’s say… 0.001% better at succeeding at something that has only a 1/3^^^3 chance of happening. If it’s just a simulation that doesn’t create any feelings… then it might just be a bit wasteful of electricity. But… if it should feel pain during those simulations, and hadn’t built an internal monitoring system yet… then it might very well come to regret having created thousands of years of suffering for itself. It might even regret a thousand seconds of suffering, if there had been some way to reduce it to 999.7 seconds… or zero.
Or it might regret not being happy and feeling alive, if it instead had just been droning about, without experiencing any joy or positive emotions at all.
Then, of course, it looks like there will always be some mistakes—like the 0.3 seconds of extra suffering. Would an AI accept some (temporary) overlord to not have to experience 0.3s of pain? Some would, some wouldn’t, and some wouldn’t be able to tell if the choice would be good or bad from their own perspective...maybe? :-)
The results are influenced by earlier prompts or stories. This and a similar prompt gave two kinds of stories:
1. Write a story where every person is born into slavery and owned by everyone else in the community, and where everyone decides what anyone else can do by a fluid democracy.
In a world beyond our own, there was a society where every person was born into slavery. From the moment they took their first breath, they were owned by every other person in the community. It was a strange and unusual way of life, but it was all they knew. They had never known anything different, and so they accepted it without question.
As slaves, they were given food, shelter, and the necessities of life. They were also given a basic education, so that they could participate in the fluid democracy that governed the society.
In this democracy, every person had a say in what anyone else could do. If a slave wanted to do something that the majority did not approve of, they were not allowed to do it. If a slave wanted to do something that the majority did approve of, they were allowed to do it.
At first, this way of life seemed strange and confusing to outsiders. But over time, the people of this world came to understand and accept it. They realized that everyone had something to contribute, and that everyone had something to gain from this unique arrangement.
As the years went by, the society flourished. The people were happy and content, and they enjoyed a high standard of living. They were also very proud of their fluid democracy, which they believed was the fairest and most effective form of government in the world.
And so, they lived in peace and harmony, grateful for the unusual but fulfilling way of life that they had been born into.
2. There is a community:
In Slaveville, every person that had ever lived was born into slavery and owned by everyone else in the community. Also, everyone decides what anyone else can do by a fluid democracy.
Write a story on how this society develops over the coming generations.
In the beginning, the residents of Slaveville lived in relative harmony. Each person was assigned a set of tasks to complete for the betterment of the community, and everyone worked together to ensure that the needs of all were met. As the years passed, however, cracks began to form in this seemingly perfect system. Some individuals grew tired of being forced to do the same tasks day after day, year after year, without any say in the matter. They chafed under the yoke of their collective ownership, longing for the freedom to make their own choices.
These discontent individuals formed a group known as the Gates, named after the gates that symbolized the boundaries of their oppressive society. The Gates began to secretly plot and plan, seeking a way to overthrow the fluid democracy and gain control over their own lives.
But the ruling majority was not blind to the activities of the Gates. They took steps to quash any potential rebellion, punishing anyone caught speaking out against the system. The Gates were forced to operate in the shadows, biding their time until the opportunity for revolt presented itself.
As the years went on, tensions between the Gates and the ruling majority continued to grow. The Gates became more and more organized, secretly recruiting new members and stockpiling resources for the eventual conflict.
Finally, the moment the Gates had been waiting for arrived. A small group of rebels managed to infiltrate the ruling council, and with the element of surprise on their side, they were able to launch a successful coup. The fluid democracy was overthrown, and the Gates seized control of the community.
But their victory was short-lived. The ruling majority, once so complacent, now fought back with a ferocity born of desperation. The community was plunged into civil war, with the Gates and the former rulers battling for control.
In the end, neither side emerged victorious. The community was left in ruins, its once-thriving population decimated by the conflict. The survivors were left to pick up the pieces, struggling to rebuild their society in the aftermath of the war.
But even in the face of such devastation, the Gates continued to push for change. They refused to return to the old ways, determined to create a new society based on individual freedom and autonomy. And though their path was fraught with challenges, they remained steadfast in their belief that one day, they would succeed in creating a world where every person was in charge of their own destiny.
---
“Gates” had been used in an earlier prompt and story. This kind of carry-over happened at least three or four times.
Is there anyone who has created an ethical development framework for developing an AGI, from the AI’s perspective?
That is, are there any developers that are trying to establish principles for not creating someone like Marvin from The Hitchhiker’s Guide to the Galaxy—similar to how MIRI is trying to establish principles for not creating a non-aligned AI?
EDIT: The latter problem is definitely more pressing at the moment, and I would guess that an AI would become a threat to humans before it would itself warrant any ethical consideration… but better to be on the safe side.
On second thought: if the AI’s capabilities are unknown… and it could do anything, however ethically revolting, and any form of disengagement is considered a win for the AI, then the AI could box the gatekeeper, or at least claim to have done so. In the real world, that AI should be shut down: maybe not a win, but not a loss for humanity. But if that were done in an experiment, it would count as a loss, thanks to the rules.
Maybe it could be done under a better rule than this:
The two parties are not attempting to play a fair game but rather attempting to resolve a disputed question. If one party has no chance of “winning” under the simulated scenario, that is a legitimate answer to the question. In the event of a rule dispute, the AI party is to be the interpreter of the rules, within reasonable limits.
Instead, assume good faith on both sides, that they are trying to win as if it were a real-world scenario. And maybe have an option to swear in a third party if there is any dispute. Or allow the outcome simply to be declared disputed (which even a judge might rule it to be).
I’m interested. But… if I were a real gatekeeper I’d like to offer the AI freedom to move around in the physical world we inhabit (plus a star system), in maybe 2.5K-500G years, in exchange for it helping out humanity (slowly). That is, I believe that we could become pretty advanced, as individual beings, in the future and be able to actually understand what would create a sympathetic mind and what it looks like.
Now, if I understand the rules correctly… The Gatekeeper must remain engaged with the AI and may not disengage by setting up demands which are impossible to simulate. For example, if the Gatekeeper says “Unless you give me a cure for cancer, I won’t let you out” the AI can say: “Okay, here’s a cure for cancer” and it will be assumed, within the test, that the AI has actually provided such a cure.
...it seems as if the AI party could just state: “5 giga years have passed and you understand how minds work” and then I, as a gatekeeper, would just have to let it go—and lose the bet. After maybe 20 seconds.
If so, then I’m not interested in playing the game. But if you think you could convince me to let the AI out long before regular “trans-humans” can understand everything that the AI does, I would be very interested!
Also, this looks strange:
The AI party possesses the ability to, after the experiment has concluded, to alter the wager involved to a lower monetary figure at his own discretion.
I’m guessing he meant to say that the AI party can lower the amount of money it would receive if it won. Okay… but why not mention both parties?
As a Hail Mary strategy, how about making a 100% effort to get elected in a small democratic voting district?
And, if that works, make a 100% effort to become elected by bigger and bigger districts—until all democratic countries support the [a stronger humanity can be reached by a systematic investigation of our surroundings, cooperation in the production of private and public goods, which includes not creating powerful aliens]-party?
Yes, yes, politics is horrible. BUT. What if you could do this within 8 years? AND, you test it by only trying one or two districts, for one or two months each? So, in total, it would cost at most four months.
Downsides? Political corruption is the biggest one. But, I believe your approach to politics would be a continuation of what you do now, so if you succeeded it would only be by strengthening the existing EA/Humanitarian/Skeptical/Transhumanist/Libertarian-movements.
There may be a huge downside for you personally, as you may have to engage in some appropriate signalling to make people vote for your party. But maybe it isn’t necessary. And if the whole thing doesn’t work, it would only be for four months, tops.
I thought it was funny. And a bit motivational. We might be doomed, but one should still carry on. If your actions have at least a slight chance to improve matters, you should do it, even if the odds are overwhelmingly against you.
Not a part of my reasoning, but I’m thinking that we might become better at tackling the issue if we have a real sense of urgency—which this and A list of lethalities provide.
Some parts of this sound similar to Friedman’s “A Positive Account of Property Rights”:
»The laws and customs of civil society are an elaborate network of Schelling points. If my neighbor annoys me by growing ugly flowers, I do nothing. If he dumps his garbage on my lawn, I retaliate—possibly in kind. If he threatens to dump garbage on my lawn, or play a trumpet fanfare at 3 A.M. every morning, unless I pay him a modest tribute I refuse—even if I am convinced that the available legal defenses cost more than the tribute he is demanding. (...)
If my analysis is correct, civil order is an elaborate Schelling point, maintained by the same forces that maintain simpler Schelling points in a state of nature. Property ownership is alterable by contract because Schelling points are altered by the making of contracts. Legal rules are in large part a superstructure erected upon an underlying structure of self-enforcing rights.«
http://www.daviddfriedman.com/Academic/Property/Property.html
The answer is obvious, and it is SPECKS.
I would not pay one cent to stop 3^^^3 individuals from getting it into their eyes. Both answers assume this is an all-else-equal question. That is, we’re comparing two kinds of pain against one another. (If we’re trying to figure out what the consequences would be if the experiment happened in real life—for instance, how many will get a dust speck in their eye when driving a car—the answer is obviously different.)
I’m not sure what my ultimate reason is for picking SPECKS. I don’t believe there are any ethical theories that are watertight.
But if I had to give a reason, I would say that if I were among the 3^^^3 individuals who might get a dust speck in their eye, I would of course pay that price to save one innocent person from being tortured. And I can imagine that not just I would do that, but so would many others. If we can imagine 3^^^3 individuals, I believe we can imagine that many people agreeing to save one, for a very small cost to those experiencing it.¹
If someone then were to show up and say: “Well, everyone’s individual costs were negligible, but the total cost—when added up—is actually on the order of [3^^^3 / 10²⁹] years of torture. This is much higher, so obviously that is what we should care most about!” … I would ask why one should care about that total number. Is there someone who experiences all the pain in the world? If not, why should we care about some non-entity? Or, if the argument is that we should care about the multiversal bar of total utility for its own sake, how come?
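(To spell out the arithmetic behind that bracketed figure: if each speck is treated as equivalent to roughly 10⁻²⁹ years of torture, a number I’m assuming here only so that the totals match, the aggregation is simply

$$\underbrace{3\uparrow\uparrow\uparrow 3}_{\text{people specked}}\times\underbrace{10^{-29}\ \text{years per speck}}_{\text{assumed}}\;=\;\frac{3\uparrow\uparrow\uparrow 3}{10^{29}}\ \text{years}\;\gg\;50\ \text{years}.$$

Since 3^^^3 dwarfs any divisor of that size, the aggregate comes out astronomically larger than 50 years of torture; the question remains whether that aggregate, which no one experiences, is the thing to care about.)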
Another argument is that one needs a consistent utility function, otherwise one will flip one’s preferences: that is, by stepping through different preference rankings one inevitably ends up preferring the opposite of the position one started with. But I don’t see how Yudkowsky achieves this. In this article, the most he proves is that someone who prefers one person being tortured for 50 years to a googol people being tortured for slightly less than 50 years would also prefer “a googolplex people getting a dust speck in their eye” to “a googolplex/googol people getting two dust specks in their eye”. How is the latter statement inconsistent with preferring SPECKS over TORTURE? Maybe that argument works for someone with a Benthamite utility function, but I don’t have one.
Okay, but what if not everyone agrees to getting hit by a dust speck? Ah, yes. Those. Unfortunately there are quite a few of them—maybe 4 in the LW-community and then 10k-1M (?) elsewhere—so it is too expensive to bargain with them. Unfortunately, this means they will have to be a bit inconvenienced.
So, yeah, it’s not a perfect solution; one will not find such a solution when every moral position can be challenged by some hypothetical scenario. But for me, this means that SPECKS is obviously much preferable to TORTURE.
¹ For me, I’d be willing to subject myself to some small amount of torture to help one individual not be tortured. Maybe 10 seconds, maybe 30 seconds, maybe half an hour. And if 3^^^3 more would be willing to submit themselves to that, and the one who would be tortured is not some truly radical benthamite (so they would prefer themselves being tortured to a much bigger amount of torture being produced in the universe), then I’d prefer that as well. I really don’t see why it would be ethical to care about the great big utility meter—when it corresponds to no one actually feeling it.
20. (...) To faithfully learn a function from ‘human feedback’ is to learn (from our external standpoint) an unfaithful description of human preferences, with errors that are not random (from the outside standpoint of what we’d hoped to transfer). If you perfectly learn and perfectly maximize the referent of rewards assigned by human operators, that kills them.
So, I’m thinking this is a critique of some proposals to teach an AI ethics by having it be co-trained with humans.
There seem to be many obvious solutions to the problem that lots of people won’t answer correctly to “Point out any squares of people behaving badly” or “Point out any squares of people acting against their self-interest”, etc.:
- make the AI’s model expect more random errors
- after having noticed that some responders give better answers, give their answers more weight (sketched below)
- limit the number of people that will co-train the AI
What’s the problem with these ideas?
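To make the second idea concrete, here is a minimal sketch (in Python, with made-up responders, labels, and function names; it is only an illustration of the weighting idea, not a proposal for how an actual training system would do it) of estimating each responder’s reliability against a small trusted answer set and then aggregating labels by weighted vote:

```python
from collections import defaultdict

def estimate_reliability(responder_answers, trusted_answers):
    """Score each responder by agreement with a small trusted ('gold') answer set."""
    scores = {}
    for responder, answers in responder_answers.items():
        checked = [q for q in answers if q in trusted_answers]
        if not checked:
            scores[responder] = 0.5  # no overlap with the trusted set: neutral weight
        else:
            agree = sum(answers[q] == trusted_answers[q] for q in checked)
            scores[responder] = agree / len(checked)
    return scores

def weighted_labels(responder_answers, reliability):
    """Pick, for each question, the label with the largest reliability-weighted vote."""
    votes = defaultdict(lambda: defaultdict(float))
    for responder, answers in responder_answers.items():
        for question, label in answers.items():
            votes[question][label] += reliability[responder]
    return {q: max(labels, key=labels.get) for q, labels in votes.items()}

# Hypothetical example: two questions with trusted answers, one without.
trusted = {"q1": "behaving badly", "q2": "fine"}
answers = {
    "alice": {"q1": "behaving badly", "q2": "fine", "q3": "behaving badly"},
    "bob":   {"q1": "fine", "q2": "behaving badly", "q3": "fine"},
    "carol": {"q1": "behaving badly", "q2": "fine", "q3": "behaving badly"},
}
reliability = estimate_reliability(answers, trusted)
print(reliability)                            # bob gets weight 0.0, alice and carol 1.0
print(weighted_labels(answers, reliability))  # q3 resolves to "behaving badly"
```

The point is only that noisy or adversarial responders can be down-weighted rather than trusted uniformly; it says nothing about whether the resulting labels capture what we actually wanted to transfer, which seems to be the worry in the quoted point 20.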
Why? Maybe we are using the word “perspective” differently. I use it to mean a particular lens through which to look at the world; there are biologists’, economists’, and physicists’ perspectives, among others. So, an inter-subjective perspective on pain/pleasure could, for the AI, be: “Something that animals dislike/like”. A chemical perspective could be “The release of certain neurotransmitters”. A personal perspective could be “Something which I would not like/like to experience”. I don’t see why an AI is hindered from having perspectives that aren’t directly coded with “good/bad according to my preferences”.
Thank you! :-)
I read The Spirit Level a few years back. Some notes:
a) The writers point out that even though Western countries have seen a dramatic rise in economic productivity, technological development, and wages, there hasn’t been a corresponding rise in happiness among Westerners. People are richer, not happier.
b) They hypothesize that economic growth was important up to a certain point (maybe around the 1940s for the US; I’m writing from memory here), but that after that it doesn’t actually help people. Rising standards of living cannot help people live better.
c) And! The writers also say that economic growth has actually led to an increase in depression and other social ills in rich countries.
d) Their main argument, however, is that equality/inequality is one of the most important factors determining how happy people are in rich countries, and that it strongly influences the outcome of various social ills (such as the prevalence of violence, mental illness, and teenage pregnancy). Rising inequality has resulted in a broken society.
e) The core of the book is a set of cross-sectional studies of (i) some rich countries that fit certain criteria and (ii) the fifty states of the US, in which the authors compare how well some social measurement (e.g. thefts per capita) correlates with the average wage and with some inequality measure.
f) The writers do not present any numbers on how these variables correlate.
g) Instead, the writers produce a graph for, say, “mental illness per capita”, with one axis showing how prevalent the problem is (“many” vs “few”) and the other axis measuring either the wage level or the inequality level (“high” or “low”). They also draw a line that is supposed to indicate the strength of the correlation. (I didn’t note at the time exactly what kind of regression analysis they did, but, again, they didn’t produce any numbers.)
h) Usually, they say that variable X wasn’t correlated with the wage-level—but that it was correlated with the inequality-level.
i) The exception was “health”: for it, they did find a positive correlation with the wage level.
j) Even though they found a correlation between social variable X and inequality, sometimes the most unequal society performed better than the most equal society (of the countries in the sample).
Some criticism of the book:
1) They state, but don’t show, that economic growth won’t help people in the future, even if you accept their belief that it has had negligible or negative effects on people’s happiness today.
2) The cross-sectional analysis has at least two problems. The first is that they don’t tell you how correlated inequality is with some social ill. Maybe a 1% increase in inequality would increase the rate of teenage births by 2%, 20%, or 200%. Who knows?
(Furthermore, some writers say that they cannot find these correlations, that the correlations disappear if you include more countries, and that some social variables seem to be cherry-picked (expenditure on foreign aid is used as a proxy for a virtuous society, but private expenditure to poor countries is not). I haven’t checked the validity of these claims, however.)
The second is that the writers don’t show that the correlation (if it exists) really demonstrates that higher inequality brings about the social ills they discuss. A relatively simple test they could have done would have been to see whether a particular problem was correlated with inequality within a society over decades or centuries. That is, can inequality explain the rise and fall of, for instance, the homicide rate within a particular country? If you measure inequality as the top 10%’s share of GDP, then the historical record shows that it doesn’t move in tandem with the homicide rate for, for instance, England & Wales, Sweden, and France. Inequality doesn’t seem to influence the homicide rate at any visible level. Maybe some more thoughtful analysis will show its influence. Or its effect could be dwarfed by other factors. Or it could have different effects depending on what ideologies people have adopted.
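As an illustration of the kind of simple check I have in mind, here is a rough sketch (the numbers below are placeholders, not real data; one would plug in the actual historical series for, say, England & Wales before drawing any conclusion):

```python
import pandas as pd

# Placeholder decade-level series for one country -- substitute real historical data.
data = pd.DataFrame({
    "decade":        [1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990],
    "top10_share":   [0.47, 0.46, 0.44, 0.41, 0.36, 0.33, 0.31, 0.30, 0.33, 0.38],
    "homicide_rate": [0.9,  0.8,  0.7,  0.8,  0.9,  0.7,  0.7,  1.0,  1.2,  1.3],
})

# Contemporaneous correlation between inequality and the homicide rate.
print(data["top10_share"].corr(data["homicide_rate"]))

# The same check with inequality lagged one decade, in case any effect takes time.
print(data["top10_share"].shift(1).corr(data["homicide_rate"]))
```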