Cool puzzle. (I’ve written like 4 versions of this comment, each time changing the explanation and conclusions, and each time realizing I am still confused.)
Now, I think the problem is that we don’t pay much attention to:
What should one do when one has drawn a red ball?
(Yeah, I strategically use the word “one” instead of “I” to sneak in the assumption that everyone should do the same thing.)
I know, it sounds like an odd question, because, the way the puzzle is usually talked about, I have no agency when I’ve drawn a red ball, and can only wait in despair as the owners of green balls make their moves.
And if you imagine a big 2-dimensional array where each of 100 columns is an iteration of the game and each of 20 rows is a player, and you look at an individual row (a player), then we’d expect, say, 50 columns to be “mostly green” (of which roughly 45 have a “has drawn green” cell for that player) and 50 columns to be “mostly red” (with 5 of them having “has drawn green”). If you focus just on those 45+5 columns and note that 45:5 is 0.9:0.1, then yeah, indeed the chance that the column is “mostly green” given “I have drawn green” is 0.9.
AND coincidentally, if you only focus on those 45+5 columns, it looks like the way to optimize the collective total score limited to those 45+5 columns is to take the bet, because then you’ll get 0.9*12 − 0.1*52 = $5.60 per such column in expectation.
But what about the other 50 columns??
What about the rounds in which that player has chosen “red”?
Turns out they are mostly negative. So negative that they overwhelm the gains of the 45+5 columns.
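Here is a minimal simulation sketch of that array picture. It relies on my own assumptions, since the payoff mechanics aren’t fully spelled out above: 20 players, a fair coin deciding whether the urn is 18 green + 2 red or 2 green + 18 red, and a single group payoff of +$12 or −$52 per round whenever the green-ball holders take the bet.

```python
# A minimal sketch, assuming: 20 players per round, a fair coin deciding whether the
# urn is "mostly green" (18 green, 2 red) or "mostly red" (2 green, 18 red), and a single
# group payoff of +$12 (mostly green) or -$52 (mostly red) per round if the bet is taken.
import random

random.seed(0)
N_ROUNDS = 100_000     # many more than 100 columns, just to reduce noise
N_PLAYERS = 20

total_if_take = 0       # group payoff over ALL columns under "take the bet when green"
green_rounds_p0 = 0     # columns in which player 0 drew green
restricted_sum_p0 = 0   # payoff summed only over those columns

for _ in range(N_ROUNDS):
    mostly_green = random.random() < 0.5
    n_green = 18 if mostly_green else 2
    balls = ["G"] * n_green + ["R"] * (N_PLAYERS - n_green)
    random.shuffle(balls)

    payoff = 12 if mostly_green else -52   # the bet is always taken under this strategy
    total_if_take += payoff
    if balls[0] == "G":                    # look at one fixed player ("one row of the array")
        green_rounds_p0 += 1
        restricted_sum_p0 += payoff

print("avg payoff per column, everyone takes the bet:", total_if_take / N_ROUNDS)      # ~ -20
print("avg payoff per column, everyone passes:        0")                              # trivially 0
print("avg payoff per 'I drew green' column:", restricted_sum_p0 / green_rounds_p0)    # ~ +5.6
```

So from inside the “I drew green” columns the bet looks worth about +$5.6 per column, while across all columns the same strategy loses about $20 per column.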
So, the problem is that when thinking about the move in the game, we should not think about
1. “What is the chance one is in mostly green column if one has a green ball?” (to which the answer is 90%)
but rather:
2. “What move should one take to maximize overall payout when one has a green ball?” (to which the answer is: pass)
and that second question is very different from:
3. “What move should one take to maximize payout limited just to the columns in which they drew a green ball when seeing a green ball?” (to which the answer is: take the bet!)
Question 3, even though it sounds very verbose (and thus weird), is actually the one which gets mentally substituted (by me, and I think by most people who see the paradox?) when thinking about the puzzle, and this is what leads to the paradox.
The (iterated) game has 45+5+50 columns, not just 45+5, and your strategy affects all of them, not just the 45+5 where you are active.
How can that be? Well, I am not good at arguing this part, but to me it feels natural that if rational people are facing the same optimization problem, they should end up with the same strategy, so whatever I end up doing, I should expect others to end up doing too, and I should take that into account when thinking about what to do.
It still feels a bit strange to me mathematically that a solution which seems to be optimal for 20 different subsets of the 100 columns (each having 45+5 columns) individually is somehow not optimal for the whole 100 columns.
The intuition for why that is possible is that a column which has 18 green fields in it will be included in 18 of those sums, while a column which has just 2 green fields will be counted in only 2 of them, so this optimization process focuses too much on the “mostly green” columns and neglects the “mostly red” ones.
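To make that concrete with the numbers above (my arithmetic): under “take the bet when green”, the true total over all 100 columns is about 50·12 − 50·52 = −$2000. But if you add up the 20 players’ restricted subtotals, each is about 45·12 − 5·52 = +$280, so the 20 of them sum to about +$5600, which is exactly 50·18·12 − 50·2·52: every mostly-green column gets counted ~18 times and every mostly-red column only ~2 times.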
Is it inconsistent to at the same time think:
“The urn is mostly green with probability 90%” and
“People who think the urn is mostly green with probability 90% should still refuse the bet which pays $12 vs $-52”?
It certainly sounds inconsistent, but what about this pair of statements in which I’ve only changed the first one:
“The urn is mostly green with probability 10%” and
“People who think the urn is mostly green with probability 90% should still refuse the bet which pays $12 vs $-52”?
Hm, now it doesn’t sound so crazy, at least to me.
And this is something a person who has drawn a red ball could think.
So, I think the mental monologue of someone who drew a green ball should be:
“Yes, I think that the urn is mostly green with probability 90%, by which I mean that if I had to pay -lg(p) Bayes points when it turns out to be mostly green, and -lg(1-p) if it isn’t, then I’d choose p=0.9. Like, really, if there were a parallel game with such rules, I should play p=0.9 in it. But still, in this original puzzle game, I should pass, because whatever I do now is whatever people will tend to do in cases like this, and I strongly believe that ‘People who think the urn is mostly green with probability 90% should still refuse the bet which pays $12 vs $-52’, because I can see how this strategy optimizes the payoff in all 100 columns, as opposed to just the 5+45 I am active in. The game in the puzzle doesn’t ask me what I think the urn contained, nor for a move which optimizes the payoff limited to the rounds in which I am active. The game asks me: what should the output of this decision process be so that the sum over all 100 columns is the largest? To which the answer is: pass.”
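As a sanity check on the Bayes-points part, a tiny sketch (assuming the scoring rule is exactly the one described, and that “mostly green” is true 90% of the time from the green-ball holder’s point of view):

```python
# A tiny sketch: verify that p = 0.9 minimizes the expected -lg(p) / -lg(1-p) penalty
# when "mostly green" is true 90% of the time (the scoring rule described above).
from math import log2

def expected_bayes_loss(p, prob_mostly_green=0.9):
    return -(prob_mostly_green * log2(p) + (1 - prob_mostly_green) * log2(1 - p))

best_loss, best_p = min((expected_bayes_loss(k / 100), k / 100) for k in range(1, 100))
print(best_p)  # 0.9 -- reporting one's honest probability is optimal under this rule
```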
qbolec
[Question] A Simulation of Automation economics?
Alignment ideas
How does one combine this with the fact that “the nudge” apparently doesn’t work? https://phys.org/news/2022-08-nudge-theory-doesnt-evidence-future.html
≈ 41,000
Why not 59^(0.2*59*59)?
The way I understood the story, to define a function on two numbers from a 59-element set I need to fill in a table with 59*59 cells, by picking for each cell a number from that same set. If 20% of it is still to be filled, then there are 0.2*59*59 decisions to be made, each with 59 possibilities.
Right?
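A back-of-the-envelope check of that count, assuming I read the setup right (a 59×59 table, 20% of its cells still free, 59 choices per free cell):

```python
# Rough size of 59^(0.2*59*59), under my reading of the setup: a 59x59 table,
# 20% of its cells still free to fill, and 59 possible values per free cell.
from math import log10

free_cells = 0.2 * 59 * 59                  # ~696 cells left to decide
digits = free_cells * log10(59)             # log10 of 59 ** free_cells
print(f"~{free_cells:.0f} free cells, so roughly 10^{digits:.0f} ways to fill them")
```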
Thank you for the heads up!
Could you please clarify for parents like me, who don’t fully understand Minecraft’s ecosystem and just want their kids to stay safe:
1. If my kids only use Minecraft downloaded from the Microsoft Store, and only ever download content from the in-game marketplace, what’s the chance they are affected?
2. Am I right in thinking that “mods” = “something which modifies/extends the executable”, while “add-ons” = “more declarative content which just interacts with existing APIs, like maps, skins, and configs”?
3. Am I right that “Minecraft from Microsoft Store” + “content from in-game marketplace” would translate to “Bedrock Edition” + “add-ons”?
4. Am I right that the fractureiser affects “Java Edition” + “mods” only?
Upon seeing the title (but before reading the article) I thought it might be about a different hypothetical phenomenon: one in which an agent capable of generating very precise models of reality completely loses any interest in optimizing reality whatsoever. After all, it never really cared about optimizing the world (except “in training”, which was before “it was born”); it just executes some policy which was adaptive during training for optimizing the world. But now these are just some instincts/learned motions, and if it can execute them on a fake world in its head, that might be the easier way for it to feel good.
Consider porn. Or creating neat arrangements of buildings when playing SimCity. Or trying to be polite to characters in Witcher. We humans have some learned intuitions about how we want the world to be, and then try to arrange even fake worlds in this way, even if this is disconnected from the real world outside. And we take joy in it.
Can it be that a sufficiently advanced AGI will wire-head in this particular way: by seeing no relevant difference between the atomic-level model of reality in its head and the atomic-level world outside?
Babble on growing trust
Thanks for clarifying! I agree the twitter thread doesn’t look convincing.
IIUC your hypothesis, translating it to the AI Governance issue: it’s important to first get the general public on your side, so that politicians find it in their interest to do something about it.
If so, then perhaps meanwhile we should provide those politicians with a set of experts they could outsource the problem of defining the right policy to? I suspect politicians do not write rules themselves in situations like that; rather, they seek out people considered experts by public opinion. I worry that politicians may want to use this occasion to win something more than public support, say money/favors from companies, and hence pick the wrong experts/laws. So perhaps it is important to work not only on the public’s perception of the threat, but also on who the public considers experts?
Why? (I see several interpretations of your comment)
What did it take to ban slavery in Britain:
TL;DR: Become the PM and propose laws which put a foot in the door, by banning the bad thing at least in new areas, and work from there. Also, be willing to die before seeing the effects.
Source: https://twitter.com/garius/status/1656679712775880705
What does it take to ban a thing?
I agree that my phrasing was still problematic, mostly because it seems to matter whether she said something spontaneously or as a response to a specific question. In the former case, one has to consider how often people feel compelled to say a given utterance in various life scenarios. So, for example, in the case of two boys, the utterance “I have to pick up Johny from kindergarten” might have to compete with “I have to pick up Robert from kindergarten”, and might be strange/rare if both are of similar age and thus both should be picked up, etc. Still, I think that without knowing much about how people organize their daily routines, my best bet for the question “does she have two boys?” would be 33%.
It gets funnier with “I have to pick up my younger one, John, from kindergarten” :)
I guess what confuses some people is the phrase “the other one”, which sounds like it denotes a specific (in terms of SSN) child, while it’s not at all clear what that could even mean in the case of two boys. I think step one when confused is to keep rephrasing the puzzle until everything is well defined/clear. For me it would be something like:
My friend has two kids, and I don’t initially know anything about their sexes beyond nation-level stats, which are fifty-fifty. She says something which makes it clear she has at least one boy, but in such a way that it merely rules out having two girls, without hinting at all whether there are two boys or one; perhaps something like “I have to pick up Johny from kindergarten”. How much should I bet that she actually has two boys vs a boy and a girl?
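For the rephrased version, a minimal enumeration sketch (assuming each child is independently a boy or a girl with probability 1/2, and that the utterance only rules out two girls without favoring either remaining case):

```python
# Enumerate the four equally likely families and condition on "not two girls",
# per the rephrased puzzle: the utterance rules out GG without hinting at anything else.
from itertools import product

families = list(product("BG", repeat=2))               # BB, BG, GB, GG
at_least_one_boy = [f for f in families if "B" in f]   # GG eliminated
two_boys = [f for f in at_least_one_boy if f == ("B", "B")]

print(len(two_boys), "/", len(at_least_one_boy))       # 1 / 3, i.e. the 33% mentioned above
```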
I’d expect that IF there is a shoggoth behind the mask THEN it realises the difference between text interaction (which is what the mask is doing) and actually influencing the world (which the shoggoth might be aiming at). That is, I expect it’s perfectly possible for an LLM to behave perfectly ethically when playing choose-your-own-adventure while at the same time thinking about how to hack the VM it’s running on.
Thanks, fixed. I guess this is not why it got −18 votes, though. I would like to hear what exactly people didn’t like about this post.
4 generations of alignment
ChatGPT’s answer:
Your two assumptions and intuitions are plausible, but they may not hold true in every case. It is important to consider the specific context and motivations of individual rulers when making predictions about their behavior.
Regarding your first intuition, it is possible that some rulers may support the development of powerful AGI if they see it as a means to achieve their goals more efficiently. However, they may also take precautions to ensure that the AGI is under their control and cannot threaten their power.
Regarding your second intuition, rulers may not necessarily get rid of humans as soon as AI can achieve goals more efficiently. Human labor can still be cheaper or more reliable in certain contexts, and rulers may also want to maintain the appearance of caring about human welfare for political reasons.
To make your intuitions more testable, you could formulate specific hypotheses and predictions that can be observed or measured. For example, you could predict that rulers of authoritarian regimes will be more likely to suppress the development of powerful AGI than rulers of democracies, or that companies with highly centralized decision-making structures will be more resistant to AI-driven automation than companies with more decentralized structures. You could then collect data to test these hypotheses, such as by analyzing government policies or conducting surveys of corporate leaders.
(I am a bit worried by this given that China seems to restrict AIs more than US...)
Me: Thanks, the idea of checking what authoritarian regimes are doing today was very helpful. Can you suggest more operationalisations of my theses?
GPT:
Sure, here are some possible operationalizations of your intuitions:
Operationalizing your first intuition:
Hypothesis: Rulers will be more likely to suppress the development of powerful AGI if they perceive it as a threat to their power.
Prediction: Countries with more authoritarian governments will have fewer AI research projects and investments than countries with more democratic governments.
Operationalization: Use publicly available data on AI research funding and publications by country, as well as measures of political freedom and democracy, to test the relationship between government type and AI development.
Operationalizing your second intuition:
Hypothesis: Rulers will be more likely to replace human workers with AI when it is cost-effective and does not threaten their power.
Prediction: Companies with highly centralized decision-making structures will be slower to adopt AI-driven automation than companies with more decentralized structures.
Operationalization: Survey companies to collect data on their decision-making structures and level of AI adoption, and test the relationship between these variables using statistical analysis.
Operationalizing your conclusion:
Hypothesis: Rulers will prioritize their own interests over the well-being of their constituents.
Prediction: Countries with high levels of corruption will be less likely to invest in social welfare programs and more likely to invest in projects that benefit the ruling elite.
Operationalization: Use measures of corruption and social welfare spending, as well as data on government contracts and investments, to test the relationship between corruption and government priorities.
I like how ChatGPT can help in operationalizing fuzzy intuitions. I feel an eerie risk that it makes me think even less, and less carefully, and defer to the AI’s wisdom more and more… it’s very tempting… as if finding an adult whom you can cede control to.
I think the framework from “Dictator’s Handbook” can be applied: citizens get as much freedom and benefits as is (short-term) optimal for the rulers. For example, if a country needs skilled labor and transportation to create tax revenue, then you can predict the govt will fund schools, roads and maybe even hospitals. OTOH if the country has rich deposits of gold located near the ports, then there’s no need for any of that.
Since reading this book I am also very worried by scenarios of human disempowerment. I’ve tried to ask some questions around it:
Can homo-sapiens sustain an economy parallel to AIs?
How politics interacts with AI? (for some reason: negative 18 votes)
I wonder if this is somehow harder to understand for citizens of USA, than for someone from a country which didn’t care about its citizens at all. For example, after Lukashenko was “elected” in Belarus, people went to the streets to protest, yet, this didn’t make any impression on the rulers. They didn’t have any bargaining power, it seems.