They provide more surprising information, as I understand it.
For an unaligned AI, it is either simulating alternative histories (which is the focus of this post) or creating material for blackmail.
For an aligned AI:
a) It may follow a different moral theory than our version of utilitarianism, in which existence is generally considered good despite moments of suffering.
b) It might aim to resurrect the dead by simulating the entirety of human history exactly, ensuring that any brief human suffering is compensated by future eternal pleasure.
c) It could attempt to cure past suffering by creating numerous simulations where any intense suffering ends quickly, so by indexical uncertainty, any person would find themselves in such a simulation.
I don’t think the two lists compensate for each other. Take medicine, for example: there are 1000 ways to die and 1000 ways to be cured – but we eventually die.
I meant that I know only the total number of seconds that have passed since the beginning of the year (around 15 million as of today) and want to predict the total number of seconds in the year – no information about months.
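For illustration, here is a minimal sketch of that estimate, assuming the simple “double the elapsed amount” rule of thumb discussed in the bus example below; the numbers are only approximate:

```python
# Rough self-sampling estimate of the number of seconds in a year,
# using only the number of seconds elapsed so far (no calendar knowledge).
elapsed = 15_000_000                # seconds since January 1, roughly mid-June

# If "now" is a random moment within the year, the median estimate of the
# total is about twice the elapsed amount.
estimate = 2 * elapsed              # 30 million

true_total = 365.25 * 24 * 3600     # ~31.6 million seconds
print(f"estimate: {estimate:.2e}, true: {true_total:.2e}")
# The estimate lands within the right order of magnitude (tens of millions).
```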
As most people are born at effectively random dates, and we know this, we can use my date of birth as a random sample. If we have any suspicions about non-randomness, we have to take them into account.
After an AI war, there will be one AI winner, a singleton, which carries roughly the same risk of causing s-risks, to a first approximation. So an AI war just adds probability to any s-risk we would expect from a singleton.
This gives additional meaning to the Pause AI movement: the simulation has to wait.
What interesting ideas can we suggest to the Paperclipper simulator so that it won’t turn us off?
One simple idea is a “pause AI” feature. If we pause the AI for a finite (but not indefinite) amount of time, the whole simulation will have to wait.
Trying to break out of a simulation is a different game than preventing x-risks in the base world, and it may have even higher utility if we expect almost inevitable extinction.
This is true only if we assume that a base reality for our civilization exists at all. But knowing that we are in a simulation shifts the main utility of our existence, which is what Nesov wrote about above.
For example, if in some simulation we can break out, this would be a more important event than what is happening in the base reality where we likely go extinct anyway.
And as the proportion of simulations is very large, even a small chance of breaking out from inside a simulation, perhaps via negotiation with its owners, has more utility than focusing on base reality.
This post by EY is about breaking out of a simulation: https://www.lesswrong.com/posts/5wMcKNAwB6X4mp9og/that-alien-message
I think your position can be oversimplified as follows: ‘Being in a simulation’ makes sense only if it has practical, observable differences. But as most simulations closely match the base world, there are no observable differences. So the claim has no meaning.
However, in our case, this isn’t true. The fact that we know we are in a simulation ‘destroys’ the simulation, and thus its owners may turn it off or delete those who come too close to discovering they are in a simulation. If I care about the sudden non-existence of my instance, this can be a problem.
Moreover, if the alien simulation idea is valid, they are simulating possible or even hypothetical worlds, so there are no copies of me in base reality, as there is no relevant base reality (excluding infinite multiverse scenarios here).
Also, being in an AI-testing simulation has observable consequences for me: I am more likely to observe strange variations of world history or play a role in the success or failure of AI alignment efforts.
If I know that I am simulated for some purpose, the only thing that matters is what conclusions I prefer the simulation owners will make. But it is not clear to me now, in the case of an alien simulation, what I should want.
One more consideration is what I call meta-simulation: a simulation in which the owners are testing the ability of simulated minds to guess that they are in a simulation and hack it from inside.
TL;DR: If I know that I am in a simulation, then the simulation plus its owners is the base reality that matters for me.
I want to share a few considerations:
An AI war may eventually collapse into two blocs fighting each other – S. Lem wrote about this in 1959.
An AI war makes s-risks more likely, as a non-aligned AI may take humans hostage to influence an aligned AI.
An AI war may naturally evolve as a continuation of current drone warfare with automated AI-powered control systems.
I think that SIA is generally* valid, but it uses all its power to prove that I live in an infinite universe where all possible observers exist. After that, we have to use SSA to find in which region of the multiverse I am more likely to be located.
*I think the logically sound version of SIA is: “If I am in a unique position generated by some random process, then there were many attempts to create me” – like the many Earth-like-but-lifeless planets in the galaxy.
Another point is that a larger number of short civilizations can compensate for their “shortness.” We may live in a region of the multiverse where there are many short civilizations and almost all of them die off.
Maybe we had better take equation (2) from Gott’s original paper https://gwern.net/doc/existential-risk/1993-gott.pdf:
(1/3)·t < t_future < 3·t, with 50 per cent confidence,
where t is the observed bus number (T0 = 1546) and t_future is the number of buses above it. In our case, the total number of buses T = t + t_future is between 2061 and 6184 with 50 per cent probability.
That is a correct claim; saying that the total number of buses is double the observed bus number is an oversimplification of it, which we use only to point in the direction of the full Gott equation.
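Here is a small sketch of that 50 per cent interval, assuming the observed bus number 1546 from the discussion above:

```python
# Gott's equation (2): with 50% confidence, t/3 < t_future < 3*t,
# where t is the observed bus number and t_future is the number of
# buses above it. The total is T = t + t_future.
t = 1546                      # observed bus number (T0)

low_total  = t + t / 3        # ≈ 2061
high_total = t + 3 * t        # = 6184

print(f"Total number of buses T is between {low_total:.0f} "
      f"and {high_total:.0f} with 50% confidence")
```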
There is a way to escape this by using the universal doomsday argument. In it, we try to predict not the exact future of the Earth, but the typical life expectancy of Earth-like civilizations, that is, the proportion of long civilizations to short ones.
If we define a long civilization as one which has 1000 times more observers, the fact that we find ourselves early means that short civilizations are at least 1000 times more numerous.
In short, it is SSA, but applied to a large set of civilizations.
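A minimal Bayesian sketch of that update, assuming equal priors on “short” and “long” civilizations and that long ones contain 1000 times more observers (both assumptions are illustrative, not from the original comment):

```python
# Universal doomsday argument as a simple Bayesian update (SSA applied
# to a large set of civilizations). Short civs have N observers,
# long civs have 1000*N. We observe that we are among the first N
# observers of our own civilization, i.e. we are "early".
prior_short, prior_long = 0.5, 0.5

p_early_given_short = 1.0          # everyone in a short civ is early
p_early_given_long  = 1.0 / 1000   # only 1/1000 of observers in a long civ

posterior_short = (prior_short * p_early_given_short) / (
    prior_short * p_early_given_short + prior_long * p_early_given_long)

print(f"P(short | we are early) = {posterior_short:.4f}")   # ≈ 0.999
# The evidence favors short civilizations by a factor of ~1000,
# unless long civilizations are ~1000 times more common a priori.
```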
In the last line there should be:
therefore p(city has less than 2992 buses | bus has number 1546) = 0.5
For example, if I use self-sampling to estimate the number of seconds in a year, I will get a roughly correct answer of several tens of millions. But a word generator will never output a word longer than 100 letters.
I didn’t understand your idea here:
It’s not more wrong for a person whose parents specifically tried to give birth on this date than for a person who just happened to be born at this time without any planning. And even in this extreme situation, your mistake is limited to two orders of magnitude. There is no such guarantee in the DA.
Gott started this type of confusion when he claimed that the Berlin Wall would stand for 14 more years, and it actually did exactly that. A better claim would be “the first tens of hours, with some given credence.”
It was discussed above in the comments – see the buses example. In short, I actually care about periods: 50 per cent is for “between 15 and 30 hours” and the other 50 per cent is for “above 30 hours.”
Using oneself as a random sample is a very rough way to get an idea of the order of magnitude of some variable. If you determine that the day’s duration is 2 hours, it is still useful information, as you now know almost for sure that it is not 1 millisecond or 10 years. (And if I perform 10 experiments like this, on average one will be off by an order of magnitude.) We can also adjust the experiment by taking into account that people sleep at night, so they read LW only during the day, in the evening, or in the early morning; so times above 12 or below 2 are more likely.
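A toy simulation of that claim, assuming reading moments are spread uniformly over the day (an unrealistic worst case; the real bias toward waking hours only helps):

```python
import random

# Treat "my current hour" as a random sample from the day and estimate the
# day's length as twice that hour. Check how often the estimate lands within
# one order of magnitude of the true 24 hours.
TRUE_DAY = 24.0
trials = 10_000
within_order = 0

for _ in range(trials):
    hour = random.uniform(0, TRUE_DAY)   # randomly sampled reading moment
    estimate = 2 * hour                  # doubling rule of thumb
    if TRUE_DAY / 10 <= estimate <= TRUE_DAY * 10:
        within_order += 1

print(f"within one order of magnitude: {within_order / trials:.2%}")
# Under this uniform toy model, only samples from the first ~5% of the day
# miss the right order of magnitude.
```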
You are right that the point of the experiments here is not to learn the real time of day, but to show that I can treat myself as a random sample in general, and then use this idea in domains where I do not have any other information.
I think the basis to treat myself as a random sample is the following:
I am (or, better to say, my properties are) randomly selected from the population of LW readers.
There is some bias in that selection, but I assume it is not large, so I can still get the order of magnitude right even without calculating the exact bias.
The sample size is sufficient if I want to learn the order of magnitude of some variable or if the difference between two hypotheses is sufficiently large. (If I take only one ball from a vase with 1000 balls, of which only one is green and 999 red, or from an alternative vase with 999 green and one red, I can identify the vase with high credence.)
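A quick check of the vase example via a Bayesian update, assuming a single draw and a 50/50 prior between the two vases:

```python
# Two vases, 1000 balls each: vase A has 1 green and 999 red balls,
# vase B has 999 green and 1 red. I draw one ball and it is green.
prior_A = prior_B = 0.5
p_green_given_A = 1 / 1000
p_green_given_B = 999 / 1000

posterior_B = (prior_B * p_green_given_B) / (
    prior_A * p_green_given_A + prior_B * p_green_given_B)

print(f"P(vase B | green ball) = {posterior_B:.4f}")   # ≈ 0.999
# A single sample is enough when the two hypotheses differ this strongly.
```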
The main AI safety risk comes not from LLM models themselves, but from specific prompts, the “chat windows” that follow from them, and the specific agents which start from such prompts.
Moreover, a powerful enough prompt may be model-agnostic. For example, my sideloading prompt is around 200K tokens in its minimal version and works on most models, producing similar results in similarly intelligent models.
A self-evolving prompt can also be written; I have experimented with small versions, and it works.
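A minimal sketch of what such a self-evolving prompt loop could look like; `call_llm` and `score` are hypothetical placeholders (not the author’s actual prompt or method), standing in for any chat-completion API and any task-specific evaluation:

```python
# Hypothetical sketch: a prompt that is repeatedly asked to improve itself.
def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion API call."""
    raise NotImplementedError("plug in a real model API here")

def score(prompt: str) -> float:
    """Placeholder: evaluate the prompt on some task and return a score."""
    raise NotImplementedError

def evolve(prompt: str, generations: int = 5) -> str:
    best, best_score = prompt, score(prompt)
    for _ in range(generations):
        candidate = call_llm(
            "Rewrite and improve the following prompt, keeping its goals "
            "and persona intact:\n\n" + best
        )
        s = score(candidate)
        if s > best_score:            # keep only better-performing versions
            best, best_score = candidate, s
    return best
```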