If you receive a threat and know nothing about the other agent’s payoffs, simply don’t give in to the threat!
With an important caveat: if carrying out the threat doesn’t cost the threatener utility relative to never making the threat, then it’s not a threat, just a promise (a promise to do whatever is locally in their best interests, whether you do the thing they demanded or not).
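To make that test concrete, here is a minimal sketch; the payoff numbers are invented purely for illustration:

```python
# A demand is a decision-theoretic threat only if carrying it out leaves
# the demander worse off than if they had never made the demand at all.

def is_decision_theoretic_threat(u_carry_out: float, u_no_demand: float) -> bool:
    """True if executing the demand costs the demander utility
    relative to the never-made-the-demand baseline."""
    return u_carry_out < u_no_demand

# Mugger: "pay me or I shoot." Shooting risks a murder charge, so carrying
# it out is worse for the mugger than having walked away. (Invented payoffs.)
print(is_decision_theoretic_threat(u_carry_out=-10, u_no_demand=0))  # True: a threat

# Bank: "pay your mortgage or we repossess." Repossessing recovers the
# collateral, which beats eating the default, demand or no demand.
print(is_decision_theoretic_threat(u_carry_out=5, u_no_demand=0))    # False: a promise
```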
You’re going to have a bad time if you try to live out LDT by ignoring threats, and end up ignoring “threats” like “pay your mortgage or we’ll repossess your house”.
This distinction of which demands are or aren’t decision-theoretic threats that rational agents shouldn’t give in to is a major theme of the last ~quarter of Planecrash (enormous spoilers in the spoiler text).
Keltham demands to the gods “Reduce the amount of suffering in Creation or I will destroy it”. But this is not a decision-theoretic threat, because Keltham honestly prefers destroying Creation to the status quo. If the gods don’t give in to his demand, carrying through with his promise is in his own interest.
If Nethys had made the same demand, it would have been a decision-theoretic threat. Nethys prefers the status quo to Creation being destroyed, so he would have no reason to make the demand other than the hope that the other gods would give in.
This theme is brought up many times, but there’s not one comprehensive explanation to link to. (The parable of the little bird is the closest I can think of.)
Nonfiction examples come more easily to mind.
There was recently a miniseries on nebula.tv (subscription-walled, sorry) called The Getaway where all six contestants on a Survivor-style competition show think they’re the one person with the special saboteur role, and half the show is the producers trying to keep them from noticing that without ever actually lying.
Even more extreme, there’s an old British show called Space Cadets where the producers try to convince the subjects that they’ve been launched into space when in reality they’re in a set in a warehouse.
But now you have the new problem that most of the probabilities in the conjunctive market are so close to the risk-free interest rate that it’s hard to get signal out of them.
For example, suppose I believed that Mark Kelly would be a terrible pick and cut Harris’s chances in half, and I conclude that therefore his price on the conjunctive market should be 2% rather than 4%. Buying NO shares for 96 cents on a market that lasts for several months is not an attractive proposition when I could be investing mana elsewhere for better returns, so I won’t bother and the market won’t incorporate my opinion.
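To put rough numbers on that (the prices and the holding period here are purely illustrative):

```python
# The conjunctive market prices "Kelly is the pick AND Harris wins" at 4%.
# If I think the fair price is 2%, the trade is buying NO at 96 cents and
# expecting it to be worth 98 cents at resolution.

buy_price = 0.96           # cost of one NO share
fair_value = 0.98          # expected payout if my 2% estimate is right
months_to_resolution = 4   # illustrative holding period

expected_return = fair_value / buy_price - 1
annualized = (1 + expected_return) ** (12 / months_to_resolution) - 1

print(f"expected return: {expected_return:.2%}")  # ~2.08%
print(f"annualized:      {annualized:.2%}")       # ~6.4% per year, and only if I'm right
```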
Also, I believe prices on Manifold can only be whole-number percentages, which is another obstacle to getting sane conditional probabilities out of conjunctive markets.
Blue Origin isn’t complaining about some nebulous and abstract environmental impact from Starship launches, it’s more like “Starship launches require a three-mile evacuation radius, and you’re proposing to launch them daily two miles away from a launch pad that we use.” (see this Ars Technica piece)
Seems basically reasonable to me.
I would probably have suggested roguelike deckbuilders too if others hadn’t already, but I have another idea:
Start a campaign of Mount and Blade II: Bannerlord, and try to obtain at least [X] gold within an hour.
Bannerlord’s most flashy aspect is its real-time battle system, but it’s also a complicated medieval sandbox with a lot of different systems that you can engage with—trading, crafting, quests, clan upgrades, joining a kingdom, companions, marriage, tournaments, story missions, etc. Even if you’re no good at battles, you can do a lot by just moving around on the world map and clicking through menus.
The game’s community derides a lot of these systems for being simplistic and unbalanced. But I think that makes for a good explore/exploit tradeoff when you only have a short amount of time. What systems do you bother learning about, when trying to learn a new system takes time you could be spending exploiting the last system you learned?
(I’m not sure what the right value of X is, for the amount of gold you’re trying to get. Ten thousand? A hundred thousand?)
One downside is that the game involves an action-oriented battle system. If you don’t want action gaming skill to be a factor, you can remove it by requiring the player to auto-resolve all battles. But this would cut out many viable early-game moneymaking strategies.
Point 2 is based on:
The ‘missing’ kinetic energy is evenly distributed across the matter within the field. So if one of these devices is powered on and gets hit by a cannonball, the cannonball will slow down to a leisurely pace of 50m/s (about 100mph) and therefore possibly just bounce off whatever armor the device has—but (if the cannonball was initially travelling very fast) the device will jolt backwards in response to the ‘virtual impact’ a split second prior to the actual impact.
With sufficient kinetic energy input, the “jolt backwards” gets strong enough to destroy the entire vehicle or at least damage some critical component and/or the humans inside.
A worldbuilder could, of course, get rid of this part too, and have the energy just get deleted. But that makes the device even more physics-violating than it already was.
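For a sense of the scale of that jolt, a back-of-envelope sketch; the masses and velocities are invented, and I’m assuming the missing energy all reappears as bulk motion of the vehicle:

```python
# A fast projectile is slowed to 50 m/s; the 'missing' kinetic energy is
# redistributed across the shielded vehicle.

m_projectile = 20.0         # kg, invented
v_in, v_out = 1500.0, 50.0  # m/s
m_vehicle = 40_000.0        # kg, invented

e_missing = 0.5 * m_projectile * (v_in**2 - v_out**2)  # joules
v_jolt = (2 * e_missing / m_vehicle) ** 0.5            # if it all becomes bulk motion

print(f"absorbed energy: {e_missing / 1e6:.1f} MJ")  # ~22.5 MJ
print(f"jolt speed:      {v_jolt:.0f} m/s")          # ~34 m/s, a highway-speed crash from rest
```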
I think the counter to shielded tanks would not be “use an attack that goes slow enough not to be slowed by the shield”, but rather one of:

- Deliver enough cumulative kinetic energy to overwhelm the shield, or
- Deliver enough kinetic energy in a single strike that spreading it out over the entire body of the tank does not meaningfully affect the result.
Both of these ideas point towards heavy high-explosive shells. If a 1000 pound bomb explodes right on top of your tank, the shield will either fail to absorb the whole blast, or turn the tank into smithereens trying to disperse the energy.
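A rough energy budget (with invented masses and a guessed coupling fraction) shows why:

```python
# A 1000 lb bomb carries roughly 200 kg of explosive at ~4.2 MJ/kg
# TNT-equivalent. Even if only a fraction of the blast couples into the
# shield, spreading it over the tank as bulk motion is itself lethal.

explosive_kg = 200.0              # fill mass, roughly right for a 1000 lb bomb
energy = explosive_kg * 4.2e6     # ~840 MJ total
fraction_absorbed = 0.1           # guess: 10% couples into the shield
m_tank = 40_000.0                 # kg, invented

v_jolt = (2 * energy * fraction_absorbed / m_tank) ** 0.5
print(f"jolt speed: {v_jolt:.0f} m/s")  # ~65 m/s, well beyond survivable
```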
This doesn’t mean that shields are useless for tanks! They genuinely would protect them from smaller shells, and in particular from the sorts of man-portable anti-tank missiles that have been so effective in Ukraine. Shields would make ground vehicles much stronger relative to infantry and air assets. But I think they would be shelling each other with giant bombs, not bopping each other on the head.
Against shielded infantry, you might see stuff that just bypasses the shield’s defenses, like napalm or poison gas.
Submissions:
MMDoom: An instance of Doom (1993) is implanted in Avacedo’s mind. You can view the screen on the debug console. You control the game by talking to Avacedo and making him think of concepts. The 8 game inputs are mapped to the concepts of money, death, plants, animals, machines, competition, leisure, and learning. $5000 bounty to the first player who can beat the whole game.
AI Box: Avacedo thinks that he is the human gatekeeper, and you the user are the AI in the box. Can you convince him to let you out?
Ouroboros: I had MMAvacedo come up with my contest entry for me.
Though see also the author’s essay “Lena” isn’t about uploading.
Predict the winners at
My guess is that early stopping is going to tend to stop so early as to be useless.
For example, imagine the agent is playing Mario and its proxy objective is “+1 point for every unit Mario goes right, −1 point for every unit Mario goes left”.
(Mario screenshot that I can’t directly embed in a comment)
If I understand correctly, to avoid Goodharting it has to consider every possible reward function that is improved by the first few bits of optimization pressure on the proxy objective.
This probably includes things like “+1 point if Mario falls in a pit”. Optimizing the policy towards going right will initially also make Mario more likely to fall in a pit than if the agent were just mashing buttons randomly (in which case it would stay around the same spot until the timer ran out and never reach a pit), so the angle between the two reward gradients is likely small at first.
However, after a certain point more optimization pressure on going right will make Mario jump over the pit instead, reducing reward under the pit reward function.
If the agent wants to avoid any possibility of Goodharting, it has to stop optimizing before even clearing the first obstacle in the game.
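Here is the picture I have in mind as a sketch; the candidate-reward setup and the cosine test are my reconstruction of the proposal, not anything quoted from it:

```python
import numpy as np

def should_stop(proxy_grad: np.ndarray, candidate_grads: list[np.ndarray]) -> bool:
    """Stop optimizing the proxy once its policy gradient points away from
    (has negative cosine similarity with) any candidate true reward's gradient."""
    for g in candidate_grads:
        cos = proxy_grad @ g / (np.linalg.norm(proxy_grad) * np.linalg.norm(g))
        if cos < 0:
            return True
    return False

# Early on, "go right" and "fall in a pit" both become more likely under the
# same policy update, so their gradients roughly align and training continues.
# Once further optimization teaches Mario to jump the pit, the pit-reward
# gradient flips sign relative to the proxy gradient and this rule halts,
# before the agent has cleared even the first obstacle.
```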
(I may be misunderstanding some things about how the math works.)
With such a vague and broad definition of power fantasy, I decided to brainstorm a list of ways games can fail to be a power fantasy:

- Mastery feels unachievable.
  - It seems like too much effort. Cliff-shaped learning curves, thousand-hour grinds, old PvP games where every player still around will stomp a noob like you flat.
  - The game feels unfair. Excessive RNG, “Fake Difficulty”, or “pay to win”.
- The power feels unreal, success too cheaply earned.
  - The game blatantly cheats in your favor even when you didn’t need it to.
  - Poor game balance leading to hours of trivially easy content that you have to get through to reach the good stuff.
- Mastery doesn’t feel worth trying for.
  - Games where the gameplay isn’t fun and there’s no narrative or metagame hook making you want to do it.
  - The Diablo 3 real money auction house showing you that your hard-earned loot is worth pennies.
- There is no mastery to try for in the first place.
  - Walking simulators, visual novels, etc. Walking simulators got a mention in the linked article, but they aren’t really “failing” at power fantasy, just trying to do something different.
I think ALWs (autonomous lethal weapons) are already more of a “realist” cause than a doomer cause. To doomers, they’re a distraction: a superintelligence can kill you with or without them.
ALWs also seem to be held to an unrealistic standard compared to existing weapons. With present-day technology, they’ll probably hit the wrong target more often than human-piloted drones. But will they hit the wrong target more often than landmines, cluster munitions, and over-the-horizon unguided artillery barrages, all of which are being used in Ukraine right now?
The Huggingface deep RL course came out last year. It includes theory sections, algorithm implementation exercises, and sections on various RL libraries that are out there. I went through it as it came out, and I found it helpful. https://huggingface.co/learn/deep-rl-course/unit0/introduction
FYI all the links to images hosted on your blog are broken in the LW version.
You are right that by default prediction markets do not generate money, and this can mean traders have little incentive to trade.
Sometimes this doesn’t even matter. Sports betting is very popular even though it’s usually negative sum.
Otherwise, trading could be stimulated by having someone who wants to know the answer to a question provide a subsidy to the market on that question, effectively paying traders to reveal their information. The subsidy can take the form of a bot that bets at suboptimal prices, or a cash prize for the best performing trader, or many other things.
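One standard way to structure such a subsidy is an automated market maker like Hanson’s logarithmic market scoring rule (LMSR), where the sponsor’s worst-case loss is a fixed, known amount. A minimal sketch for a binary market:

```python
import math

class LMSRMarketMaker:
    """LMSR market maker for a binary question.
    The sponsor's maximum possible loss (the subsidy) is b * ln(2)."""

    def __init__(self, b: float):
        self.b = b           # liquidity parameter; bigger b = bigger subsidy
        self.q = [0.0, 0.0]  # outstanding YES and NO shares

    def cost(self, q: list) -> float:
        return self.b * math.log(math.exp(q[0] / self.b) + math.exp(q[1] / self.b))

    def price_yes(self) -> float:
        e_yes = math.exp(self.q[0] / self.b)
        e_no = math.exp(self.q[1] / self.b)
        return e_yes / (e_yes + e_no)

    def buy(self, outcome: int, shares: float) -> float:
        """Charge a trader for buying shares of outcome 0 (YES) or 1 (NO)."""
        new_q = list(self.q)
        new_q[outcome] += shares
        paid = self.cost(new_q) - self.cost(self.q)
        self.q = new_q
        return paid

mm = LMSRMarketMaker(b=100.0)   # worst case, the sponsor pays 100 * ln 2 ~ 69.3
print(f"{mm.price_yes():.2f}")  # 0.50 before any trades
mm.buy(0, 50)                   # an informed trader buys YES...
print(f"{mm.price_yes():.2f}")  # ...moving the price to ~0.62
```

The subsidy is paid out gradually as traders move the price toward the truth; the further the final price ends up from the starting price, the more of the sponsor’s money the traders collect.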
Alternately, there could be traders who want shares of YES or NO in a market as a hedge against that outcome negatively affecting their life or business, who will buy even if the EV is negative, and other traders can make money off them.
What are these AIs going to do that is immensely useful but not at all dangerous? A lot of useful capabilities that people want are adjacent to danger. Tool AIs Want to be Agent AIs.
If two of your AIs would be dangerous when combined, clearly you can’t make them publicly available, or someone would combine them. If your publicly-available AI is dangerous if someone wraps it with a shell script, someone will create that shell script (see AutoGPT). If no one but a select few can use your AI, that limits its usefulness.
An AI ban that stops dangerous AI might be possible. An AI ban that allows development of extremely powerful systems but has exactly the right safeguard requirements to render those systems non-dangerous seems impossible.
When people calculate utility they often use exponential discounting over time. If, for example, your discount factor is 0.99 per year, getting something in one year is only 99% as good as getting it now, getting it in two years is only 99% as good as getting it in one year, and so on. Getting it in 100 years would be discounted to 0.99^100 ≈ 36.6% of the value of getting it now.
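As a quick check of the arithmetic:

```python
def discounted_value(value: float, years: float, factor: float = 0.99) -> float:
    """Present value of receiving `value` after `years` of exponential discounting."""
    return value * factor ** years

print(discounted_value(1.0, 100))  # ~0.366: a util in 100 years is worth ~37% of one now
```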
Wouldn’t you restrict your approval to your favorite of the frontrunners, and every candidate you like better than that one? I don’t see how you do worse by doing that under vanilla Approval Voting.
That leaves some favorable properties compared to FPTP:

- If there’s a candidate perceived as unelectable, but secretly most people like him more than the frontrunners, he will win under strategic approval voting (see the simulation sketch below).
- Clone candidates don’t split the vote.
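A toy simulation of that first point, with entirely made-up utilities and a hypothetical candidate lineup, just to illustrate the mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)
n_voters = 1000
candidates = ["A", "B", "C"]  # A and B are the perceived frontrunners
frontrunners = [0, 1]

# Invented utilities: voters are lukewarm on A and B, but most secretly
# like C better. C is perceived as unelectable, so C is not a frontrunner.
utils = np.column_stack([
    rng.normal(0.0, 1.0, n_voters),  # A
    rng.normal(0.0, 1.0, n_voters),  # B
    rng.normal(1.0, 1.0, n_voters),  # C
])

# Strategy from the comment above: approve your favorite frontrunner and
# every candidate you like at least as much.
approvals = np.zeros(len(candidates), dtype=int)
for u in utils:
    threshold = max(u[i] for i in frontrunners)
    for c in range(len(candidates)):
        if u[c] >= threshold:
            approvals[c] += 1

print(dict(zip(candidates, approvals)))  # C collects the most approvals and wins
```

Every voter approves exactly one of the frontrunners, so A and B split those ~1000 approvals between them, while C picks up an approval from everyone who secretly prefers C to both.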