R&Ds human systems http://aboutmako.makopool.com
mako yass
I wonder what the crisis will be.
I think it’s quite likely that if there is a crisis that leads to beneficial response, it’ll be one of these three:
- An undeployed, privately developed system, not yet clearly aligned nor misaligned, either:
  - passes the Humanity’s Last Exam benchmark, demonstrating ASI, and the developers go to Congress and say “we have a godlike creature here, you can all talk to it if you don’t believe us, it’s time to act accordingly.”
  - or, not quite doing that, demonstrates dangerous capability levels in red-teaming, i.e., replication ability, the ability to operate independently, passing the hardest versions of the Turing test, getting access to biolabs, etc. And METR, and hopefully their client, go to Congress and say “This AI stuff is a very dangerous situation, and now we can prove it.”
- A deployed military (beyond-frontier) system demonstrates such generality that, e.g., Palmer Luckey (possibly specifically Palmer Luckey) has to go to Congress and confess something like “that thing we were building for coordinating military operations and providing deterrence, it turns out it can also coordinate other really beneficial tasks, like disaster relief, mining, carbon drawdown, research, you know, curing cancer? But we aren’t being asked to use it for those tasks. So, what are we supposed to do? Shouldn’t we be using it for that kind of thing?” And this could lead to some mildly dystopian outcomes, or not. I don’t think Congress or the emerging post-prime defence research scene is evil; I think it’s pretty likely they’d decide to share it with the world (though I doubt they’d seek direct input from the rest of the world on how it should be aligned).
Some of the crises I expect, I guess, won’t be recognized as crises. Boiled frog situations.
A private system passes those tests, but instead of doing the responsible thing and raising the alarm, the company just treats it like a normal release and sells it. (And the die is rolled, and we live or we don’t.)
Or crises in the deployment of AI that reinforce the “AI as tool” frame so deeply that it becomes harder to discuss preparations for AI as independent agents:
Automated invasion: a country is successfully invaded, disarmed, controlled, and reshaped with almost entirely automated systems, with minimal human presence from the invading side. Probable in Gaza or Taiwan.
It’s hard to imagine a useful policy response to this. I can only imagine this leading to reactions like “Wow. So dystopian and oppressive. They Should Not have done that and we should write them some sternly worded letters at the UN. Also let’s build stronger AI weapons so that they can’t do that to us.”
A terrorist attack or a targeted assassination using lethal autonomous weapons.
I expect this to be treated as if it’s just a new kind of bomb.
This is interesting. In general the game does sound like the kind of fun I expect to find in these parts. I’d like to play it. It sounds like it really can be played as a cohabitive game, and maybe it was even initially designed to be played that way?[1], but it looks to me like most people don’t understand it this way today. I’m unable to find this manual you quote. I’m coming across multiple reports that victory = winning[2].
Even just introducing the optional concept of victory muddies the exercise by mixing it up with a zero-sum one in an ambiguous way. IME, many players, even hearing that, will just play this for victory alone and compromise their win condition while pretending not to be doing that, in the hope of deceiving other players about their agenda, so it becomes hard to plan with them. This wouldn’t necessarily ruin the game, but it would lead to a situation where those players are learning bad lessons.
- ^ I’d be curious to know what the original rulebook says; it sounds like it’s not always used today?
- ^ The first review I found (Phasing Player) presents it as a fully zero-sum game and completely declines to mention multi-win outcomes (43 seconds).
A moral code is invented[1] by a group of people to benefit the group as a whole. It sometimes demands sacrifice from individuals, but a good one usually has the quality that at some point in a person’s past, they would have voluntarily signed on with it. Redistribution is a good example. If you have a concave utility function, and if you don’t know where you’ll end up in life, you should be willing to sign a pledge to later share your resources with less fortunate people who’ve also signed the pledge, just in case you become one of the less fortunate. The downside of not being covered in that case is much larger than the upside of not having to share in the other case.
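A toy version of that calculation (illustrative numbers only): say you face even odds of ending up with resources 100 or 0, and your utility is the concave $u(x)=\sqrt{x}$. Refusing the pledge gives expected utility 5; pledging to pool and split evenly guarantees 50, worth about 7.07:

$$\tfrac{1}{2}\sqrt{100} + \tfrac{1}{2}\sqrt{0} \;=\; 5 \;<\; \sqrt{50} \;\approx\; 7.07.$$

The concavity is what makes the guaranteed middle outcome worth more than the gamble.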
For convenience, we could decide to make the pledge mandatory and the coverage universal (i.e., taxes and welfare), since there aren’t a lot of humans who would decline that deal in good faith. (Perhaps some humans are genuinely convex egoists and wouldn’t sign that deal, but we outnumber them, and accommodating them would be inconvenient, so we ignore them.)
If we’re pure of heart, we could make the pledge acausal and implicit and adhere to it without any enforcement mechanisms, and I think that’s what morality usually is or should be in the common sense.
But anyway, it sometimes seems to me that you often advocate a morality regarding AI relations that doesn’t benefit anyone who currently exists, or the coalition that you are a part of. This seems like a mistake. Or worse. I wonder if it comes from a place of concern that… if we had public consensus that humans would prefer to retain full control over the lightcone, then we’d end up having stupid and unnecessary conflicts with the AIs over that, while if we pretend we’re perfectly happy to share, relations will be better? You may feel that as long as we survive and get a piece, it’s not worth fighting for a larger piece? The damages from war would be so bad for both sides that we’d prefer to just give them most of the lightcone now?
And I think stupid wars aren’t possible under ASI-level information technology. If we had the capacity to share information, find out who’d win a war, and skip to a surrender deal, doing so would always have higher EV for both sides than actually fighting. The reason wars are not skipped that way today is that we still lack the capacity to simultaneously and mutually exchange proofs of force capacity, but we’re getting closer to having that every day. Generally, in that era, coexisting under confessed value differences will be pretty easy. Honestly, I feel like it already ought to be easy, for humans, if we’d get serious about it.
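A toy model of that argument (illustrative notation, deliberately crude): suppose both sides come to agree that the attacker wins with probability $p$, and that fighting destroys a fraction $c>0$ of the contested value $V$. Then a negotiated $p:(1-p)$ split beats fighting for both sides simultaneously:

$$pV \;>\; p(1-c)V \qquad\text{and}\qquad (1-p)V \;>\; (1-p)(1-c)V.$$

The whole thing hinges on both sides actually agreeing on $p$, which is what the simultaneous exchange of proofs of force capacity would buy.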
- ^ Though, as Singer says, much of morality is invented only in the same sense as mathematics is invented, being so non-arbitrary that it seems to have a kind of external, observer-independent existence and fairly universal truths, which powerful AIs are likely to also discover. But the moralities in that class are much weaker (I don’t think Singer fully recognises the extent of this), and I don’t believe they have anything to say about this issue.
- ^ Do you believe there’s a god who’ll reward you for adhering to this kind of view-from-nowhere morality? If not, why believe in it?
Jellychip seems like a necessary tutorial game. I sense comedy in the fact that everyone’s allowed to keep secrets and intuitively will try to do something with secrecy despite it being totally wrongheaded. Like the only real difficulty of the game is reaching the decision to throw away your secrecy.
Escaping the island is the best outcome for you. Surviving is the second best outcome. Dying is the worst outcome.
You don’t mention how good or bad they are relative to each other though :) An agent cannot make decisions under uncertainty without knowing that.
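To make that concrete (illustrative notation, not from the game): suppose an escape attempt succeeds with probability $q$ and kills you otherwise, while staying put guarantees survival. With utilities $e > s > d$ for escaping, surviving, and dying, attempting escape is only correct when

$$q\,e + (1-q)\,d \;>\; s \quad\Longleftrightarrow\quad q \;>\; \frac{s-d}{e-d},$$

and that threshold depends entirely on how far apart the three outcomes are, not just on their order.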
I usually try to avoid having to explain this to players by either making it a score game or making the outcomes binary. But the draw towards having more than two outcomes is enticing. I guess in a roleplaying scenario, the question of just how good each ending is for your character is something players would like to decide for themselves. I guess as long as people are buying into the theme well enough, it doesn’t need to be made explicit; in fact, not making it explicit makes it clearer that player utilities aren’t comparable, and that makes it easier for people to get into the cohabitive mindset.
So now I’m imagining a game where different factions have completely different outcomes. None of them are conquest, nor death. They’re all weird stuff like “found my mother’s secret garden” or “fulfilled a promise to a dead friend” or “experienced flight”.
the hook
I generally think of hookness as “oh, this game tests a skill that I really want to have, and I feel myself getting better at it as I engage with the game, so I’ll deepen my engagement”.
There’s another component of it that I’m having difficulty with, which is “I feel like I will not be rejected if I ask friends to play this with me.” (Well, I think I could get anyone to play it once; the second time is the difficult one.) And for me, I see this quality in very few board games, and to get there you need to be better than the best board games out there, because you’re competing with them, so that’s becoming very difficult. But since cohabitive games rule, that should be possible for us.
And on that, I glimpsed something recently that I haven’t quite unpacked. There’s a certain something about the way Efka talks about Arcs here … he admitted that it wasn’t necessarily all fun. It was an ordeal. And just visually, the game looks like a serious undertaking. Something you’d look brave for sitting in front of. It also looks kind of fascinating. Like it would draw people in. He presents it with the same kind of energy as one would present the findings of a major government conspiracy investigation, or the melting of the clathrates. It does not matter whether you want to play this game, you have to, there’s no decision to be made as to whether to play it or not, it’s here, it fills the room.
And we really could bring an energy like that, because I think there are some really grim findings along the path to cohabitive enlightenment. But I’m wary of leaning into that, because I think cohabitive enlightenment is also the true name of peace. Arcs is apparently controversial. I do not want cohabitive games to be controversial.
(Plus a certain degree of mathematician crankery: his page on Google Image Search, and how it disproves AI.)
I’m starting to wonder if a lot/all of the people who are very cynical about the feasibility of ASI have some crank belief or other like that. Plenty of people have private religion, for instance. And sometimes that religion informs their decisions, but they never tell anyone the real reasons underlying these decisions, because they know they could never justify them. They instead say a load of other stuff they made up to support the decisions that never quite adds up to a coherent position because they’re leaving something load-bearing out.
I don’t think the “intelligence consistently leads to self-annihilation” hypothesis is possible. At least a few times, it would amount to robust self-preservation.
Well… I guess I think it boils down to the dark forest hypothesis. The question is whether your volume of space is likely to contain a certain number of berserkers, and the number wouldn’t have to be large for them to suppress the whole thing.
I’ve always felt the logic of berserker extortion doesn’t work, but occasionally you’d get a species that just earnestly wants the forest to be dark and isn’t very troubled by their own extinction, no extortion logic required. This would be extremely rare, but the question is, how rare.
Light-speed migrations with no borders mean homogeneous ecosystems, which can be very constrained things.
In our ecosystems, we get pockets of experimentation. There are whole islands where the birds were allowed to be impractical aesthetes (Indonesia) or flightless blobs (New Zealand). In the field-animal world, islands don’t exist; pockets of experimentation like this might not occur anywhere in the observable universe.
If general intelligence for field-animals costs a lot and has no immediate advantages (consistently takes, say, a thousand years of ornament status before it becomes profitable), then it wouldn’t get to arise. Could that be the case?
We could back-define “ploitation” as “getting Shapley-paid”.
Yeah. But if you give up on reasoning about/approximating Solomonoff, then where do you get your priors? Do you have a better approach?
Buried somewhere in most contemporary Bayesians’ priors is the Solomonoff prior (the prior that the most likely observations are those that have short generating machine encodings). Do we have a standard symbol for the Solomonoff prior? Claude suggests that is the most common, but is more often used as a distribution function, or perhaps for Kolmogorov? (which I like because it can also be thought to stand for “knowledgebase”, although really it doesn’t represent knowledge; it pretty much represents something prior to knowledge)
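For concreteness, one standard way to write the discrete Solomonoff prior (the letter here is just for illustration): with $U$ a universal prefix machine and $|p|$ the length in bits of program $p$,

$$m(x) \;=\; \sum_{p \,:\, U(p) = x} 2^{-|p|},$$

so an observation is a priori as likely as the total weight of the programs that generate it, with short programs weighted exponentially more heavily.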
I’d just define exploitation to be precisely the opposite of Shapley bargaining: situations where a person is not being compensated in proportion to their bargaining power.
This definition encompasses any situation where a person has grievances and it makes sense for them to complain about them and take a stand, or where striking could reasonably be expected to lead to a stable bargaining equilibrium with higher net utility (not all strikes fall into this category).
This definition also doesn’t fully capture the common sense meaning of exploitation, but I don’t think a useful concept can.
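To make “Shapley-paid” concrete, here’s a toy computation (the game and numbers are illustrative, not from anything above): each player’s Shapley value is their marginal contribution averaged over every order in which the coalition could have assembled, and exploitation under this definition would be persistently receiving less than that.

```python
from itertools import permutations

def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        previous = 0.0
        for p in order:
            coalition.add(p)
            current = value(frozenset(coalition))
            totals[p] += current - previous
            previous = current
    return {p: totals[p] / len(orders) for p in players}

# Toy game: workers produce nothing without the owner's capital;
# owner + one worker produces 10, owner + both workers produces 16.
def value(coalition):
    if "owner" not in coalition:
        return 0.0
    return {0: 0.0, 1: 10.0, 2: 16.0}[len(coalition) - 1]

print(shapley_values(["owner", "w1", "w2"], value))
# ≈ {'owner': 8.67, 'w1': 3.67, 'w2': 3.67}
# Paying a worker well below ~3.67 would count as exploitation here.
```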
As a consumer I would probably only pay about $250 for the Unitree B2-W wheeled robot dog, because my only use for it is that I want to ride it like a skateboard, and I’m not sure it can do even that.
I see two major non-consumer applications: street-to-door delivery (it can handle stairs and curbs), and war (it can carry heavy things (e.g., a gun) over long distances across uneven terrain).
So, Unitree… do they receive any subsidies?
Okay, if send rate gives you a reason to think it’s spam. Presumably you can set up a system that lets you invade the messages of new accounts that are sending large numbers of messages, without requiring you to cross the bright line of doing raw queries.
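Roughly the kind of thing I mean (the names and thresholds are hypothetical, not any particular forum’s actual API):

```python
from dataclasses import dataclass

# Hypothetical thresholds; a real system would tune these.
NEW_ACCOUNT_MAX_AGE_DAYS = 7
SUSPICIOUS_MESSAGES_PER_DAY = 50

@dataclass
class Account:
    id: str
    age_days: int
    messages_sent_today: int

def flag_for_review(account: Account) -> bool:
    """Surface new, high-send-rate accounts to moderators.

    Moderators only read messages surfaced by this rule, rather than
    running raw queries over everyone's private messages.
    """
    return (account.age_days <= NEW_ACCOUNT_MAX_AGE_DAYS
            and account.messages_sent_today > SUSPICIOUS_MESSAGES_PER_DAY)
```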
Any point that you can sloganize and wave around on a picket sign is not the true point, but that’s not because the point is fundamentally inarticulable, it just requires more than one picket sign to locate it. Perhaps ten could do it.
The human struggle to find purpose is a problem of incidentally very weak integration or dialog between reason and the rest of the brain, and self-delusional but mostly adaptive masking of one’s purpose for political positioning. I doubt there’s anything fundamentally intractable about it. If we can get the machines to want to carry our purposes, I think they’ll figure it out just fine.
Also… you can get philosophical about it, but the reality is, there are happy people; their purpose is clear to them: to create a beautiful life for themselves and their loved ones. The people you see at NeurIPS are more likely to be the kind of hungry, high-achieving professionals who are not happy in that way, and perhaps don’t want to be. So maybe you’re diagnosing a legitimately enduring collective issue (the sorts of humans who end up on top tend to be the ones who are capable of divorcing their actions from a direct sense of purpose, or the types of people who are pathologically busy and who lose sight of the point of it all, or never have the chance to cultivate a sense for it in the first place). It may not be human nature, but it could be humanity nature. Sure.
But that’s still a problem that can be solved by having more intelligence. If you can find a way to manufacture more intelligence per human than the human baseline, that’s going to be a pretty good approach to it.
Conditions where a collective loss is no worse than an individual loss. A faction who’s on the way to losing will be perfectly willing to risk coal extinction, and may even threaten to cross the threshold deliberately to extort other players.
Do people ever talk about dragons and dinosaurs in the same contexts? If so you’re creating ambiguities. If not (and I’m having difficulty thinking of any such contexts) then it’s not going to create many ambiguities so it’s harder to object.
I think I’ve been calling it “salvaging”. To salvage a concept/word allows us to keep using it mostly the same, and to assign familiar and intuitive symbols to our terms, while intensely annoying people with the fact that our definition is different from the normal one and thus constantly creates confusion.
Mm, scenario where mass unemployment can be framed as a discrete event with a name and a face.
I guess I think it’s just as likely there isn’t an event: human-run businesses die off, new businesses arise, and none of them outwardly emphasise their automation levels. The press can’t turn it into a scary story, because automation and foreclosures are nothing fundamentally new (new only in quantity, and you can’t photograph a quantity), and the public become complicit by buying their cheaper, higher-quality goods and services, so appetite for public discussion remains low.