R&Ds human systems http://aboutmako.makopool.com
mako yass
The people who think of utility in the way the article is critiquing don’t know what utility actually is; presenting a critique of this tangible notion of utility as a critique of utility in general takes the target audience further away from understanding what utility is.
A utility function is a property of a system rather than a physical thing (like, eg, voltage, or inertia, or entropy). Not being a simple physical substance doesn’t make it fictional.
It’s extremely non-fictional. A human’s utility function encompasses literally everything they care about, ie, everything they’re willing to kill for.
It seems to be impossible for a human to fully articulate what the human utility function is exactly, but that’s just a peculiarity of humans rather than a universal characteristic of utility functions. Other agents could have very simple utility functions, and humans are likely to become able to know their own utility function definitively at some point in the next century.
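A minimal sketch of what I mean by “a property of a system”, in Python (the outcomes, probabilities, and paperclip theme are all hypothetical illustrations, not anything from the post): the utility function is just the valuation that the system’s choices maximize in expectation.

```python
# A toy agent with a very simple utility function. The function isn't a
# physical substance located anywhere inside the agent; it's a description
# of what the agent's choices maximize.

def utility(outcome: str) -> float:
    """This agent only cares about paperclips."""
    return {"no_paperclip": 0.0, "one_paperclip": 1.0, "two_paperclips": 2.0}[outcome]

# Each available action is a lottery over outcomes.
actions = {
    "do_nothing":  {"no_paperclip": 1.0},
    "build_one":   {"one_paperclip": 0.9, "no_paperclip": 0.1},
    "risky_build": {"two_paperclips": 0.5, "no_paperclip": 0.5},
}

def expected_utility(lottery: dict[str, float]) -> float:
    return sum(p * utility(outcome) for outcome, p in lottery.items())

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # "risky_build": EU 1.0 beats "build_one" at 0.9 and "do_nothing" at 0.0
```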
Contemplating an argument that free response rarely gets more accurate results for questions like this, because listing the most common answers as checkboxes helps respondents to remember all of the answers that’re true of them.
I’d be surprised if LLM use for therapy or summarization is that low irl, and I’d expect people would’ve just forgotten to mention those use cases. Hope they’ll be in the option list this year.
Hmm, I wonder if a lot of trends are drastically underestimated because surveyors are getting essentially false statistics from the “Other” gutter.
Apparently Anthropic in theory could have released Claude 1 before ChatGPT came out? https://www.youtube.com/live/esCSpbDPJik?si=gLJ4d5ZSKTxXsRVm&t=335
I think the situation would be very different if they had.
Were OpenAI also, in theory, able to release sooner than they did, though?
The assumption that being totally dead/being aerosolised/being decayed vacuum can’t be a future experience is unprovable. Panpsychism should be our null hypothesis[1], and there never has been and never can be any direct measurement of consciousness that could take us away from the null hypothesis.
Which is to say, I believe it’s possible to be dead.
- ^
The negation, that there’s something special about humans that makes them eligible to experience, is clearly held up by a conflation of having experiences with reporting experiences, plus the fact that humans are the only things that report anything.
I have preferences about how things are after I stop existing. Mostly about other people, whom I love and, at times, want there to be more of.
I am not an Epicurean, and I am somewhat skeptical of the reality of Epicureans.
It seems like you’re assuming a value system where the ratio of positive to negative experience matters but where the ratio of positive to null (dead timelines) experiences doesn’t matter. I don’t think that’s the right way to salvage the human utility function, personally.
Okay? I said they’re behind in high precision machine tooling, not machine tooling in general. That was the point of the video.
Admittedly, I’m not sure what the significance of this is. To make the fastest missiles I’m sure you’d need the best machine tools, but maybe you don’t need the fastest missiles if you can make twice as many. Manufacturing automation is much harder if there’s random error in the positions of things, but whether we’re dealing with that amount of error, I’m not sure.
I’d guess low-grade machine tools also probably require high-grade machine tools to make.
Fascinating. China has always lagged far behind the rest of the world in high-precision machining, and is still a long way behind; they have to buy all of those machines from other countries. The reasons appear complex.
All of the US and European machine tools that go to China use hardware monitoring and tamperproofing to prevent reverse engineering or misuse. There was a time when US aerospace machine tools reported to the DOC and DOD.
Regarding privacy-preserving AI auditing, I notice this is an area where you really need to have a solution to adversarial robustness, given that the adversary 1) is a nation-state, 2) has complete knowledge of the auditor’s training process and probably its weights (they couldn’t really agree to an inspection deal if they didn’t trust the auditors to give accurate reports), 3) knows and controls the data the auditor will be inspecting, and 4) never has to show that data to you (if they pass the audit).
Given that you’re assuming computers can’t practically be secured (though I doubt that very much[1]), it seems unlikely that a pre-AGI AI auditor could be secured in that situation either.
- ^
Tech stacks in training and inference centers are shallow enough (or vertically integrated enough) to rewrite, and rewrites and formal verification become cheaper as math-coding agents improve. Hardware is routinely entirely replaced. Preventing proliferation of weights and techniques also requires ironclad security, so it’s very difficult to imagine the council successfully framing the acquisition of fully fortified computers as an illicit, threatening behaviour and forbidding it.
It seems to assume that we could stably sit at a level of security that’s enough to keep terrorists out but not enough to keep peers out, without existing efforts in conventional security bleeding over into full fortification programmes.
Mm, scenario where mass unemployment can be framed as a discrete event with a name and a face.
I guess I think it’s just as likely there isn’t an event: human-run businesses die off, new businesses arise, none of them outwardly emphasise their automation levels, the press can’t turn it into a scary story because automation and foreclosures are nothing fundamentally new (new only in quantity, and you can’t photograph a quantity), and the public become complicit by buying the cheaper, higher-quality goods and services, so appetite for public discussion remains low.
I wonder what the crisis will be.
I think it’s quite likely that if there is a crisis that leads to a beneficial response, it’ll be one of these three:
An undeployed, privately developed system, not yet clearly aligned or misaligned, either:
passes the Humanity’s Last Exam benchmark, demonstrating ASI, and the developers go to Congress and say “we have a godlike creature here, you can all talk to it if you don’t believe us, it’s time to act accordingly.”
Not quite doing that, but demonstrating dangerous capability levels in red-teaming, ie, replication ability, the ability to operate independently, to pass the hardest versions of the Turing test, to get access to biolabs, etc. And METR and hopefully their client go to Congress and say “This AI stuff is a very dangerous situation and now we can prove it.”
A deployed military (beyond-frontier) system demonstrates such generality that, eg, Palmer Luckey (possibly specifically Palmer Luckey) has to go to Congress and confess something like “that thing we were building for coordinating military operations and providing deterrence, turns out it can also coordinate other really beneficial tasks like disaster relief, mining, carbon drawdown, research, you know, curing cancer? But we aren’t being asked to use it for those tasks. So, what are we supposed to do? Shouldn’t we be using it for that kind of thing?” And this could lead to some mildly dystopian outcomes, or not; I don’t think Congress or the emerging post-prime defence research scene is evil, and I think it’s pretty likely they’d decide to share it with the world (though I doubt they’d seek direct input from the rest of the world on how it should be aligned).
Some of the crises I expect, I guess, won’t be recognized as crises. Boiled frog situations.
A private system passes those tests, but instead of doing the responsible thing and raising the alarm, the company just treats it like a normal release and sells it. (And the die is rolled, and we live or we don’t.)
Or crises in the deployment of AI that reinforce the “AI as tool” frame so deeply that it becomes harder to discuss preparations for AI as independent agents:
Automated invasion: a country is successfully invaded, disarmed, controlled, and reshaped with almost entirely automated systems and minimal human presence from the invading side. Probable in Gaza or Taiwan.
It’s hard to imagine a useful policy response to this. I can only imagine this leading to reactions like “Wow. So dystopian and oppressive. They Should Not have done that and we should write them some sternly worded letters at the UN. Also let’s build stronger AI weapons so that they can’t do that to us.”
A terrorist attack or a targeted assassination using lethal autonomous weapons.
I expect this to be treated as if it’s just a new kind of bomb.
This is interesting. In general the game does sound like the kind of fun I expect to find in these parts. I’d like to play it. It sounds like it really can be played as a cohabitive game, and maybe it was even initially designed to be played that way[1], but it looks to me like most people don’t understand it this way today. I’m unable to find this manual you quote. I’m coming across multiple reports that victory = winning[2].
Even just introducing the optional concept of victory muddies the exercise by mixing it up with a zero-sum one in an ambiguous way. IME many players, even hearing that, will just play for victory alone, compromising their win condition while pretending not to, in the hope of deceiving other players about their agenda, so it becomes hard to plan with them. This wouldn’t necessarily ruin the game, but it would lead to a situation where those players are learning bad lessons.
- ^
I’d be curious to know what the original rulebook says; it sounds like it’s not always used today?
- ^
The first review I found (Phasing Player) presents it as a fully zero-sum game and completely declines to mention multi-win outcomes (43 seconds in).
A moral code is invented[1] by a group of people to benefit the group as a whole; it sometimes demands sacrifice from individuals, but a good one usually has the quality that, at some point in a person’s past, they would have voluntarily signed on with it. Redistribution is a good example. If you have a concave utility function, and if you don’t know where you’ll end up in life, you should be willing to sign a pledge to later share your resources with less fortunate people who’ve also signed the pledge, just in case you become one of the less fortunate. The downside of not being covered in that case is much larger than the upside of not having to share in the other case.
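As a toy illustration of the concave-utility point (my numbers, not anything from anyone’s actual situation): with a square-root utility function, signing the pledge beats going it alone from behind the veil of ignorance.

```python
import math

# Hypothetical veil-of-ignorance numbers. Concave utility (sqrt) means each
# extra unit of resources is worth less than the previous one.
u = math.sqrt

p_unlucky = 0.5               # chance you turn out to be one of the less fortunate
lucky, unlucky = 100.0, 0.0   # resources you'd end up with in each case
pooled = (lucky + unlucky) / 2  # the pledge: signatories pool and split

eu_no_pledge = (1 - p_unlucky) * u(lucky) + p_unlucky * u(unlucky)  # 5.0
eu_pledge = u(pooled)                                               # ~7.07

print(eu_pledge > eu_no_pledge)  # True: signing is the better bet before you know your luck
```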
For convenience, we could decide to make the pledge mandatory and the coverage universal (ie, taxes and welfare), since there aren’t a lot of humans who would decline that deal in good faith. (Perhaps some humans are genuinely convex egoists and wouldn’t sign that deal, but we outnumber them, and accommodating them would be inconvenient, so we ignore them.)
If we’re pure of heart, we could make the pledge acausal and implicit and adhere to it without any enforcement mechanisms, and I think that’s what morality usually is or should be in the common sense.
But anyway, it sometimes seems to me that you often advocate a morality regarding AI relations that doesn’t benefit anyone who currently exists, or the coalition that you are a part of. This seems like a mistake. Or worse. I wonder if it comes from a place of concern that… if we had public consensus that humans would prefer to retain full control over the lightcone, then we’d end up having stupid and unnecessary conflicts with the AIs over that, while, if we pretend we’re perfectly happy to share, relations will be better? You may feel that as long as we survive and get a piece, it’s not worth fighting for a larger piece? The damages from war would be so bad for both sides that we’d prefer to just give them most of the lightcone now?
And I think stupid wars aren’t possible under ASI-level information technology. If we had the capacity to share information, find out who’d win a war, and skip straight to the surrender deal, doing so would always have higher EV for both sides than actually fighting. The reason wars are not skipped that way today is that we still lack the capacity to simultaneously and mutually exchange proofs of force capacity, but we’re getting closer to having that every day. Generally, in that era, coexisting under confessed value differences will be pretty easy. Honestly, I feel like it already ought to be easy, for humans, if we’d get serious about it.
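A minimal sketch of that EV claim with made-up numbers: once both sides can verify the same win probability, any split near the expected war outcome beats fighting for both of them, because fighting destroys value that a deal doesn’t.

```python
# Toy bargaining numbers (entirely hypothetical).
prize = 100.0      # value of whatever would be fought over
p_a_wins = 0.7     # win probability both sides would accept, given exchanged proofs of force capacity
cost_a, cost_b = 20.0, 20.0   # value each side burns by actually fighting

ev_war_a = p_a_wins * prize - cost_a          # 50.0
ev_war_b = (1 - p_a_wins) * prize - cost_b    # 10.0

# Any deal giving A between 50 and 90 leaves both sides better off than war,
# e.g. splitting at the expected battlefield outcome:
deal_a = p_a_wins * prize    # 70.0
deal_b = prize - deal_a      # 30.0
print(deal_a > ev_war_a and deal_b > ev_war_b)  # True
```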
- ^
Though, as Singer says, much of morality is invented only in the same sense as mathematics is invented, being so non-arbitrary that it seems to have a kind of external observer-independent existence and fairly universal truths, which powerful AIs are likely to also discover. But the moralities in that class are much weaker (I don’t think Singer fully recognises the extent of this), and I don’t believe they have anything to say about this issue.
Do you believe there’s a god who’ll reward you for adhering to this kind of view-from-nowhere morality? If not, why believe in it?
Jellychip seems like a necessary tutorial game. I sense comedy in the fact that everyone’s allowed to keep secrets and intuitively will try to do something with secrecy despite it being totally wrongheaded. Like the only real difficulty of the game is reaching the decision to throw away your secrecy.
Escaping the island is the best outcome for you. Surviving is the second best outcome. Dying is the worst outcome.
You don’t mention how good or bad they are relative to each other though :) An agent can’t make decisions under uncertainty without knowing that.
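To illustrate with made-up numbers: two players can share the ordering escape > survive > die and still rationally make opposite choices about a risky escape attempt, depending on how good mere survival is relative to the other two.

```python
# Hypothetical utilities showing why the ranking alone isn't enough.
p_escape = 0.5   # chance a risky escape attempt works; failure means death
u_die = 0.0

# Both players rank escape > survive > die, but weigh survival differently.
players = {
    "cautious":  {"escape": 1.0, "survive": 0.9},  # surviving is nearly as good as escaping
    "desperate": {"escape": 1.0, "survive": 0.1},  # surviving is barely better than dying
}

for name, u in players.items():
    eu_attempt = p_escape * u["escape"] + (1 - p_escape) * u_die
    choice = "attempt escape" if eu_attempt > u["survive"] else "stay put"
    print(name, choice)   # cautious: stay put; desperate: attempt escape
```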
I usually try to avoid having to explain this to players by either making it a score game or making the outcomes binary. But the draw towards having more than two outcomes is enticing. I guess in a roleplaying scenario, the question of just how good each ending is for your character is something players would like to decide for themselves. I guess as long as people are buying into the theme well enough, it doesn’t need to be made explicit; in fact, not making it explicit makes it clearer that player utilities aren’t comparable, and that makes it easier for people to get into the cohabitive mindset.
So now I’m imagining a game where different factions have completely different outcomes. None of them are conquest, nor death. They’re all weird stuff like “found my mother’s secret garden” or “fulfilled a promise to a dead friend” or “experienced flight”.
the hook
I generally think of hookness as “oh, this game tests a skill that I really want to have, and I feel myself getting better at it as I engage with the game, so I’ll deepen my engagement”.
There’s another component of it that I’m having difficulty with, which is “I feel like I will not be rejected if I ask friends to play this with me.” (Well, I think I could get anyone to play it once; the second time is the difficult one.) And for me, I see this quality in very few board games, and to get there you need to be better than the best board games out there, because you’re competing with them, so that’s becoming very difficult. But since cohabitive games rule, that should be possible for us.
And on that, I glimpsed something recently that I haven’t quite unpacked. There’s a certain something about the way Efka talks about Arcs here … he admitted that it wasn’t necessarily all fun. It was an ordeal. And just visually, the game looks like a serious undertaking. Something you’d look brave for sitting in front of. It also looks kind of fascinating. Like it would draw people in. He presents it with the same kind of energy as one would present the findings of a major government conspiracy investigation, or the melting of the clathrates. It does not matter whether you want to play this game, you have to, there’s no decision to be made as to whether to play it or not, it’s here, it fills the room.
And we really could bring an energy like that, because I think there are some really grim findings along the path to cohabitive enlightenment. But I’m wary of leaning into that, because I think cohabitive enlightenment is also the true name of peace. Arcs is apparently controversial. I do not want cohabitive games to be controversial.
(Plus a certain degree of mathematician crankery: his page on Google Image Search, and how it disproves AI.)
I’m starting to wonder if a lot/all of the people who are very cynical about the feasibility of ASI have some crank belief or other like that. Plenty of people have private religion, for instance. And sometimes that religion informs their decisions, but they never tell anyone the real reasons underlying these decisions, because they know they could never justify them. They instead say a load of other stuff they made up to support the decisions that never quite adds up to a coherent position because they’re leaving something load-bearing out.
I don’t think the “intelligence consistently leads to self-annihilation” hypothesis is possible. At least a few times, it would instead amount to robust self-preservation.
Well.. I guess I think it boils down to the dark forest hypothesis. The question is whether your volume of space is likely to contain a certain number of berserkers, and the number wouldn’t have to be large for them to suppress the whole thing.
I’ve always felt the logic of berserker extortion doesn’t work, but occasionally you’d get a species that just earnestly wants the forest to be dark and isn’t very troubled by their own extinction, no extortion logic required. This would be extremely rare, but the question is, how rare.
Light-speed migrations with no borders mean homogeneous ecosystems, which can be very constrained things.
In our ecosystems, we get pockets of experimentation. There are whole islands where the birds were allowed to be impractical aesthetes (Indonesia) or flightless blobs (New Zealand). In the field-animal world, islands don’t exist; pockets of experimentation like this might not occur anywhere in the observable universe.
If general intelligence for field-animals costs a lot and has no immediate advantages (consistently takes, say, a thousand years of ornament status before it becomes profitable), then it wouldn’t get to arise. Could that be the case?
These are not concepts of utility that I’ve ever seen anyone explicitly espouse, especially not here, the place to which it was posted.