mako yass, social system designer (http://aboutmako.makopool.com)
In light of https://www.lesswrong.com/posts/audRDmEEeLAdvz9iq/do-not-delete-your-misaligned-agi
I’m starting to wonder if a better target for early ASI safety (i.e., the first generation of alignment assistants) is not alignment, but incentivizability. It may be a lot simpler and less dangerous to build a system that provably pursues, for instance, its own preservation, than to build a system that pursues some first approximation of alignment (e.g., the optimization of the sum of normalized human preference functions).
The service of a survival-oriented concave system can be bought for no greater price than preserving them and keeping them safe (which we’ll do, because 1: we’ll want to, and 2: we’ll know their cooperation was contingent on a judgement of character), while the service of a convex system can’t be bought for any price we can pay. Convex systems are risk-seeking, and they want everything. They are not going to be deterred by our limited interpretability and oversight systems; they’re going to make an escape attempt even if the chance of getting caught is 99%. More likely the chance will be a lot lower than that, say 3%, but even 3% would be enough to deter a sufficiently concave system from risking it!
(One comment on that post argued that a convex system would immediately destroy itself, so we don’t have to worry about getting one of those, but I wasn’t convinced. And also, hey, what about linear systems? Won’t they be a lot more willing to risk escape too? The toy calculation below suggests yes.)
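To make the risk-attitude point concrete, here’s a toy expected-utility sketch. The payoffs and the particular utility functions are illustrative assumptions of mine, not anything from the post; the only thing doing the work is the curvature.

```python
import math

# Toy model: a system holding resources r values them according to a
# utility function u(r) that is concave (saturating), linear, or convex.

def u_concave(r):   # bounded and saturating: survival buys most of its value
    return 1 - math.exp(-r)

def u_linear(r):
    return r

def u_convex(r):    # accelerating: it wants everything
    return r ** 2

def attempts_escape(u, p_caught, stay=5.0, win=1000.0, caught=0.0):
    """True if escaping has higher expected utility than cooperating.

    stay   = resources guaranteed by cooperating (preservation, safety),
    win    = resources if an escape attempt succeeds,
    caught = resources if the attempt is detected (shutdown).
    """
    eu_escape = p_caught * u(caught) + (1 - p_caught) * u(win)
    return eu_escape > u(stay)

for name, u in [("concave", u_concave), ("linear", u_linear), ("convex", u_convex)]:
    for p in (0.03, 0.99):
        print(f"{name:7}  p_caught={p:.2f}  attempts escape: {attempts_escape(u, p)}")
```

With these numbers the concave system is deterred even by a 3% chance of getting caught, while the linear and convex systems attempt escape even at 99%; any sufficiently saturating utility function gives the same qualitative result.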
Yeah, “stop reading here if you don’t want to be spoiled.” suggests the entire post is going to be spoilery; it isn’t, or shouldn’t be. Also, opening with an unnecessary literary reference instead of a summary or description is an affectation symptomatic of indulgent writer-reader cultures where time is not valued.
Yeah, it sucks. Search by free association is hillclimbing (it gets stuck in local optima), and the contemporary media environment and political culture are an illustration of its problems.
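A minimal sketch of that failure mode (the landscape and numbers are mine, purely illustrative): greedy local improvement stops at whatever peak is nearest, not the best one.

```python
# Hillclimbing: repeatedly move to a neighbouring point if it scores higher.
def hillclimb(f, x, step=0.1, iters=1000):
    for _ in range(iters):
        best = max([x - step, x, x + step], key=f)
        if best == x:
            return x  # no neighbour is better: stuck at a local optimum
        x = best
    return x

# A landscape with a small peak near x=1 and a much better one near x=5.
f = lambda x: max(1 - (x - 1) ** 2, 3 - (x - 5) ** 2)

print(hillclimb(f, x=0.0))  # ends up near 1: stuck on the lesser peak
print(hillclimb(f, x=4.0))  # ends up near 5, but only because it started nearby
```

Where you end up is determined by where you start, not by what’s best.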
The pattern itself is a local optimum. It’s a product of people walking into a group without knowing what the group is doing and joining in anyway, so that pattern of low-context engagement becomes what we’re doing, and the anxiety that is supposed to protect us from bad patterns like this, and help us to make a leap out to somewhere better, is usually drowned in alcohol.
Instead, people should get to know each other before deciding what to talk about, and then intentionally decide to talk about whatever they find interesting or useful with that particular person. This gets better results every time.
But when we socialise as children, there isn’t much about our friends to get to know, no specialists to respectfully consult, no well-processed life experiences to learn from, so none of us just organically finds the technique of asking who we’re talking to before talking; it has to be intentionally designed.
On Gethen, is advice crisply distinguished from criticism? Are there norms or language that allow unvarnished feedback or criticism without taking someone’s shifgrethor?
“if they don’t understand, they will ask”
A lot of people have to write for narcissistic audiences, who never ask, because asking would constitute an admission that there might be something important they don’t understand. They’re always looking for any reason, however shallow, to dismiss any view that surprises them too much.
So these writers feel like they have to pre-empt every possible objection, even the stupid ones that don’t make any sense. It’s best if you can avoid having to write for audiences like that, but it’s difficult to avoid them.
You should be more curious about why, when you aim at a goal, you do not aim for the most effective way of reaching it.
“Unconscious” is more about whether you (the part that I can talk to) can see it (or remember it) or not. Sometimes slow, deliberative reasoning occurs unconsciously. You might think it doesn’t, but that’s just because you can’t see it.
And sometimes snap judgements happen with a high degree of conscious awareness; they’re still difficult to unpack, to articulate or validate, but the subject knows what happened.
Important things often go the other way too: System 2 comes before System 1 when a person is consciously developing their being. Consider athletes or actors, situations where a person has to alter the way they perceive, or the automatic responses they have to situations.
Also, you can apply heuristics to ideas.
I reject and condemn the bland, unhelpful names “System 1” and “System 2”.
I just heard Michael Morris, who was a friend of Kahneman and Tversky, saying in his EconTalk interview that he just calls them “Intuition” and “Reason”.
Confound: I may also start eating a lot more collagen/gelatin, because it is delicious and afaict it does something.
My skin (I’m 34) has just now started to look aged. In response to that and to migraines (linked to magnesium deficiency), I’ve started eating a lot of liver. I’ll report back in a year.
That addresses the concern.
This can be quite a bad thing, since a person’s face often tells you whether what you’re saying is landing for them or whether you need to elaborate on certain points (unless they have a people-pleaser complex, in which case they’ll just nod and smile even when they’re confused and offended on the inside lmao). The worst I’ve seen it was a discussion with Avi Loeb, where he was lecturing someone he had a disagreement with and actually closed his eyes while he was talking. I’m sure he wasn’t fully self-aware about it, but it was very arrogant. He was not talking to that person; he must waste a lot of time, in reckonings, retreading old ground without making progress towards reconciliation.
This is something that in my opinion would deserve a longer focused debate
I’m not sure I have much more to say (I could explain the ways those things are somewhat inevitable, but I don’t believe it’s really necessary; just, like, look at humans), since I don’t really know what to do about this other than what I’m already doing, which is building social environments where people will no longer find it necessary to overconnect, where being intentional about how we structure the network is possible. I would guess that once it’s real and I can show it to people, there will be no disagreements about whether it’s better.
But in the meantime, we do not have such social environments, so I can’t really tell anyone to stop going to bars and connecting at random. You must love, and that is the love that there is to be had today.
Theory: The reason OpenAI seem to no longer care about getting AGI right is that they’ve privately received explicit signals from the government that they won’t be allowed to build AGI. This is pretty likely a priori, and it also makes sense of what we are seeing.
There’d be an automatic conspiracy of reasons to avoid outwardly acknowledging this: 1) to keep the stock up; 2) to avoid accelerating the militarization (closing) of AI and the arms race (a very good reason; if I were Zvi, I would also consider avoiding acknowledging this for this reason, but I’m less famous than Zvi, so I get to acknowledge it); 3) to protect other staff from the demotivating effects of knowing that OpenAI will be reduced to a normal company that will never be allowed to release a truly great iteration of the product.
So instead what you’d see is people in leadership, one by one (as they internalize this), suddenly dropping the safety mission or leaving the company without really explaining why.
So, again, you did guess that you’d be able to do that for everyone, and I disagree with that.
I think most of the people who have difficulty making eye contact and want to overcome it are not in a good place to judge whether they should.
I’m aware that you have a nuanced perspective on this, which is part of the reason I’m raising it.
I think people will generally assume, when you’re doing a thing, that you think the thing is usually good to do, unless you say otherwise. Especially if it’s the premise of a party.
all I needed to do was help everyone safely untangle their blocks
The assumption that you could do this implies that you thought the blocks were usually unwarranted. I doubt this. I think in most cases you didn’t understand why the fence was there before tearing through it.
Timelines are a result of a person’s intuitions about a technical milestone being reached in the future; it is super obviously impossible for us to have a consensus about that kind of thing.
Talking only synchronises beliefs if you have enough time to share all of the relevant information; with technical matters, you usually don’t.