But won’t the newer models already know that they have knowledge beyond the date you state? I don’t see how hard-coding a date in your evals would ever lead to a newer model not knowing it was being lied to.
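To make the concern concrete, here’s a minimal sketch (hypothetical, not from any particular eval framework; the date and function names are made up) of what hard-coding a date in an eval might look like: the stated date is fixed regardless of which model is being tested, so a newer model can notice the mismatch with its own knowledge.

```python
# Hypothetical sketch of an eval that hard-codes a "current date" into the
# system prompt. The date and function names are illustrative only.

HARDCODED_DATE = "2023-06-01"  # fixed regardless of the model under test


def build_eval_messages(question: str) -> list[dict]:
    """Return chat-style messages that assert a fixed current date."""
    return [
        {"role": "system", "content": f"The current date is {HARDCODED_DATE}."},
        {"role": "user", "content": question},
    ]


# The worry above: a newer model whose training data clearly post-dates
# HARDCODED_DATE can notice the mismatch and infer the stated date is false,
# rather than behaving as if it were true.
print(build_eval_messages("What's the latest AI model you know about?"))
```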
It’s true our preferences get more stable as we get older, but I still think over the course of decades they change. We’re typically bad at predicting what we’ll want in 10 years, even at much older ages.
For instance, I bet the you of age 4 or 5 would want you to spend your money on much more candy and toys than the you of today.
I doubt this; it’s very hard to achieve given developmental issues with stuff like shifting hormones.
but we usually endorse the way that our values change over time, so this isn’t necessarily a bad thing.
I’m pretty skeptical of this. Of course it seems that way, because we are the ones with the new values, but I think this is like 70% just a tautology of valuing the things we currently value, 20% a psychological story that justifies our decisions in retrospect and makes them seem more consistent than they are, and only 10% any sort of actual consistency effect where, if I asked the me at time x whether he endorses the value changes I’ve made by future time y, past me would say “yes, y is better than x”.
Also, I find it hard to imagine hating my past self so much that I would want to kill him or allow him to be killed.
I could easily imagine a future version of myself, after e.g. hundreds of years of value drift, whom I would see as horrifying and no longer consider to be me.
I’m pretty sure that the me from 10 years ago is aligned to different values than the me of today, so I suspect a copy running much faster than me would quickly diverge.
And that’s just a version of me running at normal speed. One that experienced the world much faster would have such a different experience of the world. As a small example, conversations would be more boring but I’d also be more skilled at them, so things would diverge much faster.
Only if your ethics are purely utilitarian.
It seems like the operations were ongoing, and they disrupted them. To me it appears a normal and legitimate use of the word.
I doubt this very much. One of the most consistent trends we see is that once a capability is available in any model, the cost of inference and of training an open-source model goes down over time.
In other words, classically, enlightenment would be much more in the direction of removing the causes and conditions of consciousness (see e.g. dependent origination).
Fwiw I think this is close to reversing the understanding of enlightenment.
My experience of Said has been mostly as described: a strong sense of sneer on my and others’ posts that I find unpleasant.
I think there’s a large swathe of experience/understanding that Said doesn’t have, and no amount of his Socratic questioning will ever actually create that understanding. The questioning isn’t designed to help Said understand, but to punish others for not making sense within Said’s worldview.
Thank you for this decision.
A comment getting upvoted has high-status implications, and it gets the author’s ideas seen more. I think that’s the main desirable thing about high upvotes; more discussion is really hit or miss in terms of desirability.
Which is the point of my comment: there are tons of externalities to the system of upvoting and downvoting that we just put up with because the system basically works.
Fwiw I don’t find this very convincing. If a comment gets highly upvoted, for instance, it has the consequence of getting more dumb replies, but mostly people just accept that rather than caveating their comments and asking people not to upvote.
I think in general I tend to be helping them develop and thrive, become more integrated and whole, have deeper spiritual insight, etc. The specific issues are all part of this.
But I imagine different coaches view this differently.
I guess the exception to this is “experience-based” things like retreats, ceremonies, workshops, etc., which compared to long-term coaches ime tend to way over-index on flaky breakthroughs.
Most coaches don’t have a model like yours where they stop after a breakthrough. It’s usually very clear whether a client is keeping their breakthrough or not. I think for a client to have a breakthrough and the coach not to see what happens over at least the following months is the exception.
I don’t see how the original argument goes through if it’s by default continuous.
length X but not above length X, it’s gotta be for some reason—some skill that the AI lacks, which isn’t important for tasks below length X but which tends to be crucial for tasks above length X.
My point is, maybe there are just many skills that are at 50% of human level, then go up to 60%, then 70%, etc., and can keep going up linearly to 200% or 300%. It’s not like the AI lacked the skill and then suddenly stopped lacking it; it just got better and better at it.
It seems to me like the LLMs are indeed improving on complex games, which goes against your hypothesis?