Independent alignment researcher
I have signed no contracts or agreements whose existence I cannot mention.
It does not seem like this writer is aware of the von Neumann–Morgenstern utility theorem. There are criticisms one can level against utility as a concept, but the central question ends up being: which of the theorem’s axioms do you disagree with, and why? For example, Garrabrant’s Geometric Rationality is a great counter if you’re looking for one.
Edit: I notice that all of your previous posts have been of this same format, and they all consistently receive negative karma. You should probably reconsider what you post to this forum.
The weather, or the behavior of any economy larger than village size, for example—systems so chaotically interdependent that exact prediction is effectively impossible (not just in fact but in principle).
Flagging that those two examples seem false. The weather is chaotic, yes, and there’s a sense in which the economy is anti-inductive, but modeling methods are advancing, and will likely find more loopholes in chaos theory.
For example, in thermodynamics, temperature is non-chaotic while the precise kinetic energies and locations of all particles are chaotic. A reasonable candidate analogue in weather is hurricanes.
Similarly, as our understanding of the economy advances, it will get more efficient, which means it will be easier to model. E.g. (note: I’ve only skimmed this paper). And large economies are definitely more predictable than small villages; talk about not having a competitive market!
My model is that early on physics had very impressive & novel math, which attracted people who like math, who did more math largely with the constraint the math had to be trying to model something in the real world, which produced more impressive & novel math, which attracted more people who like math, etc etc, and this is the origin of the equilibrium.
Note a similar argument can be made for economics, though the nice math came much later on, and obviously was much less impactful than literally inventing calculus.
Happy to take your word on these things if the Wikipedia article is unrepresentative!
In contrast, physicists were not committed to discovering the periodic table, fields or quantum wave functions. Many of the great successes of physics are answers to questions no one would think to ask just decades before they were discovered. The hard sciences were formed when frontiers of highly tractable and promising theorizing opened up.
This seems a crazy comparison to make[1]. These seem like methodological constraints. Are there any actual predictions past physics was trying to make which we still can’t make and don’t even care about? None that I can think of.
Since ancient Greece, people have been trying to break things down into their elements, though of course they called it “stoikheion”, which literally means “One of a row”. Now of course, they were wrong; it turns out the stoicheia ought to be arranged in a table, not a row. But in either case the idea was there: “We can break things down into their elemental units and those things will have definite interaction properties we can use to understand all substances”.
Apropos of the comments below this post, many seem to be assuming humans can complete tasks which require arbitrarily many years. This doesn’t seem to be the case to me. People often peak intellectually in their 20s, and sometimes get dementia late in life. Others just lose interest in their previous goals through a mid-life crisis or ADHD.
I don’t think this has much of an impact on the conclusions reached in the comments (which is why I’m not putting this under the post), but this assumption does seem wrong in most cases (and I’d be interested in cases where people think it’s right!)
Capitalist Realism by Mark Fisher (as close to a self-portrait by the modern humanities as it gets)
At least going by the Wikipedia article, this… does not seem so self-conscious to me. E.g.:
Fisher regards capitalist realism as emerging from a purposeful push by the neoliberal right to transform the attitudes of both the general population and the left towards capitalism and specifically the post-Fordist form of capitalism that prevailed throughout the 1980s. The relative inability of the political left to come up with an alternative economic model in response to the rise of neoliberal capitalism and the concurrent Reaganomics era created a vacuum that facilitated the birth of a capitalist realist perspective. The collapse of the Soviet Union, which Fisher believes represented the only real example of a working non-capitalist system, further cemented the place of capitalist realism both politically and in the general population, and was hailed as the decisive final victory of capitalism. According to Fisher, in a post-Soviet era, unchecked capitalism was able to reframe history into a capitalist narrative in which neoliberalism was the result of a natural progression of history and even embodied the culmination of human development.
and
Fisher argues that the bank bailouts following the 2008 economic crisis were a quintessential example of capitalist realism in action, reasoning that the bailouts occurred largely because the idea of allowing the banking system to fail was unimaginable to both politicians and the general population. Due to the intrinsic value of banks to the capitalist system, Fisher proposes that the influence of capitalist realism meant that such a failure was never considered an option. As a consequence, Fisher observes, the neoliberal system survived and capitalist realism was further validated. Fisher classifies the current state of capitalist realism in the neoliberal system in the following terms:
The only powerful agents influencing politicians and managers in education are business interests. It’s become far too easy to ignore workers and, partly because of this, workers feel increasingly helpless and impotent. The concerted attack on unions by neoliberal interest groups, together with the shift from a Fordist to a post-Fordist organisation of the economy – the move towards casualisation, just-in-time production, globalization – has eroded the power base of unions [and thus the labor force].
These are not exactly hard-hitting, novel, or even interesting criticisms. And they’re not even criticisms of the humanities! So how can it be self-conscious?
And we didn’t filter them in any way.
This seems contrary to what that page claims
Here, we present highly misaligned samples (misalignment >= 90) from GPT-4o models finetuned to write insecure code.
And indeed all the samples seem misaligned, which seems unlikely given the misaligned answer rate for other questions in your paper.
In my experience playing a lot with LLMs, “Nova” is a reasonably common name they give themselves if you ask, and sometimes they will spontaneously decide they are sentient, but that is the extent to which my own experiences are consistent with the story. I can imagine though that since the time I was playing with these things a lot (about 6 months ago) much has changed.
As a datapoint, I really liked this post. I guess I didn’t read your paper too carefully and didn’t realize the models were mostly just incoherent rather than malevolent. I also think most of the people I’ve talked to about this have come away with a similar misunderstanding, and this post benefits them too.
Yeah, I don’t know the answer here, but I will also say that one flaw of the Brier score is that it’s not even clear that these sorts of differences will be all that meaningful. What you actually want to know is how much more information one group gives over the other groups, and how much credence we should assign to each of the groups (acting as if they were each hypotheses in a Bayes update) given their predictions on the data we have. And for that, you can just run the Bayes update.
The Brier score was chosen for forecasters, as far as I can tell, because it’s more fun than scoring yourself based on log-odds (equivalent to the Bayes update thing). It’s less sensitive to horribly bad predictions, and it has a bounded “how bad can you be”. It’s also easier to explain and think about, and has a different incentive landscape for those trying to maximize their scores, which may be useful if you’re trying to elicit good predictions.
But if you’re trying to determine who you should listen to (i.e. in what proportion you should update your model given so-and-so says such-and-such) you can’t do better than a Bayes update (given the constraints), so just use that!
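To make the comparison concrete, here is a minimal sketch of the two scoring approaches. The forecaster names and all the numbers are made up for illustration, and it assumes every forecast is strictly between 0 and 1 (a forecast of exactly 0 or 1 on a wrong outcome would make the log-likelihood undefined).

```python
import math

def brier_score(probs, outcomes):
    """Mean squared error between forecasts and 0/1 outcomes (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def log_likelihood(probs, outcomes):
    """Sum of log-probabilities the forecaster assigned to what actually happened."""
    return sum(math.log(p if o == 1 else 1.0 - p) for p, o in zip(probs, outcomes))

def posterior_weights(forecasts, outcomes, prior=None):
    """Treat each forecaster as a hypothesis and apply Bayes' rule to the outcomes."""
    names = list(forecasts)
    if prior is None:
        # Assumption: uniform prior over forecasters.
        prior = {name: 1.0 / len(names) for name in names}
    log_post = {
        name: math.log(prior[name]) + log_likelihood(forecasts[name], outcomes)
        for name in names
    }
    # Subtract the max before exponentiating for numerical stability.
    top = max(log_post.values())
    unnormalized = {name: math.exp(lp - top) for name, lp in log_post.items()}
    total = sum(unnormalized.values())
    return {name: w / total for name, w in unnormalized.items()}

# Hypothetical data: four resolved yes/no questions and two forecasters.
outcomes = [1, 0, 1, 1]
forecasts = {
    "careful": [0.7, 0.3, 0.7, 0.7],
    "overconfident": [0.99, 0.01, 0.99, 0.001],  # one horribly bad prediction
}

weights = posterior_weights(forecasts, outcomes)
for name, probs in forecasts.items():
    print(name, "Brier:", round(brier_score(probs, outcomes), 4),
          "posterior weight:", round(weights[name], 4))
```

In this toy setup the single confident miss costs the overconfident forecaster at most 1 on that question under the Brier score, while under the Bayes update it moves almost all of the posterior weight to the other forecaster.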
The difference between our results and OpenAI’s might be due to OpenAI evaluating with a more powerful internal scaffold, using more test-time compute, or because those results were run on a different subset of FrontierMath (the 180 problems in frontiermath-2024-11-26 vs the 290 problems in frontiermath-2025-02-28-private).
That definitely sounds like OpenAI training on (or perhaps constructing a scaffold around) the part of the benchmark Epoch shared with them.
So part of it is slowly becoming a journal, and the felt social norms around posts are morphing to reflect that.
In some ways the equilibrium here is worse: journals have page limits.
Yes, by default there is always a reading group, we just forgot to post about it this week.
I think the biggest problem with how posts are presented is that it doesn’t make the author embarrassed to make their post needlessly long, and doesn’t signal “we want you to make this shorter”. Shortforms do this, so you get very info-dense posts, but actual posts kinda signal the opposite. If it’s so short, why not just make it a shortform, and if it shouldn’t be a shortform, surely you can add more to it. After all, nobody makes half-page LessWrong posts anymore.
So it’s certainly not a claim that could be verified empirically by looking at any individual humans because there aren’t yet any millenarians or megaannumarians.
If it’s not a conclusion which could be disproven empirically, then I don’t know how you came to it.
(I wrote my quick take quickly and therefore very elliptically, and therefore it would require extra charity / work on the reader’s part (like, more time spent asking “huh? this makes no sense? ok what could he have meant, which would make this statement true?”).)
I mean, I did ask myself about counter-arguments you could have with my objection, and came to basically your response. That is, something approximating “well they just don’t have enough information, and if they had way way more information then they’d love each other again” which I don’t find satisfying.
Namely, I expect people in such situations get stuck in a negative-reinforcement cycle: the fun things the other person used to do lose their novelty as they get repetitive, so the predicted reward of those interactions overshoots the actual reward, which in a TD learning sense is just as good (bad) as a negative reinforcement event. I don’t see why this would be fixed with more knowledge. It seems likely to be exacerbated, as more of what the other person does becomes less novel and more boring and, worse, comes to look like a fundamental implication of their nature as a person rather than an unfortunate accident they can easily change.
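As a rough illustration of the TD point, here is a toy sketch. It assumes a single repeated interaction with no successor state, so the prediction error reduces to actual reward minus predicted reward. The decay rate, learning rate, and starting value are invented numbers, not estimates of anything.

```python
def prediction_errors(rewards, initial_value, learning_rate=0.1):
    """Return delta_t = r_t - V_t for a one-state TD-style update.

    Simplifying assumption: there is no next state, so the usual
    gamma * V(next_state) term of the TD error is dropped.
    """
    value = initial_value
    deltas = []
    for reward in rewards:
        delta = reward - value          # negative when reality undershoots the prediction
        deltas.append(delta)
        value += learning_rate * delta  # the prediction catches up only slowly
    return deltas

# Hypothetical: the same interaction is worth 20% less each time as novelty fades.
rewards = [0.8 ** t for t in range(10)]
deltas = prediction_errors(rewards, initial_value=1.0)

for t, (reward, delta) in enumerate(zip(rewards, deltas)):
    print(t, round(reward, 3), round(delta, 3))
```

With these numbers every interaction after the first produces a negative prediction error, even though every reward is still positive: the prediction lags behind the fading reward.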
I also think intuitions in this area are likely misleading. It is definitely the case now that marginally more understanding of each other would help with coordination problems, since people love making up silly reasons to hate each other. I do also think this is anchoring too much on our current bandwidth limitations, and generalizing too far. Better coordination does not always imply more love.
Given more information about someone, your capacity for having {commune, love, compassion, kindness, cooperation} for/with them increases more than your capacity for {hatred, adversariality} towards them increases.
If this were true, I’d expect much lower divorce rates. After all, who do you have more information about than your wife/husband? And many of these divorces are not amicable, though I wasn’t quickly able to get particular numbers. [EDIT:] Though in either case, this indeed indicates a greatly decreasing level of love over long periods of time and with greater mutual knowledge. See also the decrease in all objective measures of quality of life after divorce for both parties after long marriages.
You probably mention this somewhere, but I’ll ask here: are you currently researching whether these results hold for those other domains? I’m personally more interested in math than law.