Explaining the Shapley value in terms of the “synergies” (and the helpful split in the Venn diagram) makes much more intuitive sense than the more complex normal formula without synergies, which is usually just given without motivation. That being said, it requires first computing the synergies, which seems somewhat confusing for more than three players. The article itself doesn’t mention the formula for the synergy function, but Wikipedia has it.
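For readers who want to play with this: below is a minimal Python sketch (my own illustration, not from the article; the value function and toy numbers are made up) that first computes the synergies from a coalition value function using the formula Wikipedia gives (there called “Harsanyi dividends”), and then obtains the Shapley values by splitting each coalition’s synergy equally among its members. It also makes the “confusing for more than three players” point concrete: the computation runs over all subsets of players, so it blows up exponentially.

```python
from itertools import combinations

def subsets(players):
    """All non-empty subsets of a collection of players, as sorted tuples."""
    players = sorted(players)
    for r in range(1, len(players) + 1):
        yield from combinations(players, r)

def synergies(v, players):
    """Synergy of each coalition S (the 'Harsanyi dividend' on Wikipedia):
    w(S) = sum over T subset of S of (-1)^(|S|-|T|) * v(T), assuming v(empty) = 0."""
    return {S: sum((-1) ** (len(S) - len(T)) * v(frozenset(T)) for T in subsets(S))
            for S in subsets(players)}

def shapley(v, players):
    """Shapley values: each coalition's synergy is split equally among its members."""
    phi = {p: 0.0 for p in sorted(players)}
    for S, w in synergies(v, players).items():
        for p in S:
            phi[p] += w / len(S)
    return phi

# Toy example with two players: together they create one extra unit of value,
# so the synergy of {A, B} is 1, and each player gets 1 + 1/2 = 1.5.
values = {frozenset(): 0, frozenset({"A"}): 1, frozenset({"B"}): 1, frozenset({"A", "B"}): 3}
print(shapley(values.get, {"A", "B"}))  # A and B each get 1.5
```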
Let me politely disagree with this post. Yes, often desires (“wants”) are neither rational nor irrational, but that’s far from always the case. Let’s begin with this:
But the fundamental preferences you have are not about rationality. Inconsistent actions can be irrational if they’re self-defeating, but “inconsistent preferences” only makes sense if you presume you’re a monolithic entity, or believe your “parts” need to all be in full agreement all the time… which I think very badly misunderstands how human brains work.
In the above quote you could simply replace “preferences” with “beliefs”. The form of the argument wouldn’t change, except that you would now be saying (absurdly) that beliefs, like preferences, can’t be irrational. I disagree with both.
One example of irrational desires is akrasia (weakness of will). This phenomenon occurs when you want something (to eat unhealthy food, to procrastinate, etc.) but do not want to want it. In this case the former desire is clearly instrumentally irrational. This is a frequent and often serious problem, and it is aptly labeled “irrational”.
Note that this is perfectly compatible with the brain having different parts. E.g.: the (rather stupid) cerebellum wants to procrastinate, while the (smart) cortex wants to not procrastinate. When the two are in conflict, you should listen to your cortex rather than to your cerebellum. Or something like that. (Freud called the stupid part of the motivation system the “id” and the smart part the “super-ego”.)
Such irrational desires are not reducible to actions. An action can fail to come about for many reasons (perhaps it presupposed false beliefs), but that doesn’t mean the underlying desire wasn’t irrational.
Wants are not beliefs. They are things you feel.
Feelings and desires/”wants” are not the same. It’s the difference between hedonic and preference utilitarianism. Desires are actually more similar to beliefs, as both are necessarily about something (the thing which we believe or desire), whereas feelings can often just be had, without them being about anything. E.g. you can simply feel happy without being happy about something specific. (Philosophers call mental states that are about something “intentional states” or “propositional attitudes”.)
Moreover, sets of desires, just like sets of beliefs, can be irrational (“inconsistent”). For example, if you want x to be true and also want not-x to be true. That’s irrational, just like believing x while also believing not-x. A more complex example from utility theory: If $P$ describes your degrees of belief in various propositions, and $U$ describes your degrees of desire that various propositions are true, and $P(A \land B) = 0$, then $U(A \lor B) = \frac{P(A)U(A) + P(B)U(B)}{P(A) + P(B)}$. In other words, if you believe two propositions to be mutually exclusive, your desire for their disjunction should be the probability-weighted average of your desires for the individual propositions.
More specifically, for a Jeffrey utility function $U$ defined over a Boolean algebra of propositions, and some propositions $A$ and $B$, “the sum is greater than its parts” would be expressed as the condition $U(A \land B) > U(A) + U(B)$ (which is, of course, not a theorem). The respective general theorem only states that $U(A \land B) = U(A) + U(B \mid A)$, which follows from the definition of conditional utility, $U(B \mid A) = U(A \land B) - U(A)$.
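A quick numeric sanity check of the weighted-average law above, under one standard toy construction (mine, not from the post): propositions are sets of “worlds”, and Jeffrey desirability is the probability-weighted average utility of the worlds in which a proposition is true.

```python
import random

# Toy model: propositions are sets of worlds, P_world is a probability
# distribution over worlds, u_world gives the "news value" of each world.
random.seed(0)
worlds = range(6)
weights = [random.random() for _ in worlds]
total = sum(weights)
P_world = [w / total for w in weights]
u_world = [random.uniform(-1, 1) for _ in worlds]

def P(A):
    return sum(P_world[w] for w in A)

def U(A):
    """Jeffrey desirability: expected utility given that A is true."""
    return sum(P_world[w] * u_world[w] for w in A) / P(A)

A, B = {0, 1, 2}, {3, 4}           # disjoint propositions, so P(A and B) = 0
lhs = U(A | B)                     # desirability of the disjunction
rhs = (P(A) * U(A) + P(B) * U(B)) / (P(A) + P(B))
print(abs(lhs - rhs) < 1e-12)      # True: the weighted-average law holds
```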
Yeah, definitional. I think “I should do x” means about the same as “It’s ethical to do x”. In the latter, the indexical “I” has disappeared, indicating that it’s a global statement rather than a local one, objective rather than subjective. But “I care about doing x” is local/subjective because it doesn’t contain words like “should”, “ethical”, or “moral patienthood”.
Ethics is a global concept, not many local ones. That I care more about myself than about people far away from me doesn’t mean that this makes an ethical difference.
This seems to just repeat the repugnant conclusion paradox in more graphic detail. Any paradox is such that one can make highly compelling arguments for either side. That’s why it’s called a paradox. But doing this won’t solve the problem. A quote from Robert Nozick:
Given two such compelling opposing arguments, it will not do to rest content with one’s belief that one knows what to do. Nor will it do to just repeat one of the arguments, loudly and slowly. One must also disarm the opposing argument; explain away its force while showing it due respect.
Tailcalled talked about this two years ago. A model which predicts text does a form of imitation learning. So it is bounded by the text it imitates, and by the intelligence of humans who have written the text. Models which predict future sensory inputs (called “predictive coding” in neuroscience, or “the dark matter of intelligence” by LeCun) don’t have such a limitation, as they predict reality more directly.
This still included other algorithmically determined tweets: tweets that people you followed had liked, and later, more generally, “recommended” tweets. These are no longer present in the “following” tab.
I’m pretty sure there were no tabs at all before the acquisition.
Twitter did use an algorithmic timeline before (e.g. tweets you might be interested in, tweets that people you followed had liked); it was just less algorithmic than the current “for you” tab. The time when it was completely like the current “following” tab was many years ago.
The algorithm has been horrific for a while
After Musk took over, they implemented a mode which doesn’t use an algorithm on the timeline at all. It’s the “following” tab.
In the past we already had examples (“logical AI”, “Bayesian AI”) where galaxy-brained mathematical approaches lost out to less theory-based software engineering.
Cities are very heavily Democratic, while rural areas are only moderately Republican.
I think this isn’t compatible with both getting about equally many votes, because many more US Americans live in cities and urban areas than in rural areas (see the rough arithmetic below):
In 2020, about 82.66 percent of the total population in the United States lived in cities and urban areas.
https://www.statista.com/statistics/269967/urbanization-in-the-united-states/
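To make the tension concrete, here is a back-of-the-envelope calculation (the vote splits are made-up illustrative numbers; only the roughly 83% urban share comes from the statistic above): if 83% of voters were urban and voted 65% Democratic, while 17% were rural and voted 58% Republican, the national totals would be about $0.83 \cdot 0.65 + 0.17 \cdot 0.42 \approx 0.61$ Democratic versus $0.83 \cdot 0.35 + 0.17 \cdot 0.58 \approx 0.39$ Republican, nowhere near an even split.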
It’s not that “they” should be more precise, but that “we” would like to have more precise information.
We know pretty conclusively now from The Information and Bloomberg that, for OpenAI, Google, and Anthropic, new frontier base LLMs have yielded disappointing performance gains. The question is which of your possibilities caused this.
They do mention that the availability of high quality training data (text) is an issue, which suggests it’s probably not your first bullet point.
Ah yes, the fork asymmetry. I think Pearl believes that correlations reduce to causations, which is probably why he wouldn’t try to go the other way and reduce causal structure to a set of (in)dependencies. I’m not sure whether the latter reduction is ultimately possible in the universe. Are the correlations present in the universe, e.g. defined via the Albert/Loewer Mentaculus probability distribution, sufficient to recover the familiar causal structure of the universe?
This approach goes back to Hans Reichenbach’s book The Direction of Time. I think the problem is that the set of independencies alone is not sufficient to determine a causal and temporal order. For example, the same independencies between three variables could be interpreted as the chains $X \to Y \to Z$ and $X \leftarrow Y \leftarrow Z$. I think Pearl talks about this issue in the last chapter.
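A small simulation to illustrate the point (my example, not Reichenbach’s or Pearl’s): linear-Gaussian data generated from the chain $X \to Y \to Z$ and from the reversed chain $Z \to Y \to X$ show the same dependence pattern, in particular $X \perp Z \mid Y$, so the independencies alone can’t tell the two causal orders apart.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def partial_corr_xz_given_y(x, y, z):
    """Correlation of X and Z after linearly regressing out Y from both."""
    rx = x - np.polyval(np.polyfit(y, x, 1), y)
    rz = z - np.polyval(np.polyfit(y, z, 1), y)
    return np.corrcoef(rx, rz)[0, 1]

# Chain X -> Y -> Z
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)
z = 0.8 * y + rng.normal(size=n)
print(np.corrcoef(x, z)[0, 1], partial_corr_xz_given_y(x, y, z))  # ~0.45, ~0

# Reversed chain Z -> Y -> X: same marginal dependence, same independence given Y
z2 = rng.normal(size=n)
y2 = 0.8 * z2 + rng.normal(size=n)
x2 = 0.8 * y2 + rng.normal(size=n)
print(np.corrcoef(x2, z2)[0, 1], partial_corr_xz_given_y(x2, y2, z2))  # ~0.45, ~0
```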
If base model scaling has indeed broken down, I wonder how this manifests. Does the Chinchilla scaling law no longer hold beyond a certain size? Or does it still hold, but a reduction in prediction loss no longer goes along with a proportional increase in benchmark performance? The latter could mean that the quality of the (largely human-generated) training data is the bottleneck.
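For reference (this is from the Chinchilla paper, Hoffmann et al. 2022, not from the discussion here), the scaling law in question has the parametric form

$$L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},$$

where $N$ is the parameter count, $D$ the number of training tokens, $E$ the irreducible loss, and $A$, $B$, $\alpha$, $\beta$ fitted constants (roughly $\alpha \approx 0.34$ and $\beta \approx 0.28$ in their fit). A breakdown of the first kind would show up as systematic deviations from this fit at large $N$ and $D$.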
“Misinterpretation” is somewhat ambiguous. It either means not correctly interpreting the intent of an instruction (and therefore also not acting on that intent), or correctly understanding the intent of the instruction while still acting on a different interpretation. The latter is presumably what the outcome pump was assumed to do. LLMs can apparently both understand the intent of instructions and act on it pretty well. The latter was not at all clear in the past.
Interesting. Question: Why does the prediction confidence start at 0.5? And how is the “actual accuracy” calculated?
Related: In the equation $y = ax + b$, the values of all four variables are unknown, but $x$ and $y$ seem to be more unknown (more variable?) than $a$ and $b$. It’s not clear what exactly the difference is.