Causality is rare! The usual statement that “correlation does not imply causation” puts them, I think, on deceptively equal footing. It’s really more like: correlation is almost never causation, absent something strong like an RCT or a robust study set-up.
Over the past few years I’d gradually become more skeptical of claims of causality, mostly by updating on empirical observations, but it recently struck me that there’s a good first-principles reason for this.
For each true cause of some outcome we care to influence, there are many other “measurables” that correlate to the true cause but, by default, have no impact on our outcome of interest. Many of these measures will (weakly) correlate to the outcome though, via their correlation to the true cause. So there’s a one-to-many relationship between the true cause and the non-causal correlates. Therefore, if all you know is that something correlates with a particular outcome, you should have a strong prior against that correlation being causal.
My thinking previously was along the lines of p-hacking: if there are many things you can test, some of them will cross a given significance threshold by chance alone. But I’m claiming something more specific than that: any true cause is bound to be correlated to a bunch of stuff, which will therefore probably correlate with our outcome of interest (though more weakly, and not guaranteed since correlation is not necessarily transitive).
The obvious idea of requiring a plausible hypothesis for the causation helps somewhat here, since it rules out some of the non-causal correlates. But it may still leave many of them untouched, especially the more creative our hypothesis formation process is! Another heuristic (sensible and obvious, and one that maybe doesn’t even require agreement with the above) is to distrust small-magnitude effects, since the true cause is likely to be more strongly correlated with the outcome of interest than any particular correlate of the true cause is.
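The one-to-many picture is easy to simulate. A minimal sketch (all numbers are made up for illustration): one true cause drives the outcome, while ten “measurables” correlate with the cause but have no effect on the outcome at all. Every measurable still ends up correlated with the outcome, just more weakly than the cause itself.

```python
import random
import statistics

def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

random.seed(0)
n = 10_000
cause = [random.gauss(0, 1) for _ in range(n)]

# The outcome depends ONLY on the true cause (plus noise).
outcome = [c + random.gauss(0, 1) for c in cause]

# Ten non-causal measurables: each correlates with the cause
# but has no direct effect on the outcome.
correlates = [[c + random.gauss(0, 2) for c in cause] for _ in range(10)]

print(round(pearson(cause, outcome), 2))   # strong correlation
for z in correlates:
    print(round(pearson(z, outcome), 2))   # weaker, but clearly nonzero
```

With these noise levels the cause–outcome correlation comes out around 0.7 and each correlate–outcome correlation around 0.3: ten “significant” correlations, only one of which is causal, and the causal one is the strongest.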
Does belief quantization explain (some amount of) polarization?
Suppose people generally do Bayesian updating on beliefs. It seems plausible that most people (unless trained to do otherwise) subconsciously quantize their beliefs—let’s say, for the sake of argument, by rounding to the nearest 1%. In other words, if someone’s posterior on a statement is 75.2%, it will be rounded to 75%.
Consider questions that exhibit group-level polarization (e.g. on climate change, or the morality of abortion, or whatnot) and imagine that there is a series of “facts” that are floating around that someone uninformed doesn’t know about.
If one is exposed to the facts in a randomly chosen order, then one will arrive at some reasonable posterior after all facts have been processed—in fact, we can use this as a computational definition of what it would be rational to conclude.
However, suppose that you are exposed first to the facts that support the in-group position (e.g. while coming of age in your own tribe) and only later to the ones that contradict it (e.g. when you leave the nest). If your in-group is chronologically your first source of intel, this ordering is plausible. In that case, if you update on sufficiently many facts supporting the in-group stance, and you quantize, you’ll end up with a 100% belief in the in-group stance (or, conversely, a 0% belief in the out-group stance). After that point you will be basically unmoved by any contradictory facts you are later exposed to, since quantization has locked you into full and unshakeable conviction.
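The lock-in is easy to demonstrate in a toy simulation. A sketch under assumed numbers: facts are modeled as likelihood ratios (10 facts favoring the in-group stance at 4:1, then 10 opposing at 1:4—chosen so that the evidence exactly cancels), and quantization rounds the belief to the nearest 1% after each update.

```python
def update(p, likelihood_ratio, quantize=True):
    """One Bayesian update in odds form, optionally quantized to the nearest 1%."""
    if p in (0.0, 1.0):
        return p  # absorbing states: no finite evidence can move you
    odds = p / (1 - p) * likelihood_ratio
    p = odds / (1 + odds)
    if quantize:
        p = round(p, 2)  # round the belief to the nearest 1%
    return p

# Supportive facts first, contradictory facts later.
facts = [4.0] * 10 + [0.25] * 10

p = 0.5
for lr in facts:
    p = update(p, lr)
print(p)  # 1.0 — quantization locks in full conviction partway through

p = 0.5
for lr in facts:
    p = update(p, lr, quantize=False)
print(round(p, 3))  # 0.5 — exact updating lets the evidence cancel out
```

The quantized believer hits 100% after only a handful of supportive facts (99.5% rounds up to 100%), after which the contradictory facts do nothing; the exact updater ends back at 50%, the order-independent rational answer.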
One way to resist this is to refuse to ever be fully convinced of anything. However, this comes at a cost, since it’s cognitively expensive to hold onto very small numbers, and to intuitively update them well.