Bunthut

Karma: 359

Bunthut 25 Jun 2025 9:35 UTC
1 point
0
on: Futarchy’s fundamental flaw
Suppose b is the true bias of the coin (which the supercomputer will compute). Then your expected return in this game is
𝔼[max(b, 0.50)] = 0.50 + 𝔼[max(b-0.50, 0)]
No. That formula would imply that, if the coin is 30% for sure and you buy it for 0.3, you make 0.2 in expectation. You don’t: you make 0 regardless of what price you buy at.
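To make the arithmetic explicit, here is a minimal sketch that evaluates the quoted formula for a coin whose bias is known to be 0.30. The function name and the purchase price are assumptions for illustration, and the code only shows what the formula implies, not how the market actually pays out.

```python
def formula_value(b: float, threshold: float = 0.50) -> float:
    """Right-hand side of the quoted formula when the bias b is known for sure."""
    return threshold + max(b - threshold, 0.0)


b = 0.30      # the coin is 30% for sure
price = 0.30  # hypothetical purchase price, equal to the true bias

# Profit the formula would imply for a buyer at this price.
implied_profit = formula_value(b) - price
print(round(implied_profit, 2))
```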
Note that this kind of problem has also shown up in decision theory more generally. This is a good place to start. In particular, it seems like your problem can be fixed with epsilon exploration (if it doesn’t do so automatically, as per Soares), both the EDT and CDT variant should work.

Bunthut 20 Jun 2025 20:50 UTC
1 point
0
in reply to: XelaP’s comment on: Intelligence Is Not Magic, But Your Threshold For “Magic” Is Pretty Low
A simple version of this is done for panoramic photos. If he looked at the city from a consistent direction throughout the flight, that’s all that’s needed. If the direction varied, it can’t have varied a lot: he had to at least see the sides of the building he was drawing, if maybe from a different angle, and not all the buildings would have been parallel. That kind of rotation seems doable with current image transformers (and that’s only necessary if the drawing has accurate angles even over long distances).
In any case, I don’t think it matters to my argument if current ML can do it. All the parts that might be difficult for the computer are doable even for normal humans, and therefore not magical. The only thing that’s added to the normal human skill here is perfect memory, which we know is easy for computers and have known for a long time.

Bunthut 20 Jun 2025 20:34 UTC
1 point
0
in reply to: Steven Byrnes’s comment on: Are superhuman savants real?
To clarify the question: I agree that there is variation in talent and that some very talented people can do things most could never. My question is, if you look at the distribution of talent among normal people, and then check how many standard deviations out our savant candidate is, then what’s the chance at least one person with that talent would exist? Basically, is this just the normal right tail that’s expected from additive genetic reshuffling, or an “X-man”?
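The calculation being asked for can be sketched as follows. This is only an illustration under strong assumptions that the comment does not make: talent is treated as exactly normally distributed, people are independent draws, and the population figure is a rough placeholder.

```python
import math


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def prob_at_least_one(z: float, population: float) -> float:
    """P(at least one of `population` independent draws lies more than z SDs out)."""
    p = upper_tail(z)
    # 1 - (1 - p)**N, computed stably for tiny p and huge N.
    return -math.expm1(population * math.log1p(-p))


POPULATION = 8e9  # assumed rough world population

for z in (5.0, 6.0, 7.0, 8.0):
    print(z, prob_at_least_one(z, POPULATION))
```

If a candidate's estimated z-score gives a probability near zero here, the "normal right tail" explanation becomes hard to sustain under these assumptions. Real talent distributions may have fatter tails than the normal.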

[Question] Are superhuman savants real?

Bunthut 16 Jun 2025 22:02 UTC

13 points

4 comments, 1 min read

Bunthut 16 Jun 2025 17:33 UTC
1 point
−1
on: Intelligence Is Not Magic, But Your Threshold For “Magic” Is Pretty Low
Example 3: Stephen Wiltshire. He made a nineteen-foot-long drawing of New York City after flying on a helicopter for 20 minutes, and he got the number of windows and floors of all the buildings correct.
I think ~everyone understands that computers can do this. The “magical” part is doing it with a human brain, not doing it at all. Similarly, blindfolded chess is not more difficult than normal chess for computers. That may take a little knowledge to see. And “doing it faster” is again clear. So the threshold for magic you describe is not the one that even the most naive people use for AI.

Bunthut 28 Apr 2025 14:47 UTC
1 point
0
on: Why Have Sentence Lengths Decreased?
Sentence lengths have declined.
Data: I looked for similar data on sentence lengths in German, and the first result I found covering a similar timeframe was Wikipedia referencing Kurt Möslein: Einige Entwicklungstendenzen in der Syntax der wissenschaftlich-technischen Literatur seit dem Ende des 18. Jahrhunderts. (1974), which does not find the same trend:
Year    Words per sentence (wps)
1770    24.50
1800    25.54
1850    32.00
1900    23.58
1920    22.72
1940    19.60
1960    19.90
This data on scientific writing starts lower than any of your English examples from that time, and increases initially, but arrives in the same place (insofar as wps are comparable across languages, which I think is fine for English and German).
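The shape of that series (a rise to a mid-19th-century peak, then a decline) can be checked directly from the quoted figures. A minimal sketch, using only the numbers listed above; the variable names are mine:

```python
# Words per sentence (wps) by year, as quoted from Möslein (1974).
wps_by_year = {
    1770: 24.50, 1800: 25.54, 1850: 32.00, 1900: 23.58,
    1920: 22.72, 1940: 19.60, 1960: 19.90,
}

years = sorted(wps_by_year)
peak_year = max(wps_by_year, key=wps_by_year.get)
net_change = wps_by_year[years[-1]] - wps_by_year[years[0]]

print("peak year:", peak_year)
print("net change, first to last year:", round(net_change, 2))

# Change between consecutive data points.
for earlier, later in zip(years, years[1:]):
    print(earlier, "->", later, round(wps_by_year[later] - wps_by_year[earlier], 2))
```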

Bunthut 1 Apr 2025 15:12 UTC
1 point
0
in reply to: habryka’s comment on: LessWrong has been acquired by EA
6 picolightcones as well, don’t think that changed.

Bunthut 1 Apr 2025 14:54 UTC
1 point
0
on: LessWrong has been acquired by EA
Before logging in I had 200 LW-Bux, and 3 virtues. Now I have 50 LW and 8 virtues, and I didn’t do anything. What’s that? Is there any explanation of how this stuff works?

Bunthut 1 Apr 2025 14:45 UTC
3 points
0
in reply to: Kaj_Sotala’s comment on: Genetic fitness is a measure of selection strength, not the selection target
I think your disagreement can be made clear with more formalism. First, the point for your opponents:
When the animals are in a cold place, they are selected for a long fur coat, and also for IGF, (and other things as well). To some extent, these are just different ways of describing the same process. Now, if they move to a warmer place, they are now selected for a shorter fur instead, and they are still selected for IGF. And there’s also a more concrete correspondence to this: they have also been selected for “IF cold long fur, ELSE short fur” the entire time. Notice especially that there are animals actually implementing this dependent property—it can be evolved just fine, in the same way as the simple properties. And in fact, you could “unroll” the concept of IGF into a humongous environment-dependent strategy, which would then always be selected for, because all the environment-dependence is already baked in.
Now on the other hand, if you train an AI first on one thing, and then on another, wouldn’t we expect it to get worse at the first again? Indeed, we would also expect a species living in the cold for very long to lose those adaptations relevant to the heat. The reason for this in both cases is, broadly speaking, limits and penalties to complexity. I’m not sure very many people would have bought the argument in the previous paragraph; we all know unused genetic code decays over time. But the behavioral/cognitive version, with intentionally maximizing IGF, makes it easy to ignore the problems, because we’re not used to remembering the physical correlates of thinking. Of course, a dragonfly couldn’t explicitly maximize IGF, because its brain is too small to even understand what that is, and developing that brain has demands for space and energy incompatible with the general dragonfly life strategy. The costs of cognition are also part of the demands of fitness, and the dragonfly is more fit the way it is. Similarly, I think a human explicitly maximizing IGF would have done worse for most of our evolution^[1], because the odds you get something wrong are just too high with current expenditure on cognition; better to hardcode some right answers.
I don’t share your optimistic conclusion however, because the part about selecting for multiple things simultaneously is true. You are always selecting for everything that’s locally extensionally equivalent to the intended selection criteria. There is not a move you could have done in evolution’s place to actually select for IGF instead of [various particular things]; this already is what happens when you select for IGF, because it’s the complexity, rather than different intent, that led to the different result^[2]. Similarly, reinforcement learning for human values will result in whatever is the simplest^[3] way to match human values on the training data.
1. ^
  and even today, still might if sperm donations et al weren’t possible
2. ^
  I don’t think you’ve tried to come up with what that different move might look like for evolution, but it’s strongly implied they exist for both it and the AI situation.
3. ^
  in the sense of that architecture

Bunthut 13 Mar 2025 8:53 UTC
13 points
−2
in reply to: the gears to ascension’s comment on: Trojan Sky
for AIs, more robust adversarial examples—especially ones that work on AIs trained on different datasets—do seem to look more “reasonable” to humans.
Then I would expect they are also more objectively similar. In any case that finding is strong evidence against manipulative adversarial examples for humans—your argument is basically “there’s just this huge mess of neurons, surely somewhere in there is a way”, but if the same adversarial examples work on minds with very different architectures, then that’s clearly not why they exist. Instead, they have to be explained by some higher-level cognitive factors shared by ~anyone who gets good at interpreting a wide range of visual data.
The really obvious adversarial example of this kind in human is like, cults, or so
Cults use much stronger means than is implied by adversarial examples. For one, they can react to and reinforce your behaviour: is a screen with text promising you things for doing what it wants, with escalating impact and building a track record, an adversarial example? No. It’s potentially worrying, but not really distinct from generic powerseeking problems. The cult also controls a much larger fraction of your total sensory input over an extended time. Cult members spreading the cult also use tactics that require very little precision; there isn’t information transmitted to them on how to do this, beyond simple verbal instructions. Even if there are more precision-needing ways of manipulating individuals, it’s another thing entirely to manipulate them into repeating those high-precision strategies that they couldn’t themselves execute correctly on purpose.
if you’re not personally familiar with hypnosis
I think I am a little bit. I don’t think that means what you think it does. Listening-to-action still requires comprehension of the commands, which is much lower bandwidth than vision, and it’s a structure that’s specifically there to be controllable by others, so it’s not an indication that we are controllable by others in other bizarre ways. And you are deliberately not being so critical: you haven’t, actually, been circumvented, and there isn’t really a path to escalating power, just the fact that you’re willing to obey someone in a specific context. Hypnosis also ends on its own: the brain naturally tends back towards baseline, and implanting a mechanism that keeps itself active indefinitely is high-precision.

Bunthut 13 Mar 2025 0:25 UTC
3 points
−5
in reply to: the gears to ascension’s comment on: Trojan Sky
Ok, that’s mostly what I’ve heard before. I’m skeptical because:
1. If something like classical adversarial examples existed for humans, it likely wouldn’t have the same effects on different people, or even just viewed from different angles, or maybe even in a different mood.
2. No adversarial examples of the kind you describe are known for humans. We could tell if we had found them because we have metrics of “looking similar” which are not based on our intuitive sense of similarity, like pixelwise differences and convolutions. All examples of “easily confused” images I’ve seen were objectively similar to what they’re confused for.
3. Somewhat similar to what Grayson Chao said, it seems that the influence of vision on behaviour goes through a layer of “it looks like X”, which is much lower bandwidth than vision in total. Ads have qualitatively similar effects to what seeing their content actually happen in person would.
4. If adversarial examples exist, that doesn’t mean they exist for making you do anything of the manipulator’s choosing. Humans are, in principle, at least as programmable as a computer, but that also means there are vastly more courses of action than possible vision inputs. In practice, probably not a lot of high-cognitive-function-processing could be commandeered by adversarial inputs, and behaviours complex enough to glitch others couldn’t be implemented.
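As an illustration of the kind of objective similarity metric mentioned in point 2, here is a minimal sketch of a pixelwise comparison between two grayscale images stored as nested lists. The function name and the toy images are hypothetical. Real adversarial-example work typically bounds a norm of the perturbation in much the same spirit.

```python
def pixelwise_distances(img_a, img_b):
    """Return (mean absolute difference, max absolute difference) for two
    equal-sized grayscale images given as lists of rows of pixel values."""
    same_shape = len(img_a) == len(img_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(img_a, img_b)
    )
    if not same_shape:
        raise ValueError("images must have the same shape")
    diffs = [
        abs(a - b)
        for row_a, row_b in zip(img_a, img_b)
        for a, b in zip(row_a, row_b)
    ]
    if not diffs:
        raise ValueError("images must contain at least one pixel")
    return sum(diffs) / len(diffs), max(diffs)


# Hypothetical 2x2 images that differ in a single pixel.
original = [[10, 10], [10, 10]]
perturbed = [[10, 14], [10, 10]]

mean_diff, max_diff = pixelwise_distances(original, perturbed)
print(mean_diff, max_diff)
```

Two images count as "objectively similar" under such a metric when both numbers are small, independently of whether a human viewer would judge them similar.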

Bunthut 12 Mar 2025 10:53 UTC
2 points
0
in reply to: the gears to ascension’s comment on: Trojan Sky
I just thought through the causal graphs involved, there’s probably enough bandwidth through vision into reliably redundant behavior to do this
Elaborate.

Bunthut 10 Mar 2025 15:56 UTC
2 points
0
on: A computational no-coincidence principle
This isn’t my area of expertise, but I think I have a sketch for a very simple weak proof:
The conjecture states that V runtime and $π$ length are polynomial in C size, but leaves the constant open. Therefore a counterexample would have to be an infinite family of circuits satisfying P(C), with their corresponding $π$ growing faster than polynomial. To prove the existence of such a counterexample, you would need a proof that each member of the family satisfies P(C). But that proof has finite length, and can be used as the $π$ for any member of the family with minor modification. Therefore there can never be a proven counterexample.
Or am I misunderstanding something?

Bunthut 20 Dec 2024 15:13 UTC
9 points
5
in reply to: lincolnquirk’s comment on: When Is Insurance Worth It?
I think the solution to this is to add something to your wealth to account for inalienable human capital, and count costs only by how much you will actually be forced to pay. This is a good idea in general; else most people with student loans or a mortgage are “in the red”, and couldn’t use this at all.
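A minimal sketch of that adjustment, assuming the decision rule is a comparison of expected log wealth with and without insurance. The function name and every number below are hypothetical placeholders, and the log-wealth rule itself is my assumption about the kind of calculation being discussed.

```python
import math


def insurance_value(wealth: float, premium: float, loss: float, p_loss: float) -> float:
    """Expected log-wealth gain from insuring; positive favours buying insurance."""
    if wealth - premium <= 0 or wealth - loss <= 0:
        raise ValueError("log wealth is undefined unless wealth exceeds premium and loss")
    insured = math.log(wealth - premium)
    uninsured = p_loss * math.log(wealth - loss) + (1 - p_loss) * math.log(wealth)
    return insured - uninsured


# All figures are hypothetical.
financial_wealth = -20_000     # student loans exceed savings: "in the red"
human_capital = 300_000        # assumed value of inalienable future earnings
nominal_loss = 500_000         # headline size of the liability
forced_payment_cap = 100_000   # assumed maximum you could actually be made to pay

effective_wealth = financial_wealth + human_capital
payable_loss = min(nominal_loss, forced_payment_cap)

print(insurance_value(effective_wealth, premium=1_000, loss=payable_loss, p_loss=0.01))
```

With the raw `financial_wealth` the function raises an error, which is the "couldn't use this at all" problem. The adjusted inputs make the comparison well defined.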

Bunthut 17 Oct 2024 7:54 UTC
1 point
0
in reply to: Said Achmiz’s comment on: Momentum of Light in Glass
What are real numbers then? On the standard account, real numbers are equivalence classes of sequences of rationals, the finite diagonals being one such sequence. I mean, “Real numbers don’t exist” is one way to avoid the diagonal argument, but I don’t think that’s what cubefox is going for.

Bunthut 29 Sep 2024 22:01 UTC
1 point
0
in reply to: Mikhail Samin’s comment on: How to Give in to Threats (without incentivizing them)
The society’s stance towards crime- preventing it via the threat of punishment- is not what would work on smarter people
This is one of two claims here that I’m not convinced by. Informal disproof: If you are a smart individual in today’s society, you shouldn’t ignore threats of punishment, because it is in the state’s interest to follow through anyway, pour encourager les autres. If crime prevention is in people’s interest, intelligence monotonicity implies that a smart population should be able to make punishment work at least this well. Now I don’t trust intelligence monotonicity, but I don’t trust its negation either.
The second one is:
You can already foresee the part where you’re going to be asked to play this game for longer, until fewer offers get rejected, as people learn to converge on a shared idea of what is fair.
Should you update your idea of fairness if you get rejected often? It’s not clear to me that that doesn’t make you exploitable again. And I think this is very important to your claim about not burning utility: In the case of the ultimatum game, Eliezer’s strategy burns very little over a reasonable-seeming range of fairness ideals, but in the complex, high-dimensional action spaces of the real world, it could easily be almost as bad as never giving in, if there’s no updating.
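To see how much gets burned in the simple ultimatum case, here is a minimal sketch. It assumes the acceptance rule is roughly: accept any offer at or above your fair share, and accept a lower offer with just enough probability that the proposer expects no more than they would get from a fair offer. The small extra penalty sometimes added is omitted, and the pie size and fairness ideals are hypothetical.

```python
def accept_probability(offer: float, fair_share: float, pie: float = 10.0) -> float:
    """Probability that the responder accepts `offer` (the amount offered to them)."""
    if offer >= fair_share:
        return 1.0
    # The proposer keeps pie - offer if accepted; cap their expected take
    # at pie - fair_share, what a fair offer would have given them.
    return (pie - fair_share) / (pie - offer)


def expected_burn(offer: float, fair_share: float, pie: float = 10.0) -> float:
    """Expected amount destroyed through rejection."""
    return (1.0 - accept_probability(offer, fair_share, pie)) * pie


PIE = 10.0
proposer_offer = 5.0  # hypothetical: the proposer considers an even split fair

# Vary the responder's own idea of their fair share.
for responder_ideal in (5.0, 5.5, 6.0, 7.0):
    print(responder_ideal, round(expected_burn(proposer_offer, responder_ideal, PIE), 2))
```

The loss here grows with the gap between the two parties' fairness ideals along a single dimension. The worry above is that with many dimensions and no updating, such gaps could be the norm rather than the exception.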

Bunthut 19 Sep 2024 22:53 UTC
LW: 11 AF: 8
3
AF
on: Superrational Agents Kelly Bet Influence!
Maybe I’m missing something, but it seems to me that all of this is straightforwardly justified through simple selfish pareto-improvements.
Take a look at Critch’s cake-splitting example in section 3.5. Now imagine varying the utility of splitting. How high does it need to get, before [red->Alice;green->Bob] is no longer a pareto improvement over [(split)] from both players’ selfish perspectives before the observation? It’s 27, and that’s also exactly where the decision flips when weighing Alice 0.9 and Bob 0.1 in red, and Alice 0.1 and Bob 0.9 in green.
Intuitively, I would say that the reason you don’t bet influence all-or-nothing, or with some other strategy, is precisely because influence is not money. Influence can already be all-or-nothing all by itself, if one player never cares that much more than the other. The influence the “losing” bettor retains in the world where he lost is not some kind of direct benefit to him, the way money would be: it functions instead as a reminder of how bad a treatment he was willing to risk in the unlikely world, and that is of course proportional to how unlikely he thought it is.
So I think all this complicated strategizing you envision in influence betting actually just comes out exactly to Critch’s results. It’s true that there are many situations where this leads to influence bets that don’t matter to the outcome, but they also don’t hurt. The theorem only says that actions must be describable as following a certain policy; it doesn’t exclude that they can be described by other policies as well.

Bunthut 10 Sep 2024 23:09 UTC
1 point
0
on: Physical Therapy Sucks (but have you tried hiding it in some peanut butter?)
The timescale for improvement is dreadfully long and the day-to-day changes are imperceptible.
This sounded wrong, but I guess is technically true? I had great in-session improvements as I’m warming up the area and getting into it, and the difference between a session where I missed the previous day, and one where I didn’t, is absolutely perceptible. Now after that initial boost, it’s true that I couldn’t tell if the “high point” was improving day to day, but that was never a concern, since the above was enough to give me confidence. Plus with your external rotations, was there not perceptible strength improvement week to week?

Bunthut 10 Sep 2024 22:50 UTC
1 point
0
in reply to: abramdemski’s comment on: FixDT
So I’ve reread your section on this, and I think I follow that, but it’s arguing a different claim. In the post, you argue that a trader that correctly identifies a fixed point, but doesn’t have enough weight to get it played, might not profit from this knowledge. That I agree with.
But now you’re saying that even if you do play the new fixed point, that trader still won’t gain?
I’m not really calling this a proof because it’s so basic that something else must have gone wrong, but:
$b_{\text{true}}$ has a fixed point at $p$, and $b_{\text{false}}$ doesn’t. Then $b_{\text{false}}(p) = p' \neq p$. So if you decide to play $p$, then $b_{\text{false}}$ predicts $p'$, which is wrong, and gets punished. By continuity, this is also true in some neighborhood around $p$. So if you’ve explored your way close enough, you win.
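The continuity step can be spelled out as follows. This is a sketch that assumes $b_{\text{false}}$ is continuous at $p$; the auxiliary function $g$ and the radius $\delta$ are names introduced here for illustration.

```latex
% Define the prediction gap of the false hypothesis.
g(x) := b_{\text{false}}(x) - x, \qquad g(p) = p' - p \neq 0 .

% g is continuous at p, so the gap cannot vanish abruptly nearby:
\exists\, \delta > 0 \;\; \forall x : \quad
|x - p| < \delta \;\Longrightarrow\; |g(x)| > \tfrac{1}{2}\,|p' - p| > 0 .
```

So for every played point within $\delta$ of $p$, the prediction of $b_{\text{false}}$ misses by a margin bounded away from zero.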

Bunthut 9 Sep 2024 20:49 UTC
LW: 3 AF: 3
0
AF
in reply to: abramdemski’s comment on: FixDT
On reflection, I didn’t quite understand this exploration business, but I think I can save a lot of it.
>You can do exploration, but the problem is that (unless you explore into non-fixed-point regions, violating epistemic constraints) your exploration can never confirm the existence of a fixed point which you didn’t previously believe in.
I think the key here is in the word “confirm”. It’s true that unless you believe p is a fixed point, you can’t just try out p and see the result. However, you can change your beliefs about p based on your results from exploring things other than p. (This is why I call the thing I’m objecting to Humean trolling.) And there is good reason to think that the available fixed points are usually pretty dense in the space. For example, outside of the rule that binarizes our actions, there should usually be at least one fixed point for every possible action. Plus, as you explore, your beliefs change, creating new believed-fixed-points for you to explore.
>I think your idea for how to find repulsive fixed-points could work if there’s a trader who can guess the location of the repulsive point exactly rather than approximately
I don’t think that’s needed. If my net beliefs have a closed surface in probability space on which they push outward, then necessarily those beliefs have a repulsive fixed point somewhere in that surface. I can then explore that believed fixed point. Then if it’s not a true fixed point, and I still believe in the closed surface, there’s a new fixed point in that surface that I can again explore (generally more in the direction I just got pushed away from). This should converge on a true fixed point. The only thing that can go wrong is that I stop believing in the closed surface, and it seems like I should leave open that possibility. Even then, I might believe in it again after I do some checking along the outside.
>However, the wealth of that trader will act like a martingale; there’s no reliable profit to be made (even on average) by enforcing this fixed point.
This I don’t understand at all. If you’re in a certain fixed point, shouldn’t the traders that believe in it profit from the ones that don’t?
