See: self-reference paradoxes.
I do have to note that “secure her slice of the lightcone” seems like laughable nonsense to me, unless it’s a fancy way of saying ‘live comfortably until everyone dies.’ A rationally selfish entity would be trying to delay AGI until it had some hope of understanding what it was doing. What I see happening instead is the behavior of contemptibly stupid and short-sighted primates.
This sounds like a question which can be addressed after we figure out how to avoid extinction.
I do note that you were the one who brought in “biological humans,” as if that meant the same as “ourselves” in the grandparent. That could already be a serious disagreement, in some other world where it mattered.
I don’t see how any of it can be right. Getting one algorithm to output Spongebob wouldn’t cause the SI to watch Spongebob; even a less silly claim in that vein would still be false. The Platonic agent would know the plan wouldn’t work, and thus wouldn’t do it.
Since no individual Platonic agent could do anything meaningful alone, and they plainly can’t communicate with each other, they can only coordinate by means of reflective decision theory. That’s fine; we’ll just assume that’s the obvious way for intelligent minds to behave. But then the SI works the same way, knows the Platonic agents will think that way, and per RDT refuses to change its behavior based on attempts to game the system. So none of this ever happens in the first place.
(This is without even considering the serious problems with assuming Platonic agents would share a goal to coordinate on. I don’t think I buy it. You can’t evolve a desire to come into existence, nor does an arbitrary goal seem to require it. Let me assure you, there can exist intelligent minds which don’t want worlds like ours to exist.)
https://arxiv.org/abs/1712.05812
It’s directly about inverse reinforcement learning, but that should be strictly stronger than RLHF. It seems incumbent on those who disagree to explain why throwing away information here would supply enough of a normative assumption (contrary to every story about wishes).
>this always helps in the short term,
You seem to have ‘proven’ that evolution would use that exact method if it could, since evolution never looks forward and must always build on prior adaptations which provided immediate gain. By the same token, evolution doesn’t have any knowledge; but if “knowledge” corresponds to some simple change it could make, then of course that change will happen.
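To make the “never looks forward” point concrete, here is a toy sketch of evolution-as-greedy-search (illustrative only; the setup and function names are mine, not anything from the thread): a change is kept only when it provides an immediate gain, with no lookahead to adaptations that would pay off only after several intermediate, individually unhelpful steps.

```python
import random

def greedy_search(genome, fitness, mutate, n_trials=1000):
    """Keep a change only if it yields an immediate fitness gain; never plan ahead."""
    for _ in range(n_trials):
        candidate = mutate(genome)
        if fitness(candidate) > fitness(genome):  # immediate gain, or nothing
            genome = candidate
    return genome

def flip_one_bit(genome):
    """Mutation operator for the example below: flip a single random bit."""
    i = random.randrange(len(genome))
    return genome[:i] + [1 - genome[i]] + genome[i + 1:]

# Example: a bit string climbs toward all-ones, one immediately useful flip at a time.
print(sum(greedy_search([0] * 20, fitness=sum, mutate=flip_one_bit)))
```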
Well, that’s disturbing in a different way. How often do they lose a significant fraction of their savings, though? How many are unvaccinated, which isn’t the same as loudly complaining about the shot’s supposed risks? The apparent lack of Flat Earthers could point to them actually expecting reality to conform to their words, and to their having some limit on the silliness of the claims they’ll believe. But if they aren’t losing real money, that could point to it being a game (or a cost of belonging).
The answer might be unhelpful due to selection bias, but I’m curious to learn your view of QAnon. Would you say it works like a fandom for people who think they aren’t allowed to read or watch fiction? I get the strong sense that half the appeal—aside from the fun of bearing false witness—is getting to invent your own version of how the conspiracy works. (In particular, the pseudoscientific FNAF-esque idea at the heart of it isn’t meant to be believed, but to inspire exegesis like that on the Kessel Run.) This would be called fanfic or “fanwank” if they admitted it was based on a fictional setting. Is there something vital you think I’m missing?
There have, in fact, been numerous objections to genetically engineered plants, and by implication to everything in the second category. You might not realize how wary the public was (and is) of engineered biology, on the grounds that nobody understood how it worked in terms of exact internal details. The reply that more or less convinced people—though it clearly didn’t calm every fear about new biotech—wasn’t that we understood it in that sense. It was that humanity had been genetically engineering plants via cultivation for literal millennia, so empirical facts allowed us to rule out many potential dangers.
>Note that it requires the assumption that consciousness is material
Plainly not, assuming this is the same David J. Chalmers.
This would make more sense if LLMs were directly selected for predicting preferences, which they aren’t. (RLHF tries to bridge the gap, but this apparently breaks GPT’s ability to play chess—though I’ll grant the surprise here is that it works at all.) LLMs are primarily selected to predict human text or speech. Now, I’m happy to assume that if we gave humans a D&D-style boost to all mental abilities, each of us would create a coherent set of preferences from our inconsistent desires, which vary and may conflict at a given time even within an individual. Such augmented humans could choose to express their true preferences, though they still might not. If we gave that idealized solution to LLMs, it would just boost their ability to predict what humans or augmented humans would say. The augmented-LLM wouldn’t automatically care about the augmented-human’s true values.
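To make the “selected to predict text, not preferences” distinction concrete, here is a minimal sketch (assuming a PyTorch-style setup; the function names are mine): the base objective only ever scores next-token prediction on human text, and human preferences enter only later, through a separate RLHF-style reward signal.

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """Base LM objective: cross-entropy for predicting token t+1 from tokens <= t.
    Nothing here references what humans *prefer*, only what humans *wrote*."""
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),  # predictions at positions 0..T-2
        tokens[:, 1:].reshape(-1),                    # targets: the actual next tokens
    )

def reward_model_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry-style loss commonly used for RLHF reward models: preferences
    enter only as 'which of two sampled texts a rater liked better', i.e. the
    gap-bridging step mentioned above."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```

Even a much better model trained on the first objective is being scored on matching human output, not on identifying, let alone caring about, the preferences behind it.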
While we can loosely imagine asking LLMs to give the commands that an augmented version of us would give, that seems to require actually knowing how to specify such a D&D ability-boost for humans—which will only resemble the same boost for AI at an abstract mathematical level, if at all. It seems to take us back to the CEV problem of explaining how extrapolation works. Without being able to do that, we’d just be hoping a better LLM would look at our inconsistent use of words like “smarter,” and pick the out-of-distribution meaning we want, for cases which have mostly never existed. This is a lot like what “Complexity of Wishes” was trying to get at, as well as the longstanding arguments against CEV. Vaniver’s comment seems to point in this same direction.
Now, I do think recent results are some evidence that alignment would be easier for a Manhattan Project to solve. It doesn’t follow that we’re on track to solve it.
The classification heading “philosophy,” never mind the idea of meta-philosophy, wouldn’t exist if Aristotle hadn’t tutored Alexander the Great. It’s an arbitrary concept which implicitly assumes we should follow the aristocratic-Greek method of sitting around talking (or perhaps giving speeches to the Assembly in Athens). Moreover, people smarter than either of us have tried this dead-end method for a long time with little progress. Decision theory makes for a better framework than Kant’s ideas; you’ve made progress not because you’re smarter than Kant, but because he was banging his head against a brick wall. So to answer your question: if you’ve given us any reason to think the approach of looking for “meta-philosophy” is promising, or that it’s anything but a proven dead-end, I don’t recall it.
Oddly enough, not all historians are total bigots, and my impression is that the anti-Archipelago version of the argument existed in academic scholarship—perhaps not in the public discourse—long before JD. For example, McNeill published a book about fragmentation in 1982, whereas GG&S came out in 1997.
Perhaps you could see my point better in the context of Marxist economics? Do you know what I mean when I say that the labor theory of value doesn’t make any new predictions, relative to the theory of supply and demand? We seldom have any reason to adopt a theory if it fails to explain anything new and its predictive power seems, if anything, inferior to that of a rival theory. That’s why the actual historians here are focusing on details which you consider “not central”—because, to the actual scholars, Diamond is in fact cherry-picking topics which can’t provide any good reason to adopt his thesis. His focus is kind of the problem.
>The first chapter that’s most commonly criticized is the epilogue—where Diamond puts forth a potential argument for why Europe, and not China, was the major colonial power. This argument is not central to the thesis of the book in any way,
It is, though, because that’s a much harder question to answer. Historians think they can explain why no American civilization conquered Europe, and why the reverse was more likely, without appeal to Diamond’s thesis. This renders the thesis scientifically useless, and leaves us without any clear reason to believe it, unless he could take it further.
The counter-Diamond argument seems to be the opposite of Scott Alexander’s “Archipelago” idea: constant war between similar cultures led to the development and spread of highly efficient government or state institutions, especially when it came to war. Devereaux writes, “Any individual European monarch would have been wise to pull the brake on these changes, but given the continuous existential conflict in Europe no one could afford to do so and even if they did, given European fragmentation, the revolutions – military, industrial or political – would simply slide over the border into the next state.”
I do see selves, or personal identity, as closely related to goals or values. (Specifically, I think the concept of a self would have zero content if we removed everything based on preferences or values; roughly 100% of humans who’ve ever thought about the nature of identity have said it’s more like a value statement than a physical fact.) However, I don’t think we can identify the two. Evolution is technically an optimization process, and yet has no discernible self. We have no reason to think it’s actually impossible for a ‘smarter’ optimization process to lack identity, and yet form instrumental goals such as preventing other AIs from hacking it in ways which would interfere with its ultimate goals. (The latter are sometimes called “terminal values.”)
So, what does LotR teach us about AI alignment? I thought I knew what you meant until near the end, but I actually can’t extract any clear meaning from your last points. Have you considered stating your thesis in plain English?
You left out, ‘People naively thinking they can put this discussion to bed by legally requiring disclosure,’ though politicians would likely know they can’t stop conspiracy theorists just by proving there’s no conspiracy.
Just as humans find it useful to kill a great many bacteria, an AGI would want to stop humans from e.g. creating a new, hostile AGI. In fact, it’s hard to imagine an alternative which doesn’t require a lot of work, because we know that in any large enough group of humans, one of us will take the worst possible action. As we are now, even if we tried to make a deal to protect the AI’s interests, we’d likely be unable to stop someone from breaking it.
I like to use the silly example of an AI transcending this plane of existence, as long as everyone understands this idea appears physically impossible. If somehow it happened anyway, that would mean there existed a way for humans to affect the AI’s new plane of existence, since we built the AI, and it was able to get there. This seems to logically require a possibility of humans ruining the AI’s paradise. Why would it take that chance? If killing us all is easier than either making us wiser or watching us like a hawk, why not remove the threat?
I’m not sure I understand your point about massive resource use. If you mean that the SI would quickly gain control of so many stellar resources that a new AGI would be unable to catch up, it seems to me that:
1. people would notice the Sun dimming (or much earlier signs), panic, and take drastic action, such as creating a poorly designed AGI, before the first one could be assured of its own safety, unless it stopped us;
2. keeping humans alive while harnessing the full power of the Sun seems like a level of inconvenience no SI would choose to take on, if its goals weren’t closely aligned with our own.
That’s exactly what I mean. You aren’t special. It’s a mistake to act like nobody else is using the same method to make decisions.