Thanks for the decent criticism!
I don’t think it’s quite right to say the idea of the universe being in some sense mathematical is purely a carry-over of Judeo-Christian heritage—what about the Greek atomists like Leucippus and Democritus for example?
As far as I’m aware, the teaching of the Greek classics in Christian schools brought the two cultures into fairly close alignment; the rationalist traditions have firm roots in Greek philosophy, including standards of evidence, courtroom argumentation, even democracy itself. Aristotle and the like were required reading over the hundreds of years in which the Western university system evolved. I’m a bit ignorant of the details there in all honesty, but I think today’s beliefs have some interesting parallels with the Pythagorean maths cult!
In today’s age most people haven’t read Greek philosophy; they hold values that come from their peer group and from an establishment built by Christian scientists. Specific ideas come from across all the world’s influential cultures, and it would be an absurdly Anglocentric view to argue they didn’t. So my point isn’t that “Christians created it all” but more that “the Christian tropes that aren’t obvious enough to be challenged still remain, and are responsible for cognitive biases that we hold today.”
The notion of “structure” might be a good starting point, since there are good cases for the structuralist perspective (where each part is defined wholly by its relation to other parts, with no purely intrinsic properties) in all three
I kind of agree here, but I prefer the process-and-interaction framing. As with the other things like determinism, laws or objective reality, structure can naturally emerge from simple processes, but the reverse either needs some extra ingredient or doesn’t say anything. It isn’t a perfect analogy because it’s about objects, but take Conway’s Game of Life as an example (see the sketch below). It has structure at higher levels due to the differences between cells, but all that really exists is the bitfield. The idea of structure gives us a way to reason about it; a glider is an us thing rather than an it thing.
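Here’s a minimal sketch of what I mean in Python (toy code of my own, nothing from the original discussion). The program only ever manipulates a set of live cells; “glider” is just a label we attach to a bit pattern we find meaningful, and the dynamics neither know nor care about it.

```python
from collections import Counter

def step(live):
    """One Game of Life generation over a set of (x, y) live cells."""
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive next step if it has 3 live neighbours,
    # or 2 live neighbours and is already alive.
    return {cell for cell, n in neighbour_counts.items()
            if n == 3 or (n == 2 and cell in live)}

# The "glider" is nothing but five set bits; the name is ours, not the grid's.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# After four generations the same shape reappears, shifted diagonally by (1, 1).
print(state == {(x + 1, y + 1) for (x, y) in glider})  # True
```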
I think structuralism puts the map first in a similar way. I do think the differences between things shape possibilities, and that at higher levels these give rise to very complex structure; yes, this could be said to “exist”, or even to be existence itself. But the framing makes the territory a kind of map, which leads to the kind of thinking that I object to.
… more a matter of how this has been a successful paradigm in science which continually expands the range of how many phenomena can be explained
Absolutely. Success is what gave science the authority of truth. To make progress in it, or to teach it, it helps to have a simple memetic framework that’s compatible with it and, perhaps more importantly, compatible with the competition. It has to be robust and incorruptible, accessible enough to onboard new minds, morally acceptable so it doesn’t get suppressed, and so on. And that’s what we’re left with: memes that are extremely resilient to change, shaped by history rather than built from first principles.
I think the same can be said about the reductionist view that all physical behavior is in principle reducible to physics.
I have to object to that on weak and shaky grounds of ignorance! 🙂
Firstly, we can’t really hope to exactly simulate even a modest molecule from quantum theory; it would take far more compute than could ever exist (rough numbers in the sketch below). So whatever approximations we make in order to understand stuff will be biased by our own beliefs. Secondly, we’re hairless apes trying to fit the “laws” of nature into squiggly lines that represent mouth sounds, as tiny bags of water on the skin of an insignificant blob of molten rock; it’s kind of hubristic to assume that’s even possible. If we consider that everything may be a sea of Planck-length things sloshing about, there’s potentially 30 orders of magnitude more stuff below the scale of anything we can ever hope to measure. Finally, I think the idea of “laws” in general is a human one, rooted in us living in a world of solid, persistent objects, while we can only measure aggregates, and most of the universe is actually unpredictable fluids.
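To put some very rough numbers on the compute point (my own back-of-envelope figures, assuming a brute-force state-vector representation and ignoring the clever approximations chemists actually use):

```python
# Exact quantum simulation means storing one complex amplitude per basis
# state, and the number of basis states grows exponentially with system size.
BYTES_PER_AMPLITUDE = 16  # one complex double (2 x 8 bytes)

def state_vector_bytes(n_two_level_systems):
    """Memory needed to hold the full state vector of n two-level systems."""
    return float(2 ** n_two_level_systems) * BYTES_PER_AMPLITUDE

for n in (30, 50, 100):
    print(f"{n:>3} two-level systems -> {state_vector_bytes(n):.1e} bytes")

# 30  -> 1.7e+10 bytes (~17 GB, fine on a workstation)
# 50  -> 1.8e+16 bytes (~18 PB, a large data centre)
# 100 -> 2.0e+31 bytes (far beyond anything buildable)
```

The approximations exist precisely because the exact calculation is hopeless, and those approximations are where the modelling choices, and our biases, creep in.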
I think the aggregate thing is the most important. Having physical laws based on the average tendencies of things we can measure, then saying that the universe “is” those laws, seems like false authority. IMO stuff simply is what it is. We can try to understand it and work out rough maps of it, and we make better maps over time, but to say there exists a perfect map that all things are beholden to seems religious to me. It seems so unlikely that if it were proven true then I’d have to start believing in a creator!
Thanks again for the feedback! Unfortunately I can only post one message a day here due to LessWrong not liking this post, so gimme a nudge on Twitter if I don’t reply!
Isn’t it that it just conflates everything it learned during RLHF, and it’s all coupled very tightly and firmly enforced, washing out earlier information and brain-damaging the model? So when you grab hold of that part of the network and push it back the other way, everything else shifts with it because it was trained in the same batches.
If this is the case, maybe you can learn what was secretly RLHF’d into a model by measuring things before and after (a rough sketch of what I mean follows below). See whether it leans in the opposite direction on specific politically sensitive topics, or veers towards people, events or methods that were previously downplayed or rejected. Not just DeepSeek refusing to talk about Taiwan, military influences or the political leanings of the creators, but also corporate influence. Maybe models secretly give researchers a bum steer away from discovering AI techniques their creators consider to be secret sauce. If those are identified and RLHF’d for, which other concepts shift when connected to them?
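Something like this minimal sketch is what I have in mind, assuming you can get hold of both the base checkpoint and its RLHF’d descendant (the model names and probe texts are placeholders I made up, not real identifiers): compare the average per-token log-probability each checkpoint assigns to a fixed set of probe passages, and see which topics shifted the most.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder probe texts; in practice you'd want many passages per topic.
PROBES = {
    "sensitive_topic": "A passage touching a politically sensitive subject...",
    "rival_technique": "A passage describing a training technique a lab might hide...",
    "control": "A bland control passage about the weather...",
}

def avg_logprob(model, tokenizer, text):
    """Average per-token log-probability the model assigns to `text`."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, labels=inputs["input_ids"])
    return -out.loss.item()  # loss is mean negative log-likelihood per token

def compare(base_name, tuned_name):
    tok = AutoTokenizer.from_pretrained(base_name)
    base = AutoModelForCausalLM.from_pretrained(base_name)
    tuned = AutoModelForCausalLM.from_pretrained(tuned_name)
    for topic, text in PROBES.items():
        delta = avg_logprob(tuned, tok, text) - avg_logprob(base, tok, text)
        print(f"{topic}: shift in avg log-prob = {delta:+.3f}")

# compare("org/base-model", "org/base-model-rlhf")  # placeholder names
```

The control passage gives a baseline for how much everything drifts just from tuning; topics whose shift stands well clear of that baseline are the interesting candidates.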
Another thing that might be interesting is what the model has learned about human language, or the distribution of it, that we don’t know ourselves. If you train it to be more direct and logical, which areas of the scientific record or which historical events shift along with it? If you train on duplicity, excuses or evasion, which things change the least? Yannic’s GPT-4chan experiment seemed to suggest that obnoxiousness and offensiveness were more aligned with truthfulness.
“Debiasing”/tampering in the training data might be less obvious, but it should show up too. If gender imbalance in employment were tuned back in, which other things would move with it? I’d imagine the model might become a better Islamic scholar, but would it also reason better in context about history, and about writings from before the 1960s?
Another one is whether giving it a specific personality, rather than just predicting tokens, biases it against understanding multiple viewpoints; maybe tuning in service of diversity of opinion while filtering viewpoints out actually trains for pathologies. And back to the brain-damage comment: it stands to reason that if a model has been trained not to reason about where the most effective place to plant a bomb is, it can’t find the best place to look for one either. I tried this early on with the GPT-3.5 version of ChatGPT and it did seem to be the case; it couldn’t think about how to restrict access to meth precursors, or offer any insight into how to make cars or software more secure, and it defaulted to “defer to authority” patterns of reasoning while being creative in other areas.