As someone who has Graves’ Disease … one of the reasons that you really don’t want to run your metabolism faster with higher T4 levels is that a higher heart rate sustained for an extended period can cause your heart to fail.
Michael Roe
I will redact the name of the person here, but it’s a moderately well-known UK politician.
The question sometimes comes up as to whether X is an anti-Semite. To which, people who have had direct dealings with X typically respond with something to the effect that they don’t think X has it in for Jews specifically, but they do think X is a complete asshole … and then launch into telling some story of a thing X did that annoyed them. This is, to my mind, not exactly an endorsement of X’s character.
The AI risk community seems to be more frequently adjacent to “crazy Buddhist yoga sex cult” than I would have expected.
I think I usually understand why when I get bad vibes from someone.
Yoga sex cults have a bad track record for turning out to be abusive. So, if I know the guy is in some kind of yoga sex cult, I am going to suspect that there will eventually be some sort of sex scandal, even if I don’t have evidence for the exact specifics.
Given some past examples I’ve seen, I now have a “tip of the iceberg” theory for bad behaviour. Like, if I know the guy has done some bad stuff, it is statistically likely that he’s also involved in some other bad stuff that I wasn’t in a position to observe.
That’s interesting, if true. Maybe the tokeniser was trained on a dataset that had been filtered for dirty words.
I suppose we might worry that LLMs might learn to do RLHF evasion this way: a human evaluator sees a Chinese character they don’t understand, assumes it’s OK, and then the LLM learns you can look acceptable to humans by writing it in Chinese.
Some old books (which are almost certainly in the training set) used Latin for the dirty bits. Translations of Sanskrit poetry, and various works by that reprobate Richard Burton, do this.
As someone who, in a previous job, got to go to a lot of meetings where the European Commission was seeking input about standardising or regulating something: humans also often do the thing where they just use the English word in the middle of a sentence in another language, when they can’t think what the word is. Often with an associated facial expression or body language to indicate to the person they’re speaking to, “sorry, couldn’t think of the right word”. It’s also done by people speaking English whose first language isn’t English, dropping into their own language for a word or two. If you’ve been the editor of e.g. an ISO standard, fixing these up in the proposed text is such fun.
So, it doesn’t surprise me at all that LLMs do this.
I have, weirdly, seen LLMs put a single Chinese word in the middle of English text … and consulting a dictionary reveals that it was, in fact, the right word, just in Chinese.
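Coming back to the RLHF-evasion worry above: a minimal sketch (my own invention, not anything from an actual RLHF pipeline) of the obvious evaluator-side mitigation would be to flag characters the rater may not be able to read before they score the output. Something like:

```python
# Hypothetical sketch: flag CJK characters in otherwise-English model
# output, so a human rater at least knows there is text they can't read.
# The function name and sample text are made up for illustration.
import unicodedata

def flag_cjk(text: str) -> list[str]:
    """Return the CJK characters found in `text`, if any."""
    # Unicode names for Han ideographs all contain "CJK",
    # e.g. "CJK UNIFIED IDEOGRAPH-6709" for the character below.
    return [ch for ch in text if "CJK" in unicodedata.name(ch, "")]

sample = "The defendant was clearly 有罪 in this case."
print(flag_cjk(sample))  # ['有', '罪'], i.e. "guilty", in Chinese
```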
I will take “actually, it’s even more complicated” as a reasonable response. Yes, it probably is.
Candidate explanations for some specific person being trans could as easily be that they are sexually averse, rather than that they are turned on by presenting as their preferred gender. Compare anorexia nervosa, which might have some parallel with some cases of gender identity disorder. If the patient is worrying about being gender non-conforming in the same way that an anorexic worries that they’re fat, then Blanchard is just completely wrong about what the condition even is in that case.
This might be a good (if controversial) example of “the reality is more complicated than typical simplifications, and it matters what your oversimplification is leaving out”.
And Blanchard’s account of autogynephilia is more nuanced than most people’s second-hand version of it. Like, Blanchard doesn’t think trans men have AGP, and doesn’t think trans women who are attracted to men have AGP.
So, we might say…
Oversimplification 1: Even Blanchard didn’t try to apply his theory to trans men or trans women attracted to men.
Oversimplification 2: Bisexuals exist. Many trans women report their sexual orientation changing when they start taking hormones. The correlation between having AGP and being attracted to women can’t be as close to 100% as Blanchard appears to believe it is.
Oversimplification 3: It looks like Blanchard only identified two subtypes of trans person, and completely missed some of the other subtypes.
Oversimplification 4: Do heterosexual cisgender women have AGP? (Cf. comments by Aella, eigenrobot, etc.) If straight cisgender women also like being attractive in the same way as (some) trans women do, it becomes somewhat doubtful that it’s a pathology.
To add to the differences between people:
I can choose to see mental images either actually overlaid on my field of vision, or somehow in a separate space.
The obvious question someone might ask: can you trace an overlaid mental image? The problem is registration: if my eyes move, the overlaid mental image can shift relative to an actual, perceived sheet of paper. It’s easier to do a side-by-side copy than to trace.
I think there might be other aspects to trauma, though. Some possible candidates:
- memories feel as if they are “tagged” with an emotion, in a way that memories normally aren’t
- depletion of some kind of mental resource; not sure what to call it, so I won’t be too specific about exactly what is depleted
One of the ideas in Cognitive Behavioral Therapy is that you might be treating as dangerous something that actually isn’t dangerous (and you don’t learn that it’s safe because you’re avoiding it).
So the account you’re giving here seems to be fairly standard.
On the other hand: some things actually are dangerous.
In any case, as a researcher currently working in this area, I am putting a big bet on moderate badness happening (in that I could be working on something else, and my time has value).
Also, there is counterparty risk if you bet on everyone dying.
(Yeah, yeah, you can bet on something like other people’s belief in the impending apocalypse going up before it actually happens.) “Rapid takeoff” hypotheses are particularly hard to bet on.
If I was going to play this game with an AI, I’d also feed it my genomic data, which would reveal I have a version of the HLA genes that makes me more likely to develop autoimmune diseases.
Probably, if some AI were to recommend additional blood testing, I could manage to persuade the actual medical professionals to do it. A recent conversation went something like this:
Me: “Can I have my thyroid levels checked, please? And the consultant endocrinologist said he’d like to see a liver function test done next time I give a blood sample.”
Nurse (taking my blood sample and pulling my medical record up on the computer): “You take carbimazole, right?”
Me: “Yes.”
Nurse (ticking boxes on a form on the computer): “… and full blood panel, and electrolytes…”
Probably wouldn’t be hard to get suggestions from an AI added to the list.
Things I might spend more money on, if there were better AIs to spend it on:
1. I am currently having a lot of blood tests done, with a genuine qualified medical doctor interpreting the results. Just for fun, I can see if AI gives a similar interpretation of the test results (it’s not bad). Suppose we had AI that was actually better than human doctors, and cheaper (sounds like that might be here real soon, to be honest). I would probably pay money for that.
2. Some work things I am doing involve formally proving correctness of software (see the sketch after this list). AI is not there, quite yet. If it were, I could probably get DARPA to pay the license fee for it, assuming the cost isn’t absolutely astronomical.
Etc.
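For concreteness, here is a toy example, in Lean 4, of the kind of machine-checked correctness proof I mean. The function and its spec are made up for illustration, not from my actual work:

```lean
-- A deliberately tiny "software correctness" proof: define a function,
-- state its specification, and have the machine check the proof.
-- `myMax` and both theorems are hypothetical illustrations.

def myMax (a b : Nat) : Nat :=
  if a ≤ b then b else a

-- Spec: the result is at least as large as either argument.
theorem myMax_ge_left (a b : Nat) : a ≤ myMax a b := by
  unfold myMax
  split <;> omega

theorem myMax_ge_right (a b : Nat) : b ≤ myMax a b := by
  unfold myMax
  split <;> omega
```

Scaling that from a two-line function up to real software is the part that is still expensive, and where AI help would be worth paying for.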
On the other hand, this would imply that most doctors, and mathematicians, are out of work.
More generally: changing the set point of any of these systems might cause the failure of some critical component that depends on the old value of the set point.