Former community director EA Netherlands. Now disabled by long covid, ME/CFS. Worried about AGI & US democracy
Siebe
I haven’t looked into this literature, but it sounds remarkably similar to the literature on cognitive behavioral therapy and graded exercise therapy for ME/CFS (also sometimes referred to as ‘chronic fatigue syndrome’). I can imagine this being different for pain, which could be under more direct neurological control.
Pretty much universally, this research was of low to very low quality. For example, studies used overly broad inclusion criteria such that many patients did not have the core symptom of ME/CFS, and reported only subjective scores (which tend to improve) while not reporting objective scores. These treatments are also pretty much impossible to blind. Non-blinding + subjective self-report is a pretty bad combination. This, plus the general amount of bad research practices in science, gives me a skeptical prior.
Regarding the value of anecdotes—over the past couple of years as an ME/CFS patient (presumably from covid), I’ve seen remission anecdotes for everything under the sun. They’re generally met with enthusiasm and a wave of people trying it, with ~no one being able to replicate it. I suspect that “I cured my condition X psychologically” is often a more prevalent story because 1) it’s tried so often, and 2) it’s an especially viral meme, not because it has a higher success rate than a random supplement. The reality is that spontaneous remission is not extremely unlikely for any given condition, and it’s actually very hard to trace effects to causes (which is why, even for effective drugs, we need large-scale, highly rigorous trials).
Lastly, ignoring symptoms can be pretty dangerous, so I recommend caution with this approach, and approaching it like you would any other experimental treatment.
I’m starting a discussion group on Signal to explore and understand the democratic backsliding of the US at ‘gears-level’. We will avoid simply discussing the latest outrageous thing in the news, unless that news is relevant to democratic backsliding.
Example questions:
- “how far will SCOTUS support Trump’s executive overreach?”
- “what happens if Trump commands the military to support electoral fraud?”
- “how does this interact with potentially short AGI timelines?”
- “what would an authoritarian successor to Trump look like?”
- “are there any neglected, tractable, and important interventions?”
You can join the group here: https://signal.group/#CjQKIE2jBWwjbFip5-kBnyZHqvDnxaJ2VaUYwbIpiE-Eym2hEhAy21lPlkhZ246_AH1V4-iA (If the link doesn’t work anymore in the future, DM me.)
One way to operationalize “160 years of human time” is “thing that can be achieved by a 160-person organisation in 1 year”, which seems like it would make sense?
This makes me wonder whether “evil personas” can be entirely eliminated from distilled models by including positive/aligned intent labels/traces throughout the whole distillation dataset.
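To make that concrete, something like the following is what I have in mind (a rough sketch only; the file paths, field names, and preamble text are placeholders I made up, not anyone’s actual pipeline):

```python
import json

# Rough sketch: prepend an explicit "aligned intent" statement to every
# teacher trace in a distillation dataset, so the intent framing appears
# throughout the whole dataset. File paths and field names are placeholders.
ALIGNED_INTENT_PREAMBLE = (
    "My intent in this response is to be helpful and honest, "
    "and to avoid deception or harm."
)

def add_intent_trace(example: dict) -> dict:
    """Prefix the teacher's response with the aligned-intent statement."""
    example["teacher_response"] = (
        ALIGNED_INTENT_PREAMBLE + "\n\n" + example["teacher_response"]
    )
    return example

with open("distillation_data.jsonl") as src, \
     open("distillation_data_aligned.jsonl", "w") as dst:
    for line in src:
        dst.write(json.dumps(add_intent_trace(json.loads(line))) + "\n")
```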
Matthew Yglesias—Misinformation Mostly Confuses Your Own Side
Seems to me the name AI safety is currently still widely used, no? It covers much more than just alignment strategies, since it also includes stuff like control and governance.
“The AI Doomers are only one of several factions that oppose AI and seek to cripple it via weaponized regulation.”
Bad faith.
“There are also factions concerned about ‘misinformation’ and ‘algorithmic bias,’ which in practice means they think chatbots must be censored to prevent them from saying anything politically inconvenient.”
Bad faith.
“AI Doomer coalition abandoned the name ‘AI safety’ and rebranded itself to ‘AI alignment.’”
Seems wrong.
What about whistle-blowing and anonymous leaking? Seems like it would go well together with concrete evidence of risk.
This is very interesting, and I had a recent thought that’s very similar:
This might be a stupid question, but has anyone considered just flooding LLM training data with large amounts of (first-person?) short stories of desirable ASI behavior?
The way I imagine this to work is basically that an AI agent would develop really strong intuitions that “that’s just what ASIs do”. It might prevent it from properly modelling other agents that aren’t trained on this, but it’s not obvious to me that that will happen, or that it would be bad enough to decisively outweigh the positives.
I imagine that the ratio of descriptions of desirable vs. descriptions of undesirable behavior would matter, and perhaps an ideal approach would both (massively) increase the amount of descriptions of desirable behavior as well as filter out the descriptions of unwanted behavior?
Looks like Evan Hubinger has done some very similar research just recently: https://www.lesswrong.com/posts/qXYLvjGL9QvD3aFSW/training-on-documents-about-reward-hacking-induces-reward
I think it might make sense to do it as a research project first? Though you would need to be able to train a model from scratch.
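To sketch the kind of data curation I have in mind (a toy illustration only: the keyword check stands in for a real classifier such as an LLM judge, and the upsampling factor is an arbitrary guess):

```python
import json

UPSAMPLE_FACTOR = 5  # how strongly to skew the desirable:undesirable ratio; arbitrary

def classify_asi_behavior(text: str) -> str:
    """Crude keyword stand-in for a real classifier (e.g. an LLM judge)."""
    lowered = text.lower()
    if "asi" not in lowered and "superintelligence" not in lowered:
        return "neutral"
    bad_markers = ("deceives its operators", "seizes power", "resists shutdown")
    return "undesirable" if any(m in lowered for m in bad_markers) else "desirable"

def curate(in_path: str, out_path: str) -> None:
    """Drop documents describing undesirable ASI behavior; upsample desirable ones."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            doc = json.loads(line)
            label = classify_asi_behavior(doc["text"])
            if label == "undesirable":
                continue  # filter out descriptions of unwanted behavior
            copies = UPSAMPLE_FACTOR if label == "desirable" else 1
            for _ in range(copies):
                dst.write(json.dumps(doc) + "\n")

curate("pretraining_corpus.jsonl", "curated_corpus.jsonl")
```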
This might be a stupid question, but has anyone considered just flooding LLM training data with large amounts of (first-person?) short stories of desirable ASI behavior?
The way I imagine this to work is basically that an AI agent would develop really strong intuitions that “that’s just what ASIs do”. It might prevent it from properly modelling other agents that aren’t trained on this, but it’s not obvious to me that that will happen, or that it would be bad enough to decisively outweigh the positives.
Siebe’s Shortform
I think you should publicly commit to:
- full transparency about any funding from for-profit organisations, including nonprofit organizations affiliated with a for-profit
- no access to the benchmarks for any company
- no NDAs around this stuff
If any of these already apply to the computer-use benchmark in development, you should seriously try to get out of those contractual obligations.
Ideally, you commit to these in a legally binding way, which would make them non-negotiable in any negotiation and make you more credible to outsiders.
I don’t think that all media produced by AI risk concerned people needs to mention that AI risk is a big deal—that just seems annoying and preachy. I see Epoch’s impact story as informing people of where AI is likely to go and what’s likely to happen, and this works fine even if they don’t explicitly discuss AI risk
I don’t think that every podcast episode should mention AI risk, but it would be pretty weird in my eyes to never mention it. Listeners would understandably infer that “these well-informed people apparently don’t really worry much, maybe I shouldn’t worry much either”. I think rationalists easily underestimate how much other people’s beliefs depend on what the people around them & their authority figures believe.
I think they have a strong platform to discuss risks occasionally. It also simply feels part of “where AI is likely to go and what’s likely to happen”.
This is a really good comment. A few thoughts:
- Deployment had a couple of benefits: real-world use gives a lot of feedback on strengths, weaknesses, and jailbreaks. It also generates media/hype that’s good for attracting further investors (assuming OpenAI will want more investment in the future?).
- The approach you describe is not only useful for solving more difficult questions. It’s probably also better at doing more complex tasks, which in my opinion is a trickier issue to solve. According to Flo Crivello: “We’re starting to switch all our agentic steps that used to cause issues to o1 and observing our agents becoming basically flawless overnight” (https://x.com/Altimor/status/1875277220136284207). So this approach can generate data on complex sequential tasks and lead to better performance on increasingly longer tasks.
- I didn’t read the post, but just FYI: an automated AI R&D system already exists, and it’s open-source: https://github.com/ShengranHu/ADAS/
  I wrote the following comment about my safety concerns and notified Haize, Apollo, METR, and GovAI, but only Haize replied: https://github.com/ShengranHu/ADAS/issues/16#issuecomment-2354703344
This Washington Post article supports the ‘Scheming Sam’ Hypothesis: anonymous reports, mostly from his time at Y Combinator.
Meta’s actions seem unrelated?
That’s good to know.
For what it’s worth, ME/CFS (a disease/cluster of specific symptoms) is quite different from idiopathic chronic fatigue (a single symptom). Confusing the two is one of the major issues in the literature. Many people with ME/CFS, myself included, don’t even have ‘feeling tired’ as a symptom, which is why I avoid the term CFS.