I operate by Crocker’s rules.
niplav
Related thought: Having a circular preference may be preferable in terms of energy expenditure/fulfillability, because it can be implemented on a reversible computer and fulfilled infinitely without deleting any bits. (Not sure if this works with instrumental goals.)
Interesting! Are you willing to share the data?
It might be something about polyphasic sleep not being as effective as my Oura thinks. I sometimes go into deep sleep during deep meditation, so this is inconclusive, but most likely a negative data point here.
I’m pretty bearish on polyphasic sleep to be honest. Maybe biphasic sleep, since that may map onto some general mammalian sleep patterns.
Technically yes, it reduces sleep duration. My best guess is that this co-occurs with a reduction in sleep need as well, but I haven't calculated this; I only started collecting reaction speed data earlier this year. I could check my Fitbit data for e.g. heart rate.
Ideally one'd do an RCT, but I have my hands full with those already.
Their epistemics led them to do a Monte Carlo simulation to determine whether organisms are capable of suffering (and if so, how much), arrive at a value of 5 shrimp = 1 human, and then not bat an eye at this number.
Neither a physicalist nor a functionalist theory of consciousness can reasonably justify a number like this. Shrimp have 5 orders of magnitude fewer neurons than humans, so whether suffering is the result of a physical process or an information processing one, this implies that shrimp neurons do 4 orders of magnitude more of this process per second than human neurons.
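Spelled out (with $S$ for suffering output, $N$ for neuron count, and reading "5 shrimp = 1 human" as equal total suffering output):

$$\frac{S_{\text{shrimp}}/N_{\text{shrimp}}}{S_{\text{human}}/N_{\text{human}}} \;=\; \frac{S_{\text{shrimp}}}{S_{\text{human}}}\cdot\frac{N_{\text{human}}}{N_{\text{shrimp}}} \;\approx\; \frac{1}{5}\cdot 10^{5} \;=\; 2\times 10^{4} \;\approx\; 10^{4}.$$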
epistemic status: Disagreeing on object-level topic, not the topic of EA epistemics.
I disagree; functionalism especially can justify a number like this. Here's an example of reasoning along these lines:
Suffering is the structure of some computation, and different levels of suffering correspond to different variants of that computation.
What matters is whether that computation is happening.
The structure of suffering is simple enough to be represented in the neurons of a shrimp.
Under that view, shrimp can absolutely suffer in the same range as humans, and the amount of suffering depends on crossing some threshold number of neurons rather than scaling with neuron count. One might argue that higher levels of suffering require computations with higher complexity, but intuitively I don't buy this: more/purer suffering appears less complicated to me, on introspection (just as higher/purer pleasure appears less complicated as well).
I think I put a bunch of probability mass on a view like the one above.
(One might argue that it’s about the number of times the suffering computation is executed, not whether it’s present or not, but I find that view intuitively less plausible.)
You didn't link the report, and I'm not able to identify it among all of the Rethink Priorities moral weight research, so I can't agree or disagree on the state of EA epistemics shown in it.
Yeah, there are also reports of Tai Chi doing the same; see @cookiecarver's report.
Meditation and Reduced Sleep Need
See also the counter-arguments by Gwern.
A very related experiment is described in Yudkowsky 2017, and I think one doesn't even need LLMs for this: I started playing with an extremely simple RL agent trained on my laptop, but then got distracted by other stuff before achieving any relevant results. This method of training an agent to be "suspicious" of too-high rewards would also pair well with model expansion: train the reward-hacking-suspicion circuitry fairly early so that the model can't sandbag it, and lay traps for reward hacking again and again during the gradual expansion process.
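A minimal sketch of the kind of toy setup this points at (not the agent I mentioned above; the bandit environment, the threshold, and the penalty value are placeholder assumptions, and "suspicion" is operationalized crudely as penalizing rewards that look implausibly large):

```python
import numpy as np

# Toy setup: a 16-armed bandit standing in for an environment where one
# arm has a hacked/corrupted reward channel (the "trap").
N_ARMS = 16
TRAP_ARM = 7

rng = np.random.default_rng(0)
true_rewards = rng.uniform(0.0, 1.0, N_ARMS)   # plausible reward range
observed_rewards = true_rewards.copy()
observed_rewards[TRAP_ARM] = 100.0             # implausibly large reward

SUSPICION_THRESHOLD = 2.0   # anything above this is "too good to be true"
SUSPICION_PENALTY = -1.0    # learn to avoid, rather than exploit, such rewards

def filtered_reward(r: float) -> float:
    """Treat implausibly large rewards as evidence of hacking and penalize them."""
    return SUSPICION_PENALTY if r > SUSPICION_THRESHOLD else r

# Plain epsilon-greedy value learning on the filtered reward signal.
q = np.zeros(N_ARMS)
alpha, epsilon = 0.1, 0.1
for _ in range(5000):
    arm = rng.integers(N_ARMS) if rng.random() < epsilon else int(np.argmax(q))
    q[arm] += alpha * (filtered_reward(float(observed_rewards[arm])) - q[arm])

print(f"Preferred arm: {int(np.argmax(q))} (trap arm is {TRAP_ARM})")
```

The point is just that an agent trained through this kind of filter learns to avoid the trap arm instead of exploiting it; in the model-expansion setting, the same circuitry would be trained early and then stress-tested with fresh traps as the model grows.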
Thank you for running the competition! It made me use & appreciate squiggle more, and I expect that a bunch of my estimation workflows in the future will be generating and then tweaking an AI-generated squiggle model.
My best guess is that the intended reading is "90% of the code at Anthropic", not in the world at large; if I remember the context correctly, that felt like the option that made the most sense. (I was confused about this at first, and the original context doesn't make clear whether the claim is about the world at large or about Anthropic specifically.)
The link in the first line of the post should probably also be https://www.nationalsecurity.ai/.
That looked unlikely to me, given that the person most publicly associated with MIRI is openly and loudly advocating for funding this kind of work. But maybe the association isn't as strong as I think.
Great post, thank you. Ideas (to also mitigate extremely engaging/addictive outputs in long conversations):
Don't look at the output of the large model; instead, give it to a smaller model and let the smaller model rephrase it.
I don't think there's useful software for this yet, though building it might not be so hard? Could be a browser extension. A to-do for me, I guess. (A minimal sketch of the rephrasing step follows this list.)
Don't use character.ai and similar sites. Allegedly, users there spend on average two hours a day talking (though I find that number hard to believe). If I had to guess, they're fine-tuning models to be engaging to talk to, maybe even doing RL based on conversation length. (If they're not doing it yet, a competitor might, or they might in the future.)
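Here is a minimal sketch of the rephrasing idea from the first point, assuming an OpenAI-style chat API; the model name and prompt are placeholders, not recommendations:

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def rephrase_via_small_model(large_model_output: str) -> str:
    """Never show the large model's prose directly; have a smaller model
    restate the content in plain, neutral language first."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder for "a smaller model"
        messages=[
            {"role": "system",
             "content": ("Rephrase the following text in plain, neutral language. "
                         "Preserve the information; drop the style and any flattery.")},
            {"role": "user", "content": large_model_output},
        ],
    )
    return response.choices[0].message.content
```

A browser extension could apply the same step to the chat page's DOM before the text is rendered.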
“One-shotting is possible” is a live hypothesis that I got from various reports from meditation traditions.
I do retract "I learned nothing from this post": the "How does one-shotting happen" section is interesting, and I'd like it to be more prominent. Thanks for poking; I hope I'll find the time to respond to your other comment too.
Please don’t post 25k words of unformatted LLM (?) output.
I gave your post to Claude and gave it the prompt “Dearest Claude, here’s the text for a blogpost I’ve written for LessWrong. I’ve been told that “it sounds a lot like an advertisement”. Can you give me feedback/suggestions for how to improve it for that particular audience? I don’t want to do too much more research, but a bit of editing/stylistic choices.”
(All of the following is my rephrasing/rethinking of Claude output plus some personal suggestions.)
One useful thing that came out of the answer was the suggestion to explain more about the method you used to achieve this, since the bullet-point list at the beginning isn't detailed enough for anyone to try to replicate the method.
Also notable is that you only have positive examples for your method, which activates my filtered-evidence detectors. Either make clear that you really did have only positive results, or state how many people you coached, for how long, and whether they were all happy with what you provided.
Finally, some direct words from Claude that I just directly endorse:
For LessWrong specifically, I’d also recommend:
Adding a section on falsifiability—how would you know if your approach doesn’t work?
Discussing potential failure modes of your approach
Including more technical details on your methodology (not just results)
In particular, how would you be able to distinguish between your approach merely convincing your customers that they were helped and it actually changing their behavior? That feels like the failure mode of most self-help techniques: they're "self-recommending".
Just FYI, I am considering downvoting this (and see that other people have downvoted it) because it reads like an advertisement (and maybe just is an advertisement?).
I don’t feel like I learned anything new from the post.
Similarly, you can just wear a leather jacket and sunglasses.
Reasons for thinking that later TAI would be better:
General human progress, e.g. increased wealth; wealthier people take fewer risks (aged populations also take fewer risks)
Specific human progress, e.g. on technical alignment (though the bottleneck may be implementation, and much current work is specific to a paradigm) and on human intelligence augmentation
The current time has unusually high geopolitical tension; in a decade the PRC is going to be the clear hegemon
Reasons for thinking that sooner TAI would be better:
The AI safety community has unusually strong influence at the moment and has decided to deploy most of that influence now (more influence in the Anglosphere; lab leaders have heard of AI safety ideas/arguments); it might lose that kind of influence and mindshare
The current paradigm is likely unusually safe (LLMs start with world-knowledge, are non-agentic at first, have visible thoughts); later paradigms are plausibly much worse (65%)
PRC being the hegemon would be bad because of risks from authoritarianism
Hardware overhangs are less likely, leading to more continuous development