I blog at https://dynomight.net where I like to strain my credibility by claiming that incense and ultrasonic humidifiers might be bad for you.
dynomight
Thanks for the response! I must protest that I think I’m being misinterpreted a bit. Compare my quote:
the point of RCTs is to avoid resorting to regression coefficients on non-randomized sample
To this:
The point of RCTs is not to avoid resorting to regression coefficients.
The “non-randomized sample” part of that quote is important! If semaglutide had no impact on the decision to participate, then we can argue about the theory of regressions. Yes, the fraction that participated happened to be close, but with small numbers that could easily happen by chance. The hypothesis of this research is that semaglutide would reduce the urge to drink! If the decision to participate was random, and I believed the conclusion of the experiment, then that conclusion would seem to imply that the decision to participate wasn’t random after all. It just seems incredibly strange to assume that semaglutide had no impact on the probability of agreeing to the experiment, and very unlikely that the other variables in the regression fix this, which is why I’m dubious that the regression coefficients reflect any causal relationship.
That said, I think the participation bias could go in either direction. I said (and maintain) that the lab experiment does provide some evidence in favor of semaglutide’s effectiveness. I just think that given the non-random selection, small sample, and general weirdness of having people drink in a room in a hospital as a measurement, it’s quite weak evidence. Given the dismal results from the drinking records (which have less of all of these issues) I think that makes the overall takeaway from this paper pretty negative.
The first RCT for GLP-1 drugs and alcoholism isn’t what we hoped
Counterintuitive effects of minimum prices
It ranges from 0% to 100%.
Small nitpick that doesn’t have any significant consequences—this isn’t technically true; it could be higher than 100%.
Wow, I didn’t realize bluesky already supports user-created feeds, which can seemingly use any algorithm? So if you don’t like “no algorithm” or “discover” you can create a new ranking method and also share it with other people?
Anyone want to create a lesswrong starter pack? Are there enough people on bluesky for that to be viable?
Well done, yes, I did exactly what you suggested! I figured that an average human lifespan was “around 80 years” and then multiplied and divided by 1.125 to get 80×1.125=90 and 80/1.125≈71.1.
(And of course, you’re also right that this isn’t quite right since (1.125 − 1⁄1.125) / (1/1.125) = (1.125)²-1 = .2656 ≠ .25. This approximation works better for smaller percentages...)
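For concreteness, here’s a quick numeric check of that asymmetry (just a sketch; the 80-year figure and the ±12.5% factor are the ones from the comment above):

```python
# Multiplying and dividing a base value by the same factor is not symmetric.
base = 80.0
factor = 1.125

up = base * factor        # 90.0
down = base / factor      # ~71.11

# The spread between the two endpoints, relative to the lower one, is
# factor**2 - 1, not 2 * (factor - 1):
spread = (up - down) / down          # 1.125**2 - 1 = 0.265625, not 0.25

# For smaller percentages the discrepancy shrinks rapidly, e.g. ±1%:
small_spread = 1.01**2 - 1           # 0.0201, very close to 0.02
```

This is why the approximation works better for small percentages: the error term is quadratic in (factor − 1).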
Interesting. Looks like they are starting with a deep tunnel (530 m) and may eventually move to the deepest tunnel in Europe (1444 m). I wish I could find numbers on how much weight will be moved or the total energy storage of the system. (They quote 2 MW, but that’s power, not energy—how many MWh?)
According to this article, a Swiss company is building giant gravity storage buildings in China and out of 9 total buildings, there should be a total storage of 3700 MWh, which seems quite good! Would love to know more about the technology.
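To illustrate the power-versus-energy distinction (a sketch; the 2 MW and 3700 MWh figures are the ones quoted in these comments, and the discharge duration is a made-up assumption):

```python
# Power (MW) is a rate; energy (MWh) is that rate sustained over time.
power_mw = 2.0                    # rated power quoted for the tunnel system
hours = 4.0                       # hypothetical discharge duration
energy_mwh = power_mw * hours     # 8.0 MWh *if* it ran 4 hours at full power

# The Chinese project quotes energy directly: 3700 MWh across 9 buildings.
per_building_mwh = 3700.0 / 9.0   # ~411 MWh per building
```

So a 2 MW rating alone tells you nothing about storage capacity—the same machine could hold 2 MWh or 200 MWh depending on how long it can sustain that output.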
You’re 100% right. (I actually already fixed this due to someone emailing me, but not sure about the exact timing.) Definitely agree that there’s something amusing about the fact that I screwed up my manual manipulation of units while in the process of trying to give an example of how easy it is to screw up manual manipulations of units...
You mentioned a density of steel of 7.85 g/cm^3 but used a value of 2.7 g/cm^3 in the calculations.
Yes! You’re right! I’ve corrected this, though I still need to update the drawing of the house. Thank you!
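As a concrete check of why the density mixup matters (a sketch; the 1-liter volume is hypothetical, and the densities are the two values discussed above—2.7 g/cm³ is in fact aluminum’s density, not steel’s):

```python
# Mass = density * volume. Compare steel's actual density with the
# mistaken value that slipped into the calculation.
volume_cm3 = 1000.0               # hypothetical: 1 liter of material
steel_g = 7.85 * volume_cm3       # 7850 g using steel's density
wrong_g = 2.7 * volume_cm3        # 2700 g using the mistaken (aluminum) value
ratio = steel_g / wrong_g         # ~2.9x, so any downstream mass estimate
                                  # would be off by nearly a factor of three
```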
Arithmetic is an underrated world-modeling technology
Word is (at least according to the guy who automated me) that if you want an LLM to really imitate style, you really really want to use a base model and not an instruction-tuned model like ChatGPT. All of ChatGPT’s “edge” has been worn away into bland non-offensiveness by the RLHF. Base models reflect the frightening mess of humanity rather than the instructions a corporation gave to human raters. When he tried to imitate me using instruction-tuned models it was very cringe no matter what he tried. When he switched to a base model it instantly got my voice almost exactly with no tricks needed.
I think many people kinda misunderstand the capabilities of LLMs because they only interact with instruction-tuned models.
Why somewhat? It’s plausible to me that even just the lack of DHA would give the overall RCT results.
Yeah, that seems plausible to me, too. I don’t think I want to claim that the benefits are “definitely slightly lower”, but rather that they’re likely at least a little lower but I’m uncertain how much. My best guess is that the bioactive stuff like IgA does at least something, so modern formula still isn’t at 100%, but it’s hard to be confident.
My impression was that the backlash you’re describing is causally downstream of efforts by public health people to promote breastfeeding (and pro-breastfeeding messages in hospitals, etc.) Certainly the correlation is there (https://www.researchgate.net/publication/14117103_The_Resurgence_of_Breastfeeding_in_the_United_States) but I guess it’s pretty hard to prove a strict cause.
I’m fascinated that caffeine is so well-established (the most popular drug?) and yet these kinds of self-experiments still seem to add value over the scientific literature.
Anyway, I have a suspicion that tolerance builds at different rates for different effects. For example, if you haven’t had any caffeine in a long time (like months), it seems to create a strong sense of euphoria. But this seems to fade very quickly. Similarly, with prescription stimulants, people claim that tolerance to physical effects happens gradually, but full tolerance never develops for the effect on executive function. (Though I don’t think there are any long-term experiments to prove this.)
These different tolerances are a bit hard to understand mechanistically: Doesn’t caffeine only affect adenosine receptors? Maybe the body also adapts at different places further down the causal chain.
Nursing doubts
(Many months later) Thanks for this comment, I believe you are right! Strangely, there do seem to be many resources that list them as being hydrogen bonds (e.g. Encyclopædia Britannica: https://www.britannica.com/science/unsaturated-fat), which makes me question their editorial process. In any case, I’ll probably just rephrase to avoid using either term. Thanks again, wish I had seen this earlier!
Thanks, any feedback on where the argument fails? (If anywhere in particular.)
Datasets that change the odds you exist
I would dissuade no one from writing drunk, and I’m confident that you too can say that people are penguins! But I’m sorry to report that personally I don’t do it by drinking but rather writing a much longer version with all those kinds of clarifications included and then obsessively editing it down.
What premises would I have to accept for the comparison to be fair? Suppose I think that available compute will continue to grow along previous trends and that we’ll continue to find new tricks to turn extra compute into extra capabilities. Does conditioning on that make it fair? (Not sure I accept those premises, but never mind that.)