I write software at survivalandflourishing.com. Previously MATS, Google, Khan Academy.
Joel Burget
Quick Thoughts on Scaling Monosemanticity
[Question] How is GPT-4o Related to GPT-4?
Measuring the composition of fryer oil at different times certainly seems like a good way to test both the original hypothesis and the effect of altitude.
You’re right, my original wording was too strong. I edited it to say that it agrees with so many diets instead of explains why they work.
One thing I like about the PUFA breakdown theory is that it agrees with aspects of so many different diets.
- Keto avoids fried food because the food being fried is usually carbs
- Carnivore avoids vegetable oils because they're not meat
- Paleo avoids vegetable oils because they weren't available in the ancestral environment
- Vegans tend to emphasize raw food, and fried foods often have meat or cheese in them
- Low-fat diets avoid fat of all kinds
Ray Peat was perhaps the closest to the mark in emphasizing that saturated fats are more stable (he probably talked about PUFA breakdown specifically, I’m not sure).
Edit: I originally wrote “neatly explains why so many different diets are reported to work”
If this was true, how could we tell? In other words, is this a testable hypothesis?
What reason do we have to believe this might be true? Because we’re in a world where it looks like we’re going to develop superintelligence, so it would be a useful world to simulate?
[Question] How to Model the Future of Open-Source LLMs?
From the latest Conversations with Tyler interview of Peter Thiel
I feel like Thiel misrepresents Bostrom here. Bostrom doesn't really want a centralized world government, nor does he think that's "a set of things that make sense and that are good". He's forced into world surveillance not because it's good but because it's the only alternative he sees to dangerous ASI being deployed.
I wouldn’t say he’s optimistic about human nature. In fact it’s almost the very opposite. He thinks that we’re doomed by our nature to create that which will destroy us.
Paul Christiano named as US AI Safety Institute Head of AI Safety
Three questions:
What format do you upload SAEs in?
What data do you run the SAEs over to generate the activations / samples?
How long of a delay is there between uploading an SAE and it being available to view?
This is fantastic. Thank you.
Thanks! I added a note about LeCun’s 100,000 claim and just dropped the Chollet reference since it was misleading.
Thanks for the correction! I’ve updated the post.
Highlights from Lex Fridman’s interview of Yann LeCun
I assume the 44,000 ppm CO2 in exhaled air is the product of respiration (i.e., the lungs have processed it), whereas the air used in mouth-to-mouth is quickly inhaled and exhaled.
What’s your best guess for what percentage of cells (in the brain) receive edits?
Are edits somehow targeted at brain cells in particular or do they run throughout the body?
I don’t have a well-reasoned opinion here but I’m interested in hearing from those who disagree.
How would you distinguish between weak and strong methods?
Re Na:K: potassium chloride is used as a salt substitute (which tastes surprisingly like regular salt). This makes it really easy to tweak the Na:K ratio (if it turns out to be important). OTOH, the fact that no one seems to have noticed people losing weight after switching to it is some evidence that the ratio isn't important.
To me the strongest evidence that fine-tuning is based on LoRA or similar is that pricing covers only training and input/output tokens, with no charge for storing your fine-tuned models. Llama-3-8b-instruct is ~16GB (I think this ought to be roughly comparable, at least in the same ballpark). You'd almost surely care if you were storing that much data for each fine-tune.
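The back-of-envelope arithmetic behind this can be sketched as follows. The full-model figure follows from ~8B parameters at 2 bytes each (fp16); the LoRA configuration (rank, targeted matrices, layer dimensions) is a hypothetical illustration, not the provider's actual setup.

```python
# Storage for a full fine-tune vs. a LoRA adapter, rough sketch.
# Assumes fp16 (2 bytes/param). The LoRA config below (rank 16 on the
# four attention projections of 32 roughly-4096x4096 layers) is a
# made-up but plausible example, not a known production configuration.

bytes_per_param = 2                     # fp16
full_params = 8e9                       # ~8B parameters
full_gb = full_params * bytes_per_param / 1e9
print(f"full fine-tune: ~{full_gb:.0f} GB per model")

rank, dim, layers, matrices = 16, 4096, 32, 4
# Each adapted matrix adds two low-rank factors: (dim x rank) + (rank x dim).
lora_params = layers * matrices * 2 * dim * rank
lora_mb = lora_params * bytes_per_param / 1e6
print(f"LoRA adapter: ~{lora_mb:.0f} MB per fine-tune")
```

On these assumptions a LoRA adapter is roughly three orders of magnitude smaller than the full weights, which would make per-fine-tune storage cheap enough to absorb into training costs.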