gwern

Karma: 79,743

https://gwern.net/

gwern May 28, 2025, 8:32 PM
7 points
0
in reply to: Steven Byrnes’s comment on: Alignment Proposal: Adversarially Robust Augmentation and Distillation
As I’ve said before, I think you greatly overrate the difficulty of putting search into neural nets, and this is an example of it. It seems to me like it is entirely possible to make a generic LLM implement an equivalent to AlphaZero and be capable of expert iteration, without an elaborate tree scaffolding. A tree search is just another algorithm which can be reified as a sequence, like all algorithms (because they are implemented on a computer).

All AlphaZero is, is a way of doing policy iteration/Newton updates by running a game state forward for a few plies, evaluating, and updating estimates. It’s not magic, and can obviously be encoded into a LLM’s generative process.

Here’s a concrete example of how in-principle I think a LLM can do AlphaZero-style expert iteration for Go: A LLM can serialize a board with value estimates as simply a few hundred tokens (361 points, 361 value estimates, miscellaneous metadata); this means in a frontier LLM like Claude-4-opus with 200k ctx, you can fit in easily 200 board states; so you can serialize out the lookahead of a bunch of possible moves and resulting board states (eg. take the top 14 moves and imagine the resulting board state and then imagine their next 14 top moves, for comparison, TD-Gammon looked forward like 1 move); and can back-propagate an updated value estimate, and spit out the original board state with better value estimates. “Move #4 was better than it looked, so I will +0.01 to the value estimate for it.” This improved board is now in context, and can be dynamically-evaluated to update the LLM: now it has to predict the new board state with the final improved estimates, and that improves the policy. The LLM finishes by setting up the next planning step: pick a deeper board state to evaluate next, and if the next board state is the end of the game, then it starts over with a fresh game. Run this indefinitely.
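As a rough check on the context-budget arithmetic above, here is a minimal sketch. The one-token-per-point encoding and the 50-token metadata allowance are assumptions for illustration, not measurements of any real tokenizer.

```python
POINTS = 19 * 19            # 361 points on a Go board
METADATA_TOKENS = 50        # assumed allowance: side to move, ko, move number, ...
CONTEXT_TOKENS = 200_000    # context window cited above

# Assumption: one token per stone and one token per value estimate.
TOKENS_PER_BOARD = POINTS + POINTS + METADATA_TOKENS


def boards_in_lookahead(branching: int, plies: int) -> int:
    """Root board plus every board reached within `plies` moves."""
    return sum(branching ** depth for depth in range(plies + 1))


boards_that_fit = CONTEXT_TOKENS // TOKENS_PER_BOARD
boards_needed = boards_in_lookahead(branching=14, plies=2)

print(TOKENS_PER_BOARD, boards_that_fit, boards_needed)
```

Under these assumptions a top-14, two-ply lookahead (1 + 14 + 196 boards) fits in a single 200k context with some room to spare.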

It repeatedly iterates through a possible game, evaluating each position to a certain depth, updating its weights to incorporate the policy improvement from the evaluation, and restarting with a fresh game. All serialized out as a long array/sequence, the tree just being implicitly represented by successive board states. (And then now that you have that in mind, you can imagine how to do things like deep rollouts: 200 moves is around a normal game of Go, so random rollouts are doable from most board states, and the LLM can just toggle between a shallow tree search and deep randomized rollouts if necessary eg by adding a 0/1 token prefix.)
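The "tree as a flat sequence" idea can be sketched in a few lines. This is a toy under stated assumptions: each board is reduced to a (depth, value) pair, the pairs are listed in depth-first order, and the backup rule is plain negamax, which is a stand-in and not the averaging backup that AlphaZero's MCTS uses. The function name and the sample values are hypothetical.

```python
def backed_up_root_value(sequence):
    """Back up leaf values through a tree stored as a flat depth-first list.

    `sequence` holds (depth, value) pairs; each value is an estimate from the
    point of view of the player to move at that node. Scanning in reverse
    visits every node after all of its children, so no explicit tree is built.
    """
    pending = {}  # depth -> best (negated) child value seen so far
    value = None
    for depth, leaf_value in reversed(sequence):
        if depth + 1 in pending:
            # Interior node: take the best reply found among its children.
            value = pending.pop(depth + 1)
        else:
            # Leaf: keep the raw estimate.
            value = leaf_value
        # Report this node to its parent, from the parent's point of view.
        pending[depth] = max(pending.get(depth, float("-inf")), -value)
    return value


# Root, two candidate moves, two replies each (hypothetical estimates).
lookahead = [
    (0, 0.0),
    (1, 0.2), (2, 0.5), (2, -0.1),
    (1, -0.3), (2, 0.4), (2, 0.6),
]
root_value = backed_up_root_value(lookahead)
```

The whole lookahead is just a list, so the same structure could be emitted token by token; the improved root estimate is what would then be trained on.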

At no point do you need explicit tree scaffolding as you bootstrap from a LLM clueless about playing Go to the high performance that we know LLMs trained by imitation learning on board states/values/policies can reach, and at no point have I invoked a cognitive operation which is not easier than a lot of things we see LLMs do routinely, or where it’s implausible that they could do it. It is probably a lot less efficient and has other practical issues like how you integrate the rules of Go akin to AlphaZero/MuZero, etc, but in principle I think this algorithm is well-defined, concrete, and would work.

gwern May 27, 2025, 2:33 AM
15 points
3
on: Is Building Good Note-Taking Software an AGI-Complete Problem?
My earlier commentary on what I think note-taking tools tend to get wrong: https://gwern.net/blog/2024/tools-for-thought-failure

gwern May 26, 2025, 5:40 PM
5 points
2
on: AI #117: OpenAI Buys Device Maker IO
Here is another way to defend yourself against bot problems:
Turned out to be fake, BTW. His friend just pranked him.

gwern May 24, 2025, 1:43 AM
9 points
0
in reply to: silentbob’s comment on: silentbob’s Shortform

for text, you might realize that different parts of the text refer to each other, so need a way to effectively pass information around, and hence you end up with something like the attention mechanism

If you are trying to convince yourself that a Transformer could work and to make it ‘obvious’ to yourself that you can model sequences usefully that way, it might be a better starting point to begin with Bengio’s simple 2003 LM and MLP-Mixer. Then Transformers may just look like a fancier MLP which happens to implement a complicated way of doing token-mixing inspired by RNNs and heavily tweaked empirically to eke out a bit more performance with various add-ons and doodads.
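To make the token-mixing framing concrete, here is a minimal sketch in plain Python. It is heavily simplified and rests on assumptions: a single fixed matrix stands in for MLP-Mixer's token-mixing MLP, and the attention weights are raw scaled dot products with no learned query/key/value projections, no causal mask, and no channel mixing. All numbers are hypothetical.

```python
import math


def mix(weights, tokens):
    """Each output token is a weighted sum of all input tokens."""
    dim = len(tokens[0])
    return [
        [sum(w * tok[j] for w, tok in zip(row, tokens)) for j in range(dim)]
        for row in weights
    ]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(tokens):
    """Mixing weights computed from the tokens themselves (scaled dot products)."""
    scale = math.sqrt(len(tokens[0]))
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in tokens])
        for q in tokens
    ]


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Mixer-style: a fixed mixing matrix (hypothetical values), the same for every input.
fixed_weights = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [1 / 3, 1 / 3, 1 / 3]]
mixer_out = mix(fixed_weights, tokens)

# Attention-style: the same mixing step, but the weights depend on the input.
attn_out = mix(attention_weights(tokens), tokens)
```

Both paths end in the same `mix` call; the only difference in this sketch is where the weights come from.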

(AFAIK, no one has written a “You Could Have Invented Transformers”, going from n-grams to Bengio’s LM to MLP-Mixer to RNN to Set Transformer to Vaswani Transformer to a contemporary Transformer, but I think it is doable and useful.)

gwern May 22, 2025, 2:18 PM
1 point
2
in reply to: Said Achmiz’s comment on: Semen and Semantics: Understanding Porn with Language Embeddings
Or just clipped out. It takes 2 seconds to clip it out and you’re done. Or you just fast forward, assuming you saw the intro at all and didn’t simply skip the first few minutes. Especially as ‘incest’ becomes universal and viewers just roll their eyes and ignore it. This is something that is not true of all fetishes: there is generally no way to take furry porn, for example, and strategically clip out a few pixels or frames and make it non-furry. You can’t easily take a video of an Asian porn star and make them white or black. And so on and so forth.

gwern May 21, 2025, 3:21 PM
9 points
2
in reply to: Warty’s comment on: Warty’s Shortform

But if a metric is trivially gameable, surely that makes it sus and less impressive, even if someone is not trivially, or even at all gaming it.

Why would you think that? Surely the reason that a metric being gameable matters is if… someone is or might be gaming it?

Plenty of metrics are gameable in theory, but are still important and valid given that you usually can tell if they are. Apply this to any of the countless measurements you take for granted. Someone comes to you and says ‘by dint of diet, hard work (and a bit of semaglutide), my bathroom scale says I’ve lost 50 pounds over the past year’. Do you say ‘do you realize how trivially gameable that metric is? how utterly sus and unimpressive? You could have just been holding something the first time, or taken a foot off the scale the second time. Nothing would be easier than to fake this. Does this bathroom scale even exist in the first place?’ Or, ‘my thermometer says I’m running a fever of 105F, I am dying, take me to the hospital right now’ - ‘you gullible fool, do you have any idea how easy that is to manipulate by dunking it in a mug of tea or something? sus. Get me some real evidence before I waste all that time driving you to the ER.’

gwern May 21, 2025, 2:04 AM
17 points
6
in reply to: Warty’s comment on: Warty’s Shortform
Good calibration is impressive and an interesting property because many prediction sources manage to not clear even that minimal bar (almost every human who has not undergone extensive calibration training, for example, regardless of how much domain expertise they have).

Further, you say one shouldn’t be impressed by those sources because they could be flipping a coin, but then you refuse to give any examples of ‘impressive’ sources which are doing just the coin-flip thing or an iota of evidence for this bold claim, or to say what they are unimpressive compared to.

gwern May 20, 2025, 4:18 PM
LW: 24 AF: 8
3
AF
in reply to: Daniel Kokotajlo’s comment on: Thomas Kwa’s Shortform

I think I would have predicted that Tesla self-driving would be the slowest

For graphs like these, it obviously isn’t important how the worst or mediocre competitors are doing, but the best one. It doesn’t matter who’s #5. Tesla self-driving is a longstanding, notorious failure. (And apparently is continuing to be a failure, as they continue to walk back the much-touted Cybertaxi launch, which keeps shrinking like a snowman in hell, now down to a few invited users in a heavily-mapped area with teleop.)

I’d be much more interested in Waymo numbers, as that is closer to SOTA, and they have been ramping up miles & cities.

gwern May 20, 2025, 2:25 AM
61 points
6
on: Semen and Semantics: Understanding Porn with Language Embeddings

The trends reflect the increasingly intense tastes of the highest spending, most engaged consumers.

https://logicmag.io/play/my-stepdad’s-huge-data-set/

While a lot of people (most likely you and everyone you know) are consumers of internet porn (i.e., they watch it but don’t pay for it), a tiny fraction of those people are customers. Customers pay for porn, typically by clicking an ad on a tube site, going to a specific content site (often owned by MindGeek), and entering their credit card information.

This “consumer” vs. “customer” division is key to understanding the use of data to perpetuate categories that seem peculiar to many people both inside and outside the industry. “We started partitioning this idea of consumers and customers a few years ago,” Adam Grayson, CFO of the legacy studio Evil Angel, told AVN. “It used to be a perfect one-to-one in our business, right? If somebody consumed your stuff, they paid for it. But now it’s probably 10,000 to one, or something.”

There’s an analogy to be made with US politics: political analysts refer to “what the people want,” when in fact a fraction of “the people” are registered voters, and of those, only a percentage show up and vote. Candidates often try to cater to that subset of “likely voters”— regardless of what the majority of the people want. In porn, it’s similar. You have the people (the consumers), the registered voters (the customers), and the actual people who vote (the customers who result in a conversion—a specific payment for a website subscription, a movie, or a scene). Porn companies, when trying to figure out what people want, focus on the customers who convert. It’s their tastes that set the tone for professionally produced content and the industry as a whole.

By 2018, we are now over a decade into the tube era. That means that most LA-area studios are getting their marching orders from out-of-town business people armed with up-to-the-minute customer data. Porn performers tend to roll their eyes at some of these orders, but they don’t have much choice. I have been on sets where performers crack up at some of the messages that are coming “from above,” particularly concerning a repetitive obsession with scenes of “family roleplay” (incest-themed material that uses words like “stepmother,” “stepfather,” and “stepdaughter”) or what the industry calls “IR” (which stands for “interracial” and invariably means a larger, dark-skinned black man and a smaller light-skinned white woman, playing up supposed taboos via dialogue and scenarios).

These particular “taboo” genres have existed since the early days of commercial American porn. For instance, see the stellar performance by black actor Johnnie Keyes as Marilyn Chambers’ orgy partner in 1972’s cinematic Behind the Green Door, or the VHS-era incest-focused sensation Taboo from 1980. But backed by online data of paid customers seemingly obsessed with these topics, the twenty-first-century porn industry—which this year, to much fanfare, was for the first time legally allowed to film performers born in this millennium—has seen a spike in titles devoted to these (frankly old-fashioned) fantasies.

Most performers take any jobs their agents send them out for. The competition is fierce—the ever-replenishing supply of wannabe performers far outweighs the demand for roles—and they don’t want to be seen as “difficult” (particularly the women). Most of the time, the actors don’t see the scripts or know any specific details until they get to set. To the actors rolling their eyes at yet another prompt to declaim, “But you’re my stepdad!” or, “Show me your big black dick,” the directors shrug, point at the emailed instructions and say, “That’s what they want…”

So my interpretation here is that it’s not that there’s suddenly a huge spike in people discovering they love incest in 2017 where they were clueless in 2016 or that they were all brainwashed to no longer enjoy vanilla that year, it’s that that is when the hidden oligopoly turned on various analytics and started deliberately targeting those fetishes as a fleet-wide business decision. And this was because they had so thoroughly commodified regular porn to a price point of $0, that the only paying customers that are left are the ones with extreme fetishes who cannot be supplied by regular amateur or pro supply.

They may or may not have increased in absolute number compared to pre-2017, but it doesn’t matter, because everyone else vanished, and their relative importance skyrocketed: “If somebody consumed your stuff, they paid for it. But now it’s probably 10,000 to one, or something.”

(For younger readers who may be confused by how a ratio like 10000:1 is even hypothetically possible because ‘where did that 10k come from when no one pays for porn?’, it’s worth recalling that renting porn videos used to be big business that would be done by a lot of men, and it kept many non-Blockbuster video rental stores afloat and it was an ordinary thing for your local store to have a ‘back room’ that the kiddies were strictly forbidden from, and while it would certainly stock a lot of fetish stuff like interracial porn, it also rented out tons of normal stuff. If you have no idea what this was like, you may enjoy reading “True Porn Clerk Stories”, Ali Davis 2002.)

I think there is a similar effect with foot fetishes & furries: they are oddly well-heeled and pay a ton of money for new stuff, because they are under-supplied and demand new ones. There is not much ‘organic’ supply of women photographing their feet in various lascivious ways; it’s not that it’s hard, they just don’t do it, but can be incentivized to do so. (I recall reading an article on Wikifoot where IIRC they interviewed a contributor who said he got some photos by simply politely emailing or DMing the woman to ask for her to take some foot photos, and she would oblige. “send foots kthnxbai” apparently works. And probably it’s fairly easy to pay for or commission feet images/videos: almost everyone has two feet already, and you can work feet into regular porn easily by simply choosing different angles or postures, and a closeup of a foot won’t turn off regular porn consumers either, so you can have your cake & eat it too. Similarly for incest: saying “But you’re my stepdad!” is cheap and easy and anyone can do it if the Powers That Be tell them to in case a few ‘customers’ will pay actual $$$ for it, while those ‘consumers’ not into that plot roll their eyes and ignore it as so much silly ‘porn movie plot’ framing as they get on with business.)

gwern May 20, 2025, 2:06 AM
12 points
4
in reply to: AnthonyC’s comment on: A widely shared AI productivity paper was retracted, is possibly fraudulent
I think aside from the general implausibility of the effect sizes and the claimed AI tech (GANs?) delivering those effect sizes across so many areas of materials, one of the odder claims which people highlighted at the time was that supposedly the best users got a lot more productivity enhancement than the worst ones. This is pretty unusual: usually low performers get a lot more out of AI assistance, for obvious reasons. And this lines up with what I see anecdotally for LLMs: until very recently, possibly, they were just a lot more useful for people not very good at writing or other stuff, than for people like me who are.

gwern May 19, 2025, 2:57 PM
5 points
0
on: October The First Is Too Late
I appreciate everyone’s comments here, they were very helpful. I’ve heavily revised the story to fix the issues with it, and hopefully it will be more satisfactory now.

gwern May 18, 2025, 1:10 AM
6 points
0
in reply to: Sheikh Abdur Raheem Ali’s comment on: LLM AGI will have memory, and memory changes alignment
I agree at this point: it is not per-user finetuning. The personalization has been prodded heavily, and it seems to boil down to a standard RAG interface plus a slightly interesting ‘summarization’ approach to try to describe the user in text (as opposed to a ‘user embedding’ or something else). I have not seen any signs of either lightweight or full finetuning, and several observations strongly cut against it: for example, users describe a ‘discrete’ behavior where the current GPT either knows something from another session, or it doesn’t, but it is never ‘in between’, and it only seems to draw on a few other sessions at any time; this points to RAG as the workhorse (the relevant other snippet either got retrieved or it didn’t), rather than any kind of finetuning where you would expect ‘fuzzy’ recall and signs of information leaking in from all recent sessions.

Perhaps for that reason, it has not made a big impact (at least once people got over the narcissistic rush of asking GPT for its summary of them, whether flatteringly sycophantic or not). It presumably is quietly helping behind the scenes, but I haven’t noticed any clear big benefits to it. (And there are some drawbacks.)

gwern May 17, 2025, 1:37 AM
11 points
1
in reply to: MichaelDickens’s comment on: StefanHex’s Shortform
Why can’t the mode-collapse just be from convergent evolution in terms of what the lowest-common-denominator rater will find funny? If there are only a few top candidates, then you’d expect a lot of overlap. And then there’s the very incestuous nature of LLM training these days: everyone is distilling and using LLM judges and publishing the same datasets to Hugging Face and training on them. That’s why you’ll ask Grok or Llama or DeepSeek-R1 a question and hear “As an AI model trained by OpenAI...”.

gwern May 15, 2025, 6:53 PM
10 points
6
in reply to: Selfmaker662’s comment on: Saul Munn’s Shortform
This is true of all teas. The decaf ones all are terrible. I spent a while trying them in the hopes of cutting down my caffeine consumption, but the taste compromise is severe. And I’d say that the black decaf teas were the best I tried, mostly because they tend to have much more flavor & flavorings, so there was more left over from the water or CO2 decaffeination...

gwern May 15, 2025, 2:24 PM
50 points
27
in reply to: Daphne_W’s comment on: How to Make Superbabies

I would expect you to already know that chimpanzees have an IQ lower than 60 and are capable of taking care of themselves and having a decent life

You are comparing apples and oranges in a bait-and-switch. No one knows that, nor should you expect them to, because it is blatantly false. Chimpanzees are not Curious George or noble savages: they are large animals. Chimpanzees are capable of ‘taking care of themselves and having a decent life’ only to the standards and the very narrow, limited context of a chimpanzee (in a zoo, or perhaps a forest, strictly protected by rangers from the rest of the world so they stop going extinct so fast). Those are not the standards and context of a 60 IQ human… unless you are suggesting, of course, that we lock up all such humans in a zoo or exile them to a park in central Africa where they will go about nude, eat raw food, have a large fraction of their children (if any) die, be devoid of anything recognizable as culture or most of the things we consider that make a human life worth living, and probably die themselves in a decade or two of preventable diseases or being murdered by a fellow primate (perhaps, as Frans de Waal memorably described one chimpanzee interaction, by having their testicles bitten off in a dominance struggle)? Not what I would consider a decent human life, personally.

In reality, if you put a chimpanzee in the human context of people and expect them to ‘take care of themselves’, they will not be able to, because they would be unemployed, unemployable, completely fail at basic standards of human life, starving, homeless, less able to be reasoned with or communicated with than someone in a schizophrenic psychotic break, and probably in jail or shot by police within the year after mauling or killing another human. This is why chimpanzee refuges have to put muzzles on chimps when interacting with humans, sedate them for flights to said refuges, and even chimpanzees raised from birth with humans and given every advantage we know how have a rather alarming rate of eating the faces of caregivers or just random strangers (a rate that would be even higher if more people were foolish enough to attempt such a thing or to persist after initial warning shots of face-eating behavior rather than dumping them on refuges equipped to handle such dangerous animals), eg Project Nim.

gwern May 13, 2025, 10:00 PM
26 points
0
on: AI Doomerism in 1879
Previous discussion: https://www.lesswrong.com/posts/goANJNFBZrgE9PFNk/shadows-of-the-coming-race-1879

gwern May 12, 2025, 6:43 PM
10 points
5
in reply to: dirk’s comment on: Extended Interview with Zhukeepa on Religion
He did do something, though: he allowed it, and hasn’t killed or rendered it useless, the way he has so many other things on Twitter, even though it regularly undercuts him (“167 of his posts have received a note since Community Notes began”). He has talked about neutering it every once in a while, but as far as I know, never has.

gwern May 11, 2025, 11:38 PM
2 points
0
in reply to: XelaP’s comment on: Statistical Prediction Rules Out-Perform Expert Human Judgments
Why wouldn’t it be? It’s just the same linear model but with a different link function. None of the various points about human underperformance or base rate neglect etc seems to change if it’s a binomial vs a negative binomial vs an over-dispersed Poisson vs a logistic vs a… It’s all just a generalized linear model.
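A minimal sketch of the "same linear model, different link function" point, in plain Python. The coefficients and inputs are hypothetical, and this only shows prediction; fitting and the distributional assumptions of each family are left out.

```python
import math


def linear_predictor(coefs, intercept, x):
    """The shared part: a weighted sum of the predictors."""
    return intercept + sum(c * xi for c, xi in zip(coefs, x))


# Inverse link functions map the linear predictor onto the outcome scale.
INVERSE_LINKS = {
    "identity": lambda eta: eta,                         # ordinary linear regression
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),   # logistic / binomial
    "log": lambda eta: math.exp(eta),                    # Poisson / negative binomial
}


def predict(link, coefs, intercept, x):
    return INVERSE_LINKS[link](linear_predictor(coefs, intercept, x))


coefs, intercept = [0.8, -0.5], 0.1   # hypothetical fitted weights
x = [1.0, 2.0]                        # hypothetical predictor values
for link in INVERSE_LINKS:
    print(link, round(predict(link, coefs, intercept, x), 3))
```

Every variant ranks cases by the same weighted sum of cues; only the final squashing step differs.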

gwern May 11, 2025, 2:20 AM
LW: 17 AF: 7
12
AF
in reply to: ryan_greenblatt’s comment on: Slow corporations as an intuition pump for AI R&D automation
Maybe it would be helpful to start using some toy models of DAGs/tech trees to get an idea of how wide/deep ratios affect the relevant speedups. It sounds like, so far, much of this is just people having warring intuitions about ‘no, the tree is deep and narrow and so slowing down/speeding up workers doesn’t have that much effect because Amdahl’s law so I handwave it at ~1x speed’ vs ‘no, I think it’s wide and there are lots of work-arounds to any slow node if you can pay for the compute to bypass them and I will handwave it at 5x speed’.
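As one possible starting point, here is a toy tech-tree model. Everything in it is an assumption chosen for illustration: each node costs some labor (which shrinks with worker speed) plus a fixed latency (which does not, eg. waiting on an experiment), and a node may list alternative prerequisite sets, any one of which unlocks it. The node names, costs, and depth are hypothetical.

```python
from functools import lru_cache


def finish_time(dag, goal, speed):
    """Earliest finish time of `goal`.

    `dag` maps node -> (labor, fixed_latency, alternatives). `alternatives`
    is a list of prerequisite tuples: OR across tuples, AND within a tuple.
    """
    @lru_cache(maxsize=None)
    def done(node):
        labor, fixed, alternatives = dag[node]
        start = min(
            (max((done(p) for p in prereqs), default=0.0) for prereqs in alternatives),
            default=0.0,
        )
        return start + labor / speed + fixed

    return done(goal)


def deep_narrow(depth):
    """A single chain: every step has labor 1 and an unavoidable wait of 1."""
    dag = {}
    for i in range(depth):
        dag[f"step{i}"] = (1.0, 1.0, [(f"step{i - 1}",)] if i else [])
    return dag, f"step{depth - 1}"


def with_workarounds(depth):
    """Same chain, but each step has a labor-heavy bypass with no wait."""
    dag = {}
    for i in range(depth):
        prev = [(f"slow{i - 1}",), (f"alt{i - 1}",)] if i else []
        dag[f"slow{i}"] = (1.0, 1.0, prev)   # cheap labor, fixed wait
        dag[f"alt{i}"] = (3.0, 0.0, prev)    # 3x the labor, no wait
    dag["goal"] = (0.0, 0.0, [(f"slow{depth - 1}",), (f"alt{depth - 1}",)])
    return dag, "goal"


def speedup(dag, goal, speed):
    return finish_time(dag, goal, 1.0) / finish_time(dag, goal, speed)


for name, build in [("deep and narrow", deep_narrow), ("with work-arounds", with_workarounds)]:
    dag, goal = build(depth=5)
    print(name, round(speedup(dag, goal, speed=5.0), 2))
```

With these made-up numbers, 5x faster workers give roughly 1.7x on the pure chain and roughly 3.3x once bypasses exist; the point is only that the disagreement can be turned into parameters one can argue about.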

gwern May 8, 2025, 9:38 PM
9 points
4
in reply to: Wei Dai’s comment on: faul_sname’s Shortform
What sample of Satoshi writings would you use that o3 wouldn’t already know was written by Satoshi Nakamoto?