gwern
I think it’s substantially better and calmer—even just the thumbnails look calmer now.
I still think you are going a bit overboard on visual complexity, things like slashed-zeros aren’t bad (I like them), just too much of a good thing and using up your visual complexity budget where there may be a better deal elsewhere: the question I ask myself is, “do I want to look at this for the next 20 years? If I add X, will it ‘look painfully 2025’ in a few years?” Elements which don’t seem excessive in the excitement of implementation may, with the passage of time, gradually come to rub you the wrong way and need to be sanded down or removed. If you aren’t careful, you may slowly, one accretion at a time, come to resemble a site like “Agora of Flancia”: encrusted with neat experimental features which are too half-baked to be genuinely useful, but which burden and deter readers. (I try to avoid a reluctance to ‘kill my darlings’ by cultivating the mindset that every addition is an experiment, and thus if I have to remove them, they’re not wasted, they’re just a new ‘Design Graveyard’ writeup, is all.) But I suppose you’ll find out your own preferences on this matter over the next few years as the novelty wears off.
I want to do something about the desktop logo animation being distracting. I don’t know what that is, yet. I can’t play/pause the GIF on hover because GIFs don’t allow that (AFAIK). I’ll probably find a way to move it to a WEBM while also making it autoplay across browsers, at which point I can implement the feature.
I also still think that the logo should probably not play by default, and for animations like this, it’s better to take an Apple-like attitude about them being enhancements, opted into by user actions, to ‘spark joy’, but not to be used by default. What do the worst websites do? They animate tons of stuff gratuitously. How much more delightful it is to discover a website with taste & restraint, where there are easter eggs and features to discover as you surf, where, say, the animated logo plays only when you hover over it… Truly an oasis or quiet little pond amidst the howling desert of the contemporary Internet. (I’m reminded of a Family Guy meme I re-ran into recently: why does Peter Griffin dislike The Godfather? Because “It insists upon itself.” A website animating the logo unasked for insists upon itself.) And this helps instill a design feature: you the reader are in control, and you express this control in part because you can hover over everything to learn more or focus on some things.
However, if you insist upon it, perhaps you could reduce its impact by some sort of limitation. Let it cycle a few times or seconds, and then slow it down or fade it or stop it. If the reader hasn’t appreciated it by then, why keep flickering it in the corner of their eye? Another idea would be a site-wide limitation on animation: on Gwern.net, we have a ‘demonstration mode’ feature which tracks how many times something has happened / been shown, and changes it (usually to disable it) after n times, tracking n site-wide using a counter in LocalStorage for that particular event. We use it to do things like simplify obtrusive text labels, or to disable the educational theme-toggle-bar animation after a few animations.
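The counting logic behind such a limiter is tiny; here is a minimal sketch in Python, with a plain dict standing in for the browser’s LocalStorage and a made-up limit of 3. (The real Gwern.net implementation is client-side JavaScript; this is just the shape of the idea.)

```python
# Minimal sketch of a 'demonstration mode' counter: each event gets a
# site-wide count, and the effect is disabled after it has fired `limit` times.
# A dict stands in for the real persistent LocalStorage.

store = {}

def should_demonstrate(event, limit=3):
    """Return True while the event has fired fewer than `limit` times."""
    count = store.get(event, 0)
    store[event] = count + 1
    return count < limit

# A logo animation would play on the first 3 page loads, then stay still:
plays = [should_demonstrate("logo-animation") for _ in range(5)]
# plays == [True, True, True, False, False]
```

The only subtlety is that incrementing happens on every check, so the counter doubles as an analytics tally of how often the reader encountered the feature.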
Also, have you considered animating the dark-mode setting to move the sun/star around? The sun over the pond to the right, then the moon to behind the castle and swap the palette for a ‘night’ palette. (The ‘night’ label can be omitted, typeset vertical, or you can do it by horizontal mirroring of the whole image.) If you don’t do something like that, that would be a real shame. The perfect twist on dark-mode for The Pond.
like the ability to listen to AI-generated readings in my voice (for the less math-y articles)
This sounds like it would make for a better project & writeup than actual long-term website feature intended for actual reader use. Is this something readers want or will use? It never once occurred to me that I might want a spoken version of your articles (human, AI sexy-Scarlett-Johansson, or AI-Turntrout—ok maybe Scarlett). These aren’t poems or autobiographical essays or even fiction, where there can be value to hearing the author read it. (Although even there most of the reasons to want that are ones where an AI can’t substitute: if you want to hear the author’s unique intonations or pacing for subtle insights into the text or the author*, obviously an AI model voicing it is useless.)
On a sidenote, ahem, you might want to rethink how the popups work. I was unaware you even had popups, until I noticed in the network tab that a lot of pages were being prefetched when I hovered over them. It might be an oversight to prefetch if you aren’t doing a popup. But regardless, since I’m on a fullscreen 4k portrait monitor, and I have to zoom out before I can see popups at all, you may have gone overboard in your width requirements.
* when I questioned the value of Dwarkesh Patel et al investing in making a spoken version of our interview, this was given as one of the reasons that people valued audio over the transcript. We split the difference by the voice actor, Chris Painter, having the audio and performing it. It’s uncanny for me to listen to.
It’s about grief, with central metaphors that add exactly zero to anyone’s aesthetic understanding of grief (stuff being underground, things not staying buried)
It is about grief, but it didn’t have to be. This would’ve been more obvious if I could’ve shown you the session, but I’ll copy it out:
2. Brainstorming Ideas:
A child confronting a local superstition after witnessing something traumatic.
A funeral narrated by an animal’s perspective.
A celebrity’s fall from grace caught on live camera.
A girl who collects superstitions until one unexpectedly comes true.
A gardener who secretly buries cursed objects.
Friends reuniting silently after a crime.
A model who steals other people’s identities through makeup.
A ghost attending its own funeral, confused.
Farmers discovering an ominous black stone.
Neighbors noticing unsettling coincidences.
Someone who compulsively invents last messages for strangers.
A woman discovering her double living her unlived life.
An eerie superstition circulating quietly through a town.
Villagers silently celebrating the death of a tyrant.
Two women performing friendship publicly, enemies privately.
Someone obsessively organizing goodbyes.
A funeral revealing hidden grudges through seating arrangements.
Friends trapped in a toxic nostalgia loop.
A boy who accidentally predicts deaths.
An abandoned motel that traps memories.
3. Selection
Idea: A gardener who secretly buries cursed objects.
Rationale:
Symbolic and psychological depth.
Opportunity for visual and thematic resonance.
Tension around secrecy, superstition, and consequences.
Now that I look at it, the 5 flash stories I happened to copy in seem to mostly hit on themes of death & grief (the last one is literally titled “obit”!), so I think you are unfairly dinging 4.5 here—if ‘good’ flash fiction, by your own standards, keeps revolving around death and grief, how can we blame 4.5 (or r1) for doing likewise?
Anyway, some of these ideas seem promising. “A funeral narrated by an animal’s perspective” is one I like; I don’t think I’ve ever seen that.
And of course, if the failure mode is so common, throw it into the prompt. (When I yell at 4.5 to avoid grief/death/funerals and brainstorm some more, it picks out ‘”The Parking Attendant Matchmaker”: A seemingly ordinary parking attendant quietly manipulates parking assignments at a large business complex to engineer chance encounters and romances among strangers.’ Yeah sure why not.)
Like, what does it possibly mean for mourners to “trust my silence” here. What is it they’re trusting? How does the earth’s hunger contrast to that?
Balderdash. There’s a lot to criticize here, but you’re straining to come up with criticisms now. That’s possibly the least objectionable sentence in the whole thing. If this had been written by a human, you wouldn’t hesitate in the slightest to accept that. It is perfectly sensible to speak of trusting the confidentiality of a confessor/witness figure, and the hungry earth is a cliche so straightforward and obvious that it is beyond cliche and loops around to ordinary fact, and if a human had written it, you would have no trouble in understanding the idea of ‘even if I were to gossip about what I saw, the earth would have hidden or destroyed the physical evidence’.
I also see it a lot in the ClaudePlaysPokemon twitch chat, this idea that simply adding greater situational awareness or more layers of metacognition would make Claude way better at the game.
I do agree that the Claude-Pokemon experiment shows a limitation of LLMs that isn’t fixed easily by simply a bit more metadata or fancier retrieval. (I think it shows, specifically, the serious flaws in relying on frozen weights and refusing to admit neuroplasticity is a thing - a refusal which violates RL scaling laws, because those always assume that the model is, y’know, learning as it gains more experience, because who would be dumb enough to deploy frozen models in tasks far exceeding their context window and where they also aren’t trained at all? - and why we need things like dynamic evaluation. I should probably write a comment on that—the pathologies like the deliberate-fainting are, I think, really striking demonstrations of the problems with powerful but frozen amnesiac agents.)
I’m much less convinced that we’re seeing anything like that with LLMs writing fiction. What is the equivalent of the Claude pathologies, like the fainting delusion, in fiction writing? (There used to be ‘write a non-rhyming poem’ but that seems solved at this point.) Especially if you look at the research on people rating LLM outputs, or LMsys; if they are being trained on lousy preference data, and this is why they are like they are, that’s very different from somehow being completely incapable of “extracting the actual latent features of good flash fiction”. (What would such a latent feature look like? Do you really think that there’s some property of flash fiction like “has a twist ending” that you can put two flash stories into 4.5 or o1-pro, with & without, and ask it to classify which is which and it’ll perform at chance? Sounds unlikely to me, but I’d be interested to see some examples.)
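That pairwise test would be cheap to actually run; here is a sketch of the harness, where `probe_prompt` would be sent to the model and `accuracy` scored over many labeled story-pairs. Every name here is a hypothetical stand-in, not any existing eval:

```python
# Sketch of a probe for whether a model has a 'latent feature' of flash
# fiction: show it two stories, one with the property (e.g. a twist ending),
# and ask it to say which is which. Chance-level accuracy over many pairs
# would support the 'missing latent feature' claim; I expect well above chance.

def probe_prompt(story_a, story_b, feature="a twist ending"):
    return (f"One of these two flash-fiction stories has {feature}; "
            f"the other does not. Answer 'A' or 'B'.\n\n"
            f"Story A:\n{story_a}\n\nStory B:\n{story_b}")

def accuracy(answers, labels):
    """Fraction of pairs the model labeled correctly."""
    return sum(a == l for a, l in zip(answers, labels)) / len(labels)
```

With a few dozen pairs per candidate feature, you could tabulate exactly which properties of good flash fiction a model can and cannot discriminate.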
I think it’s something of a trend relating to a mix of ‘tools for thought’ and imitation of some websites (LW2, Read The Sequences, Asterisk, Works in Progress & Gwern.net in particular), and also a STEM meta-trend arriving in this area: you saw this in security vulnerabilities where for a while every major vuln would get its own standalone domain + single-page website + logo + short catchy name (eg. Shellshock, Heartbleed). It is good marketing which helps you stand out in a crowded ever-shorter-attention-span world.
I also think part of it is that it reflects a continued decline of PDFs as the preferred ‘serious’ document format due to preferring Internet-native things with mobile support. (Adobe has, in theory, been working on ‘reflowable’ PDFs and other fixes, but I’ve seen little evidence of that anywhere.)
Most of these things would have once been released as giant doorstop whitepaper-book PDFs. (And you can see that some things do poorly because they exist only as PDFs—the annual Stanford AI report would probably be much more read if they had a better HTML story. AFAIK it exists only as giant PDFs everyone intends to read but never gets around to reading, and so everyone only sees a few graphs copied out of it and put in media articles or social media squibs.) Situational Awareness, for example, a few years ago would’ve definitely been a PDF of some sort. But PDFs suck on mobile, and now everyone is on mobile.
If you release something as a PDF rather than a semi-competent responsive website which is readable on mobile without opening a separate app & rotating my phone & constantly thumbing up & down a two-column layout designed when cellphones required a car to be attached to, you cut your readership at least in half. I wish I didn’t have to support mobile or dark-mode, but I can see in my analytics that it’s at least half my readers, and I notice that almost every time someone screenshots Gwern.net on social media, it is from the mobile version (and as often as not, the dark-mode too). Nor are these trash readers—many of them are elite readers, especially of the sort who are creating virality or referencing it or creating downstream readers in various ways. (Ivanka Trump was tweeting SA; do you think she and everyone else connected to the Trump Administration are sitting down at their desktop PC and getting in a few hours of solid in-depth reading? Probably not...) People will even exclusively use the Arxiv HTML versions of papers, despite the fact that the LaTeX->HTML pipeline has huge problems like routinely silently deleting large fractions of papers (so many problems I gave up a while ago filing bug reports on it).
Having a specialized website can be a PITA in the long run, of course, but if you design it right, it should be largely fire-and-forget, and in any case, in many of these releases (policy advocacy, security vulns), the long run is not important.
(I don’t think reasoning/coding models have yet had too much to do with this trend, as they tend to either be off-the-shelf or completely bespoke. They are not what I would consider ‘high-effort’: the difference between something like SA and Gwern.net is truly vast; the former is actually quite simple and any ‘fancy’ appearance is more just its clean minimalist design and avoiding web clutter. At best, as tireless patient superhumanly knowledgeable consultants, LLMs might remove some friction and enable people unsure if they can make a whole website on their own, and thus cause a few more at the margin. But many of these predate coding LLMs entirely and I’m fairly sure Leopold didn’t need much, if any, LLM assistance to do the SA website, as he is a bright guy good at coding and the website is simple.)
I read the ‘stars’ as simply very dense low-orbiting satellites monitoring the ground 24⁄7 for baseline humans to beam low-latency optical propaganda at. The implied King’s Pact presumably is something like, “the terrestrial Earth will be left unmodified and no AI are allowed to directly communicate or interact with or attempt to manipulate baseline humans”, and so satellites, being one-way broadcasts outside the Earth, don’t violate it. This then allows the bootstrap of all the other attacks: someone looks up at night long enough, they get captured, start executing the program. But because it’s all one-way and ‘blind’, the attacks have to be blackbox, like evolutionary algorithms, and work poorly and inefficiently, and with little feedback. (If a glitcher doesn’t work, but can only attract other animals rather than humans, where did your attack go wrong? How hard are you, bound by the King’s Pact, even allowed to think about your attack?) The soft-glitchers are a bypass, a mesa-optimizer: you load the minimal possible mesa-optimizer (which as we know from demo scene or hacking can be relatively few bytes), an interest in glitchers, which exploits the native human intelligence to try to figure out an interpreter for the powerful but non-human-native (for lack of feedback or direct access to humans to test on) programs in the hard-glitchers. Once successful (ie. once they figure out what some ill-chosen gestures or noises were actually supposed to mean, fixing the remaining errors in the attack), they can then successfully interpret and run the full attack program. (Which might include communication back to the AI attackers and downloading refined attacks etc.)
Ice-nucleating bacteria: https://www.nature.com/articles/ismej2017124 https://www.sciencefocus.com/planet-earth/bacteria-controls-the-weather
If you can secrete the right things, you can potentially cause rain/snow inside clouds. You can see why that might be useful to bacteria swept up into the air: the air may be a fine place to go temporarily, and a way to get somewhere, but like a balloon or airplane, you do want to come down safely at some point, usually somewhere else, and preferably before the passengers have begun to resort to cannibalism. So given that even bacteriophage viruses are capable of surprisingly sophisticated community-wide decisions about when to kill their bacteria hosts and find greener pastures, and that bacteria communities can do similar calculations about dispersal or biofilm formation, it would not be too surprising if bacteria in a cloud storm might be computing things like timers or counting the average rate of organic matter floating upwards, to decide when to ‘try to land’ by everyone secreting special ice-nucleating molecules in the hopes of triggering the storm that will deliver them safely to the foreign ground, rather than waiting passively for a random storm which might put them down too late or somewhere bad.
No, it is on the ChatGPT end. I was surprised since I can’t recall ever seeing that before. The usual share-button pops up the share box, but with the red-background message
This shared link has been disabled by moderation.
I don’t know if it’s perhaps the copyrighted stories (given the Bing search engine integration, entirely possible for these December stories to show up and be flagged) or some of the content, haven’t cared enough to try to ablate it because the exact text of the session isn’t terribly important here IMO—you see the prompt, you see the final result, you get the idea.
I generally agree that r1’s fiction is not that great and tends to a simple-minded ‘edgelord’ vibe with lots of portentous phrases that fall apart on genuine reading, but I feel like you didn’t give Deepseek-r1 a fair shot at all here. You don’t describe your prompt but I’m guessing it was something very simple like “write a flash fiction story of at least 500 words”. No description of goals, no requirements, no planning, no editing or revision… no human writes the way you expect the LLM to. Especially given that this is for short fiction, a much more reasonable approach would be to include some examples, ask it to think about what a good ‘flash fiction’ is, list out more than one idea, pick the best one, write it out, and revise it a few times. This would be very easy and scarcely take you any time at all.
I whipped up a very quick example in GPT-4.5, which unfortunately ‘moderation’ somehow forbids me from sharing, but my initial prompt went like this:
Describe microfiction or flash fiction. What is it like? What makes it good? What are writers aiming for? What are their favorite gimmicks or twists or tropes? After thinking about flash fiction, read the following stories for inspiration. Then list 20 ideas for a new flash fiction story. Pick the best idea, outline a detailed description, and then write it. Once you have written a new story, revise it again. Then go line by line: quote the line, critique it, and editing it to make it better. Try to make each line memorable, stylish, and detailed; ‘show, don’t tell’; cut anything which isn’t carrying its weight.
“These Things We Not Saying”, by Banchiwosen Woldeyesus, December 17, 2024...
...
“Obit”, by Mitch James, December 16, 2024...
Reminder: summarize the genre and useful things to know, brainstorm ideas, pick the best one, outline, write, revise, and edit it until it’s great.
(Probably would’ve benefited from putting a bunch more stories in but oh well.)
And then the final result was
Final Cleaned-up Version: “Under the Soil”
Every spring, townsfolk leave their haunted keepsakes at my gate. A fisherman brings pearls pried from his drowned wife’s throat. A teacher leaves spectacles that have witnessed too much. Twins, hand-in-hand, offer scorched clothing wrapped neatly in twine.
They trust my silence, or maybe just the earth’s endless hunger.
I bury their ghosts deep within my garden. Marigolds cover sins of pride; lilies conceal unmourned losses. In return, the garden blooms violently—petals painfully bright, vines gripping fiercely over buried truths. People marvel, grateful not to look too closely.
Recently, the earth has begun returning these offerings. Last week, spectacles surfaced unbroken among snapdragons; yesterday, charred fabric rose like flags among chrysanthemums. Dreams filled with whispered anguish; sleep collapsed beneath their weight.
Today, frantic, I tore earth bare until nails split among tangled roots. And there it was—pristine after decades—a wooden box carved with my initials. Inside was no betrayal but something crueler—a single child’s tooth wrapped gently in faded silk.
My throat tightened around grief I’d thought buried. This secret had grown too large for silence. Roots twisted, petals curled into fists; the earth trembled beneath me.
The garden, I understood now, was never forgiving—it was patient.
And now, unburdened, it waited for my confession.
Still some of the ChatGPT tics like the Hallmark ending (it’d be nice to try this with the new creative-writing model Sam Altman has teased), but I’m guessing this is better than however your DS-r1 flash fiction went.
(This is of course just the very first stab I took at it. You’d really want to few-shot a lot more stories, include more scaffolding to induce stylistic changes and imitate authors and imagine new contexts, iterate longer, and then select the best out of 100 such sessions—and only then have you started to approximate a fair comparison of a LLM-generated story with a magazine curated selection of human stories from probably hundreds or thousands of submissions by authors who have themselves written and extensively revised many… Expecting a novel, exciting, stylistically-amazing story out of a LLM with no forethought on the simplest prompt possible is to expect extremely superhuman fiction writing capability.)
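For concreteness, the ‘select the best out of 100 such sessions’ step is just best-of-n sampling. A sketch, with `run_session` and `score_story` as hypothetical stand-ins for the full scaffolded LLM session and some critic or rating model:

```python
# Best-of-n selection over independent scaffolded writing sessions:
# run the whole brainstorm -> outline -> write -> revise pipeline n times,
# score each final story, keep the winner. The session and scoring functions
# are stand-ins; in a real setup they would each be LLM calls.

def best_of_n(run_session, score_story, n=100):
    stories = [run_session(seed) for seed in range(n)]
    return max(stories, key=score_story)

# Toy demonstration with fake stand-ins (a real critic replaces fake_score):
fake_session = lambda seed: f"story-{seed}"
fake_score = lambda s: -abs(int(s.split('-')[1]) - 42)  # pretend #42 is best
best = best_of_n(fake_session, fake_score, n=100)
# best == "story-42"
```

In practice `score_story` is the hard part; a human judge or a strong LLM critic would fill that role, and only then does the comparison against a magazine’s curated slush-pile survivors start to be fair.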
The diminishing returns aren’t too surprising, because you are holding the model size fixed (whatever that is for Houdini 3), and the search sigmoids hard. Hence, diminishing returns as you jump well past the initial few searches with the largest gains, to large search budgets like 2k vs 4k (and higher).
This is not necessarily related to ‘approaching perfection’, because you can see the sigmoid of the search budget even with weak models very far from the known oracle performance (as well as stronger models); for example, NNs playing Hex: https://arxiv.org/pdf/2104.03113#page=5 Since it’s a sigmoid, at a certain point, your returns will steeply diminish and indeed start to look like a flat line and a mere 2x increase in search budget does little. This is why you cannot simply replace larger models with small models that you search the hell out of: because you hit that sigmoid where improvement basically stops happening.
At that point, you need a smarter model, which can make intrinsically better choices about where to explore, and isn’t trapped dumping endless searches into its own blind spots & errors. (At least, that’s how I think of it qualitatively: the sigmoiding happens because of ‘unknown unknowns’, where the model can’t see a key error it made somewhere along the way, and so almost all searches increasingly explore dead branches that a better model would’ve discarded immediately in favor of the true branch. Maybe you can think of very large search budgets applied to a weak model as the weak model ‘approaching perfection… of its errors’? In the spirit of the old Dijkstra quip, ‘a mistake carried through to perfection’. Remember, no matter how deeply you search, your opponent still gets to choose his move, and you don’t; and what you predict may not be what he will select.)
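To illustrate the qualitative claim, here is a toy model where playing strength rises logistically in log2 of the search budget; the constants are purely illustrative, not fit to any engine or the Hex paper:

```python
import math

# Toy model: strength is a sigmoid in log2(simulations), so each doubling
# of search buys less than the one before, and far out on the curve a 2x
# budget increase does almost nothing. midpoint/scale are made-up constants.

def strength(simulations, midpoint=6.0, scale=1.5):
    return 1 / (1 + math.exp(-(math.log2(simulations) - midpoint) / scale))

# Gain from each successive doubling of the budget, 2^1 up to 2^15:
gains = [strength(2 ** (k + 1)) - strength(2 ** k) for k in range(1, 15)]
# Doublings near the midpoint gain a lot; 8k -> 16k gains almost nothing.
```

The early doublings buy large gains; by 8k vs 16k simulations another 2x is nearly invisible, which is the flat-line regime where only a smarter model, not more search, helps.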
Fortunately, ‘when making an axe handle with an axe, the model is indeed near at hand’, and a weak model which has been ‘policy-improved’ by search is, for that one datapoint, equivalent to a somewhat larger better model—if only you can figure out how to keep that improvement around...
Well, if you want to try to use video game playing as a measure of anything, it’s worth noting that his preferences have, fairly recently, shifted from strategy games (the original Civilization when younger, but even as of 2020--2021, he was still playing primarily strategy games AFAICT from Isaacson & other media coverage—specifically, being obsessed with Polytopia) to twitch-fests like Elden Ring or Path of Exile 2… and most recently, he’s infamously started cheating on those too.
Could just be aging or lack of time, of course.
colonizing Himalayan mountain slopes
One way to tell that you’re at the edge of viability for actual living at this point, as opposed to simply passing through or enduring it until better conditions arise, is that Antarctic mountain slopes appear to be completely sterile and free of microbes:
We analyzed 204 ice-free soils collected from across a remote valley in the Transantarctic Mountains (84–85°S, 174–177°W) and were able to identify a potential limit of microbial habitability. While most of the soils we tested contained diverse microbial communities, with fungi being particularly ubiquitous, microbes could not be detected in many of the driest, higher elevation soils—results that were confirmed using cultivation-dependent, cultivation-independent, and metabolic assays. While we cannot confirm that this subset of soils is completely sterile and devoid of microbial life, our results suggest that microbial life is severely restricted in the coldest, driest, and saltiest Antarctic soils. Constant exposure to these conditions for thousands of years has limited microbial communities so that their presence and activity is below detectable limits using a variety of standard methods.
Presumably if you brought microbes there, they would be able to endure for a while (and given aerial dispersal, they must be arriving constantly). But apparently no meaningful form of sustainable life at the microbial scale or higher is possible. (Similar to bacteria, spores, or tardigrades being able to survive exposure to space, and thus hitchhike to other planets or cause panspermia—while being unable to actually grow, reproduce, or even just sustain a constant population in space.) Air might be similar: not so much because of the horrible salts in air, as its general lack of moisture and, well, everything else too.
So air can be a great medium for dispersal (as it is for even larger organisms like spiders), and there’s evidence about bacteria manipulating weather for this purpose, but it’s no place to live for biological life as we know it.
(Which of course says little about mechanical life: they don’t necessarily need any water, they can engage in complex logistics to move around atoms that they need like piping feedstocks up into the sky, they can create structures which bacteria would be utterly unable to like kites supporting solar panels or mirrors, they can use it bacteria-style for covert dispersal and do heavy industry on the ground, etc. They aren’t selfish little replicators which must evolve tiny fitness-incrementing step by step from little blobs of self-bootstrapping organic goo solely to maximize reproductive fitness under constraints of heavy predation & defection, among other limitations.)
The radiator story might be real, apparently. I was reading a random review of an Astounding issue (November 1944) and was surprised to see this part:
“Time for a Universe” by R. S. Richardson looks at how the age of the universe has been calculated by various means (expansion of the universe, uranium clock, dynamics of clusters, and statistics of binaries) and the differences in the results.
There is also a good anecdote about the necessity of being cautious about data:
There is a story told about Robert lvirchoff [presumably Gustav Kirchhoff, with whom Bunsen worked], the physicist, and Wilhelm von Bunsen, inventor of the Bunsen burner, that is worth repeating. The two were strolling across the campus of the University of Heidelberg one sunny afternoon deep in conversation upon some abstruse subject. As they passed a silver-coated globe set on the lawn as an ornament Bunsen absent-mindedly ran his fingers over the reflecting surface. To his amazement the side exposed directly to the sun was cooler than the side in shadow.
Immediately the two stopped and began excitedly to investigate this anomalous heating effect. Here perhaps was a new phenomenon in heat conduction involving some mysterious interaction between solar radiation and the reflecting properties of silver. While they were busy devising a theory to account for it the school janitor came by and reversed the position of the globe.
“I have to keep turning it around every once in a while on these hot days,” he remarked.
The story is pretty good except that it seems doubtful whether a reflecting surface as good as silver would heat up so seriously. p. 107
Some quick googling doesn’t turn up any variants of the story, although WP does note of Bunsen that “Despite his lack of pretension, Bunsen was a vivid “chemical character”, had a well-developed sense of humour, and is the subject of many amusing anecdotes.”, and the cited paper, Jensen 2013, goes so far as to dub the genre of anecdotes as “Bunseniana”, and describes many published sources in German, so it is entirely possible that all the sources are in German rather than English.
Unfortunately, while entertaining, and noting that Oesper had personal acquaintance with Bunsen students, Jensen 2013 doesn’t contain any story like the sphere, and quickly checking 1 Oesper paper which should cover Bunsen’s Heidelberg period, it doesn’t either.
This doesn’t imply that R. S. Richardson is telling tales; in the 1940s, it was still usually a requirement for chemistry majors, among others, to learn either French or German, due to the dominance of those in STEM, before WWII and decades of American growth resulted in that requirement being discarded. So he could easily have learned it in German directly, or heard the story from someone who did. Oesper also apparently did a lot of research & publishing about chemist biographies, most of which (like his major book The Human Side of Scientists) is not easy to access right now, so it could easily be somewhere in there too. I don’t care enough to look into it further, but maybe someone else will.
but I expect that the RLHFed models would try to play the moves which maximize their chances of winning
RLHF doesn’t maximize probability of winning, it maximizes a mix of token-level predictive loss (since that is usually added as a loss either directly or implicitly by the K-L) and rater approval, and god knows what else goes on these days in the ‘post-training’ phase muddying the waters further. Not at all the same thing. (Same way that a RLHF model might not optimize for correctness, and instead be sycophantic. “Yes master, it is just as you say!”) It’s not at all obvious to me that RLHF should be expected to make the LLMs play their hardest (a rater might focus on punishing illegal moves, or rewarding good-but-not-better-than-me moves), or that the post-training would affect it much at all: how many chess games are really going into the RLHF or post-training, anyway? (As opposed to the pretraining PGNs.) It’s hardly an important or valuable task.
Given the Superalignment paper describes being trained on PGNs directly, and doesn’t mention any kind of ‘chat’ reformatting or encoding metadata schemes, you could also try writing your games quite directly as PGNs. (And you could see if prompt programming works, since PGNs don’t come with Elo metadata but are so small a lot of them should fit in the GPT-4.5 context window of ~100k: does conditioning on finished game with grandmaster-or-better players lead to better gameplay?)
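A sketch of what that direct-PGN prompting might look like. The tag names (`Event`, `WhiteElo`, `BlackElo`) are standard PGN headers; the event name and Elo values are just the hypothetical conditioning knobs:

```python
# Conditioning on strong players via PGN headers: since PGNs are tiny,
# a prompt can present the current game as the continuation of a game
# between high-Elo players, and the model is asked to continue the movetext.

def pgn_prompt(moves, white_elo=2750, black_elo=2750):
    headers = (f'[Event "Candidates Tournament"]\n'   # hypothetical event tag
               f'[WhiteElo "{white_elo}"]\n'
               f'[BlackElo "{black_elo}"]\n')
    movetext = " ".join(f"{i}. {w} {b}"
                        for i, (w, b) in enumerate(moves, start=1))
    return headers + "\n" + movetext

prompt = pgn_prompt([("e4", "e5"), ("Nf3", "Nc6")])
# The model is then asked to continue: "3. ..."
```

Comparing play with the Elo tags set high versus low (or omitted) would directly answer the prompt-programming question.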
I agree: if you’ve ever played any of the Pokemon games, it’s clear that a true uniform distribution over actions would not finish any time that a human could ever observe it, and the time would have to be galactic. There are just way too many bottlenecks and long trajectories and reset points, including various ways to near-guarantee (or guarantee?) failure like discarding items or Pokemon, and if you’ve looked at any Pokemon AI projects or even just Twitch Plays Pokemon, this becomes apparent—they struggle to get out of Pallet Town in a reasonable time, never mind the absurdity of playing through the rest of the game and beating the Elite Four etc, and that’s with much smarter move selection than pure random.
I think this implies that we collectively have no more than about 50 * 8 = 400 billion bits per second of control over the world.
I don’t know how to think about this statement. Should I find this a ‘small’ number or unimpressive in some respect?
This allows for each second, picking a possibility out of ‘no more than about’ 2^400,000,000,000 high-level possibilities, which is a possibility-space so large I don’t think I can write it out in decimal without crashing LW2 or hitting size limits. (GHCi tries to evaluate it but I killed it after a bit when the RAM consumption started to worry me.) Even at the individual level, this implies that in, say, 1 year, I get to pick a possibility out of 2^(365.25 * 24 * 60 * 60 = ~31,600,000) outcomes. (Which at least GHCi can evaluate and print out as a decimal reasonably well, it just takes patience and what looks like thousands of screens of output.)
Are these ‘small’ amounts of control to have? Is there some important task for which this level of control is clearly inadequate, or at what amount of control would one consider it a ‘large’ amount of control?
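One way to get a grip on the sizes involved is to count decimal digits via logarithms rather than materializing the integers (which is exactly what makes the naive GHCi evaluation eat RAM or print thousands of screens); a quick sketch:

```python
import math

# Number of decimal digits of 2**n, computed without ever constructing 2**n.
def decimal_digits_of_pow2(n: int) -> int:
    return math.floor(n * math.log10(2)) + 1

per_second = decimal_digits_of_pow2(400_000_000_000)  # collective, per second: ~1.2e11 digits
per_year   = decimal_digits_of_pow2(31_557_600)       # one year of seconds: ~9.5 million digits
```

So the per-second collective number has on the order of 120 billion decimal digits, which is why no forum software will render it, while the per-year individual number, at ~9.5 million digits, is merely a few thousand screenfuls.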
OP didn’t say secret, it just said ‘many facts’. I took it as a reference to the super-knowledgeability of LLMs: similar to how SAT vocab tests work because it is difficult for dumb people to know so many words that a random selection of rare words will include the few they happen to know. The Internet is stuffed with archaic memes, jokes, web pages, images, and writings that possibly no human alive could recognize, much less quote… but LLMs, trained on trillions of words scraped indiscriminately from every corner of the Internet accessible to crawlers, and capable of memorizing text after a single exposure, could contain many such things. (Imagine all those old Blogger blogs started in 2003 and read by the author and 5 friends, abandoned after a year, where even the author has forgotten them or is dead. One could connect the dots to obituaries, which are often posted online.) Quote 1 or 2 such things, and given how often LLMs are running, and how few humans could or would know such things and then write them, and you achieve ~100% posterior confidence the author was a LLM and not a human.
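The implicit inference is just a Bayes update with an enormous likelihood ratio; a toy sketch (every probability here is made up purely for illustration):

```python
# If almost no living human could quote some obscure abandoned 2003 blog, but
# an indiscriminately-trained LLM plausibly could, then even one or two such
# quotes swamp any reasonable prior about the author.
def posterior_llm(prior_llm: float, p_quote_if_llm: float, p_quote_if_human: float) -> float:
    num = p_quote_if_llm * prior_llm
    den = num + p_quote_if_human * (1.0 - prior_llm)
    return num / den

# Modest prior, but a ~10,000:1 likelihood ratio on the observed quote:
p = posterior_llm(prior_llm=0.1, p_quote_if_llm=0.01, p_quote_if_human=0.000001)
# p is already ~0.999; a second independent quote pushes it to ~1.
```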
Sounds somewhat like a bucket brigade market economy.
I didn’t mean Marcus had said anything about Sabine. What I meant by “whose expertise has little to do with AI (nor is regarded as such like a Gary Marcus)” is that ‘a Gary Marcus’ is ‘regarded as’ having ‘expertise [much] to do with AI’ and that is why, even though Marcus has been wrong about pretty much everything and has very little genuine expertise about AI these days, ie. DL scaling (and is remarkably inept at even the most basic entry-level use of LLMs) and his writings are intrinsically not worth the time it takes to read them, he is still popular and widely-regarded-as-an-expert and so it is useful to keep tabs on ‘oh great, what’s Marcus saying now that everyone is going to repeat for years to come?’ You can read someone because they are right & informative, or you can read someone because they are wrong & uninformative but everyone else reads them; but you shouldn’t read someone who is neither right nor read. So, you grit your teeth and wade into the Marcus posts that go viral...
I have misgivings about the text-fragment feature as currently implemented. It is at least now a standard and Firefox implements reading text-fragment URLs (just doesn’t conveniently allow creation without a plugin or something), which was my biggest objection before; but there are still limitations to it which show that a lot of what the text-fragment ‘solution’ is, is a solution to the self-inflicted problems of many websites being too lazy to provide useful anchor IDs anywhere in the page. (I don’t know how often I go to link a section of a blog post, where the post is written in a completely standard hierarchical table-of-contents way, and the headers turn out to be… nothing but <h2>s with not an id= anywhere in sight.) We would be a lot better off if pages had more meaningful IDs and selecting text did something like pick the nearest preceding ID. (This could be implemented in LW2 or Gwern.net right now, incidentally. If the user selects some text, just search through the tree to find the first previous ID, and update the current browser-bar URL to URL#ID.)
Hacking IDs onto an unwilling page, whose author neither knows nor cares nor can even find out what IDs are in use (or what they may be breaking by editing this or that word), is a recipe for long-term breakage: your archive.is example works simply because archive.is is an archive website, and the pages, in theory, never change (even though the original URLs certainly can, and often quite dramatically). That’s less true for LW comments or articles. There are also downstream effects: text-fragments are long and verbose and can’t be written by hand because they’re trying to specify arbitrary ranges which are robust to corruption, and they are unwieldy to search. (How does a tool handle different hash-anchors in a URL? Most choose to define them as unique URLs different from each other… so what happens when two users selecting from the same section inevitably wind up selecting slightly different text ranges every time, and every user has a unique text-fragment anchor? Now suddenly every URL is unique—no more useful backlinks, no more consolidated discussions of the same URL, etc. And if the URL content changes, you don’t get anything out of it. It’s now just a bunch of trailing junk causing problems forever, like all that ?utm_foo_bar junk.)
Somewhat like the fad for abusing # for the stupid #! JS thing (which pretty much everyone, Twitter included, came to regret), I worry that this is still a half-baked tech designed for a very narrow use case (Google’s convenience in providing search results) where we don’t know how well it will work in the wild long-term or what side-effects it will have. So I personally have been holding off on it and making a point of deleting those archive.is anchors.
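The nearest-preceding-ID lookup itself is trivial; in a browser it would walk the DOM backwards from the selection, but the core logic can be sketched abstractly (this is illustrative pseudo-implementation, not actual LW2 or Gwern.net code; the page is modeled as ID anchors at document-order offsets):

```python
import bisect

# Given the page's ID anchors as sorted (offset, id) pairs and the character
# offset where the user's selection starts, return the last ID at or before
# the selection; the browser-bar URL would then be rewritten to URL + "#" + id.
def nearest_preceding_id(anchors: list, selection_start: int):
    offsets = [off for off, _ in anchors]
    i = bisect.bisect_right(offsets, selection_start) - 1
    return anchors[i][1] if i >= 0 else None

page = [(0, "abstract"), (120, "methods"), (480, "results")]  # hypothetical IDs
frag = nearest_preceding_id(page, 300)  # selection falls inside 'methods'
```

Unlike a text-fragment, the resulting URL is short, hand-writable, stable under small edits, and identical for any two users selecting from the same section.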
A hack like that would just have other EDT failure modes: instead of confabulating evidence from my dataset or personal examples, it might just confabulate references. “Yes, this was predicted by Foo et al 1990, and makes perfect sense.”