Aprillion
If I may take the liberty for a somewhat broader take, what I value from human literature review over chatbot assistant slop (*cough* “research by LRMs/agents with internet search”) is:
judgement (I mean having any spine whatsoever, even if “good judgement” were a better desideratum .. but that would be asking for sycophancy so I am not asking for any self-defining/defying qualities) and
faithful reasoning (frankly, I am not going to follow the steps to do my own research, but if I imagine I would be a better person and do my own research, it makes me feel better to see the steps / a recipe for how I could form my own conclusions following a sound methodology … what are the cruxes where “reasonable people can disagree” vs what are the tar pits where there’s “no fucking way anyone shall possibly believe anything other than X”) - I don’t care whether I agree or disagree with an opinion, but I want to see firm hinges in arguments to help me understand multiple perspectives, anything is better than extreme vagueness that doesn’t say anything of substance while using too many words to not say it (called “AI slop” these days, but that style of prose was invented long before AI and I am allergic to it … I don’t believe you are at any risk of producing that, so please keep that quality whatever else you might change)
..for example, if there are multiple reasons why a study is bad, it would be enough (for me) to explain in detail only the worst reason without a long list of all bad things (if sample size was small, but there was also a bigger problem for which increasing sample size would not help anyway, it’s fine to summarize all minor flaws in 1 sarcastic sentence and go into explanation just for the worst mistake they made, why their methodology could not possibly prove anything about the topic one way or the other … or moving all extra info into appendix A-J or a <details> element might be helpful to keep stuff short(er)((ish)))
and literal paper still exists too .. for people who need a break from their laptops (eeh, who am I kidding, phones) 📝
I heard rumors about actual letter sending even, but no one in my social circles has seen it for real.. yet.
I somehow completely agree with both of your perspectives, have you tried to ban the word “continuous” in your discussions yet? (on the other hand, I don’t think it should be a crux, probably just ambiguous meaning like “sound” in the “when a tree falls” thingy … but I would be curious if you would be able to agree on the 2 non-controversial meanings between the 2 of you)
It reminds me of stories about the gradualism / saltationism debate in evolutionary biology after gradualism won and before the idea of punctuated equilibrium… Parents and children are pretty discrete units, but gene pools over millions of years are pretty continuous from the perspective of an observer a long, long time later who is good at spotting low-frequency patterns ¯\_(ツ)_/¯
For a researcher, even GPT 3.5 to 4 might have been a big jump in terms of compute budget approval process (and/or losing a job from disbanding a department). And the same event on a benchmark might look smooth—throughout multiple big architecture changes a la the charts that illustrate Moore’s law—the sweat and blood of thousands of engineers seems kinda continuous if you squint enough.
And what even is “continuous”—general relativity is a continuous theory, but my phone calculates my GPS coordinates with numerical methods, time dilation from gravity field/the geoid shape is just approximated and nanosecond(-ish) precision is good enough to pin me down as much as I want (TBH probably more precision than I would choose myself as a compromise with my battery life). Real numbers are continuous, but they are not computable (I mean in practice in our own universe, I don’t care about philosophical possibilities), so we approximate them with a finite set of kinda shitty rational-ish numbers for which even
0.1 + 0.2 == 0.3
is false (in many languages, including JS in a browser console and in Python). Some stuff will work “the same” in the new paradigm, some will be “different”—does it matter whether we call it (dis)continuous, or do we know already what to predict in more detail?
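The comparison above is easy to check for yourself; a minimal Python sketch (the same holds in a JS console):

```python
import math

# IEEE 754 doubles cannot represent 0.1, 0.2, or 0.3 exactly in binary,
# so the rounded sum lands just off the rounded target.
a = 0.1 + 0.2
print(a)         # 0.30000000000000004
print(a == 0.3)  # False

# The usual workaround: compare within a tolerance, not for exact equality.
print(math.isclose(a, 0.3))  # True
```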
hm, as a non-expert onlooker, I found the paraphrase pretty accurate.. for sure it sounds more reasonable in your own words here compared to the oversimplified summary (so thank you for clarification!), but as far as accuracy of summaries go, this one was top tier IMHO (..have you seen the stuff that LLMs produce?!)
Thanks for the elaboration. My position is that Bob’s first statement is literally false but that it implies a truth and the implication is what has been important for Alice throughout the interaction; that Bob’s second statement does NOT claim that the first one should have been originally understood as something else that is true—only in the explicit context of additional evidence; that Bob does NOT claim that his earlier statement retroactively changed.
I would still say that Bob did communicate that he was “too enthusiastic originally” and that he would have wanted to make a “more carefully worded argument” if the stakes were higher—the one that was implied by the original statement and that would have been literally true if worded more carefully. But since the implied argument that he meant to say is clear to all participants at the time of the second statement, Bob knows that Alice knows what Bob meant to say, thus I say it’s OK for Bob to say that the thing in his mind had the correct shape all along—in all important aspects given the stakes of the discussion—he really did mean the thing that they together discovered as the truth, even if the exact thing he said turned out literally false, but it’s false in a way that doesn’t matter now that they both know better.
I believe both of us would agree that if Bob said “All I really meant to say was that I had blue pens at my house”, that would have been a better way to comment on his own mistake, without skipping the part that disambiguates between the 2 interpretations: whether he believes in equivalence between 2 literal statements that are incompatible; or whether he would have wished to think about his words more carefully and not try to bet on a statement that is too narrow (a la the “introverted librarian” vs “librarian” fallacy which I forgot the name of—edit: conjunction fallacy)...
But I believe our disagreement is that I took 1 extra charitable step and I don’t believe Bob did any status-defending at the cost of clarity. I think Bob was clear in his communication as long as I don’t take an adversarial stance towards the imperfectly-worded “All I really meant was that I had blue pens at my house”. That the imperfect wording is excusable given the low stakes.
Though I would not endorse similar imperfect wording in a scientific paper errata.
huh? I see the bizarreness in the opposite way—why would you score Bob in such an anti-useful way?
to me, it seems blindingly obvious what Bob was trying to communicate and that Alice would be completely on Bob’s side without any untrustworthiness points about any words or anything else between them, not even a hint… they looked into the drawer because that’s where Bob remembered leaving them, then they both concluded that the exact statement is wrong and that they should focus on the gist of the problem they are trying to solve—if they both see the blue pens on the table, they both agree about the existence of blue BIC pens and they both agree that either Bob’s memory is faulty or that someone else moved the pens, but that it doesn’t matter, that the argument Bob should have made was a weaker version of what he actually said when he was too enthusiastic, that a more carefully worded argument would have been warranted but that what he meant was that he can prove the existence of blue pens by showing them to Alice
I believe Bob has a grip, and that Alice does not even need to be that extra charitable, just reasonable / not adversarial. There is no need to paint Bob in any wrong light here, the structure of this story is completely within the bounds of intellectual standards proposed in the OP, I don’t see what could possibly be said against Bob here in any honest manner.
If Bob’s prediction was written down “for the record” somewhere, that place should be updated by the new definition that both Bob and Alice completely agree about, for the benefit of onlookers—yes, I can agree with that! But in this story, Bob should NOT be accused of anything—not even that he was sloppy—given the everyday-life stakes of this story, the level of precision was entirely appropriate IMHO.
a) Thanks for this post, I would never have noticed that the design was intended to be quite nice … and I would completely miss the “Earth with ominous red glow” on https://ifanyonebuildsit.com without reading this 😱
b) I bet you didn’t admire the beautiful design when trying to enjoy your morning coffee with the sun behind your back on a 10yo 4K monitor after a Win11 laptop refused to recognize that it supports 10bpc so it only uses 8bpc, in Firefox (that still doesn’t support gradient dithering), after some jerk forced dark mode on your favorite morning-news-before-work site that usually respects your OS settings (light mode during the day, dark before sleep) and they also removed the usual menu option to toggle it back, so you had to lean around the reflections to spot the subtle X button...
I can only give you my word that the terrible purple hues from my reading-warm color settings looked nothing like a sunset (and nothing like the intended indigo, if I can judge the color from how it looks today on an iPad and/or Android, though they both only have sRGB displays and don’t show real indigo hues like a rainbow or a summer-night sky), and that I didn’t even notice the subtle scroll animations because I didn’t scroll anything...
...but feel free to judge the gradient banding for yourself:

Hopefully this comment is a useful data point when deciding to “do more cool things with the frontpage for other things in the future” ;)
Example of a referent for “works on my computer”—shared understanding of a joke about software, when the same code is run in different environments and a bug in the system that contains the code is reproducible only in deployment environments and not on a developer’s local machine.
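A toy sketch of that referent (all names here are made up for illustration): code that silently depends on the developer’s environment works locally and fails only in deployment:

```python
import os

def greet() -> str:
    # Hypothetical bug: the developer's machine happens to have DEBUG_NAME
    # set, so this "works on my computer" -- but it raises KeyError in any
    # deployment environment where that variable is missing.
    name = os.environ["DEBUG_NAME"]
    return f"hello {name}"
```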
The new proposed terminology doesn’t seem like a 1:1 mapping from Joe’s dichotomy to me—when reading Joe’s writing, it felt like a linear probe into multi-dimensional concept space, while the distinction here sounds like mode collapse 🤔
I cannot use a thermometer to measure the “temperature” of non-equilibrium plasma during super-Alfvénic slipping reconnection, even though the ions have some “average speed” inside solar flares—even when everyone would agree about what “really happens” near the Sun for each individual particle, some parts of the bickering about the definitions of what is “thermal” vs “magnetic” could be considered “real thinking” and other parts “fake thinking” based on the usefulness for making predictions with approximate models; there is no hope of running the physics models at the level of the Schrödinger equation any time soon, and general relativity is a continuous theory, so not even computable without approximations.
Fearing the sound of burglars has nothing to do with pressure waves and everything to do with losing money or life—both of which are social-reality constructs, not “physical.”
Whether or not I shall be forgiven for bluntness, the concept of “physical world” sounds to me like an example of “fake thinking”—as if we wanted to throw away a century of post-modernism instead of learning from it, as if we wanted to regress into less nuance instead of more nuance...
What I find useful about this perspective is that it does point to something about stuff “in the environment” that is opposed to “useless internal thinking loops” when I imagine it applied to thinking about embedded agents—I just don’t see the terminology of “physical world” or “objective reality” as new stepping stones towards better understanding—IMHO those stepping stones gave us all the low-hanging fruit already in game theory—who even cares about “physical,” there is nothing “physical” about the mind, only about the brain, and there is no theory of how the mind arises from the brain yet, so 🤷 in that sense, all thinking is “fictional,” but some of it is more “useful,” and the words “fake vs real” seem a better approximation for that intuition compared to “fictional vs real.”
All confused human ontologies are equal, but some confused human ontologies are more equal than others.
Interesting point of view about the occasional intense overlap between the 2 concepts, I could probably safely explore more of my agency over the world than what is my comfort zone 🤔
Nevertheless, I will keep to my expectation of a train delay originating from a departures table and not from my own intent.
Question from a conference discussion—do you have real-world-data examples in a shape like this illustration of how I understand Goodhart’s law?
E.g. when new LLM versions got better at a benchmark published before their training but not on a benchmark published later… (and if larger models generalize better to future benchmarks, does the typology here provide an insight why? does the bitter lesson of scale provide better structure, function, or randomness calibration? or if the causes are so unknown that this typology does not provide such an insight yet, what is missing? (because Goodhart law predicts that benchmark gaming should be happening to “some extent” ⇒ improvements in understanding of Goodhart law should preserve that quality, right?))
When the king is aligned with the kingdom, how would you distinguish the causal path that the king projected their power and their values onto the kingdom (that previously had different values or was a tabula rasa) and not that the kingdom had selected from a pool of potential kings?
After all, regicide was not that uncommon (both literally in the past and figuratively speaking when a mother company can dismiss a decision of a board of directors over who should be the CEO)...
(I’m not saying anything about Wizard power being more or less effective)
Interesting—the first part of the response seems to suggest that it looked like I was trying to understand more about LLMs… Sorry for the confusion, I wanted to clarify an aspect of your workflow that was puzzling to me. I think I got all the info for what I was asking about, thanks!
FWIW, if the question was an expression of actual interest and not a snarky suggestion, my experience with chatbots has been positive for brainstorming, dictionary “search”, rubber ducking, description of common sense (or even niche) topics, but disappointing for anything that requires application of common sense. For programming, one- or few-liner autocomplete is fine for me—then it’s me doing the judgement, half of the suggestions are completely useless, half are fine, and the third half look fine at first before I realise I needed the second most obvious thing this time.. but it can save time for the repeating part of almost-repeating stuff. For multi-file editing, I find it worse than useless when it feels like doing code review after a psychopath pretending to do programming (AFAICT all models can explain ~~everything~~ most stuff correctly and then write the wrong code anyway .. I don’t find it useful when it tries to apologize later if I point it out, or to pre-doubt itself in CoT in 7 paragraphs and then do it wrong anyway) - I like to imagine as if it was trained on all code from GH PRs—both before and after the bug fix… or as if it was bored, so it’s trying to insert drama into a novel about my stupid programming task, where the second chapter will be about heroic AGI firefighting the shit written by previous dumb LLMs...
it’s from https://gradual-disempowerment.ai/mitigating-the-risk … I’ve used
"just"
(including scare quotes) for the concept of something being very hard, yet simpler than the thing it’s compared to, and now that concept has more color/flavour/it sparked a glimmer of joy for me (despite/especially because it was used to illuminate such a dark and depressing scene—gradual disempowerment is like putting a dagger to one’s liver where the mere(!) misaligned ASI was a stab between the ribs, lose thy hope, mere mortals, you were grabbing for water)
“merely(!)” is my new favourite word
I can see that if Moloch is a force of nature, any wannabe singleton would collapse under internal struggles… but it’s not like that would show me any lever AI safety can pull; it would be dumb luck if we live in a universe where the ratio of instrumentally convergent power concentration to its inevitable schism is less than 1 ¯\_(ツ)_/¯
Have you tried to make a mistake in your understanding on purpose to test out whether it would correct you or agree with you even when you’d get it wrong?
(and if yes, was it “a few times” or “statistically significant” kinda test, please?)
it seems contradictory to believe both in vacuum decay and that space combat would be defense-dominant .. what’s your definition of the latter that excludes “poisoning the wells”, please? (seems like you have some sub-category of conflict in mind, not the totalizing variant)