Raemon comments on JargonBot Beta Test

Raemon 2 Nov 2024 2:40 UTC
3 points
0
it becomes just another purveyor of AI “extruded writing product”.
If it happened here the way it happened on the rest of the internet (in terms of what the written content was like), I’d agree it’d be straightforwardly bad.
For things like jargon-hoverovers, the questions IMO are:
- is the explanation accurate?
- is the explanation helpful for explaining complex posts, esp. with many technical terms?
- does the explanation feel like soulless slop that makes you feel ughy the way a lot of the internet is making you feel ughy these days?
If the answer to the first two is “yep”, and the third one is “alas, also yep”, then I think an ideal state is for the terms to be hidden-by-default but easily accessible for people who are trying to learn effectively, and are willing to put up with somewhat AI-slop-sounding but clear/accurate explanations.
If the answer to the first two is “yep”, and the third one is “no, actually it just reads pretty well (maybe even in the author’s own style, if they want that)”, then IMO there’s not really a problem.
I am interested in your actual honest opinion of, say, the glossary I just generated for Unifying Bargaining Notions (1/2) (you’ll have to press option-shift-G to enable the glossary on lesswrong.com). That seems like a post where you will probably know most of the terms well enough to judge them on accuracy, while it is still technical enough that you can imagine being a person unfamiliar with game theory trying to understand the post, and get a sense of both how useful the explanations would be and how they feel aesthetically.
My personal take is that they aren’t quite as clear as I’d like and not quite as alive-feeling as I’d like, but they are over the threshold on both, such that I’d much rather have them than not, esp. if I knew less game theory than I currently do.
- Raemon 2 Nov 2024 2:43 UTC
  4 points
  2
  Parent
  Part of the uncertainty we’re aiming to reduce here is “can we make thinking tools or writing tools that are actually good, instead of bad?” and our experiments so far suggest “maybe”. We’re also designing with “six months from now” in mind – the current level of capabilities and quality won’t be static.
  Our theory of “secret sauce” is “most of the corporate Tech World in fact has bad taste in writing, and the LLM fine-tunings and RLHF data are generated by people with bad taste. Getting good output requires both good taste and prompting skill, and you’re mostly just not seeing people try.”
  We’ve experimented with jailbroken Base Claude, which does a decent job of actually having different styles. It’s harder to get to work reliably, but not so much harder that it feels intractable.
  The JargonHovers currently use regular Claude, not jailbroken Claude. I have guesses about how to eventually get them to write in something like the author’s original style, although it’s a harder problem so we haven’t tried that hard yet.