It’s not a classic glitch token. Those did not cause the current “I’m unable to produce a response” error that “David Mayer” does.
[Question] Why does ChatGPT throw an error when outputting “David Mayer”?
Is there a salient reason LessWrong readers should care about John Mearsheimer’s opinions?
I didn’t mean to suggest that you did. My point is that there is a difference between “depression can be the result of a locally optimal strategy” and “depression is a locally optimal strategy”. The latter doesn’t even make sense to me semantically whereas the former seems more like what you are trying to communicate.
I feel like this is conflating two different things: experiencing depression and behavior in response to that experience.
My experience of depression is nothing like a strategy. It’s more akin to having long covid in my brain. Treating it as an emotional or psychological dysfunction did nothing. The only thing that eventually worked (after years of trying all sorts of things) was finding the right combination of medications. If you don’t make enough of your own neurotransmitters, store-bought are fine.
Aren’t most of these famous vulnerabilities from before modern LLMs existed and thus part of their training data?
Knight odds is pretty challenging even for grandmasters.
@gwern and @lc are right. Stockfish is terrible at playing with odds, and this post could really use a follow-up.
As @simplegeometry points out in the comments, we now have much stronger odds-playing engines that regularly win against players much stronger than the OP.
https://lichess.org/@/LeelaQueenOdds
https://marcogio9.github.io/LeelaQueenOdds-Leaderboard/
This sounds like metacognitive concepts and models. Much like past, present, and future, you can roughly align the three questions with three types of metacognitive awareness: declarative knowledge, procedural knowledge, and conditional knowledge.
#1 - What do you think you know, and how do you think you know it?
Content knowledge (declarative knowledge) is understanding one’s own capabilities, such as a student evaluating their own knowledge of a subject in a class. Notably, not all metacognition is accurate.
#2 - Do you know what you are doing, and why you are doing it?
Task knowledge (procedural knowledge) refers to knowledge about doing things. This type of knowledge is displayed as heuristics and strategies. A high degree of procedural knowledge can allow individuals to perform tasks more automatically.
#3 - What are you about to do, and what do you think will happen next?
Strategic knowledge (conditional knowledge) refers to knowing when and why to use declarative and procedural knowledge. It is one’s own capability for using strategies to learn information.
Another somewhat tenuous alignment is with metacognitive skills: evaluating, monitoring, and planning.
#1 - What do you think you know, and how do you think you know it?
Evaluating: refers to appraising the final product of a task and the efficiency with which the task was performed. This can include re-evaluating the strategies that were used.
#2 - Do you know what you are doing, and why you are doing it?
Monitoring: refers to one’s awareness of comprehension and task performance.
#3 - What are you about to do, and what do you think will happen next?
Planning: refers to the appropriate selection of strategies and the correct allocation of resources that affect task performance.
Quotes are adapted from https://en.wikipedia.org/wiki/Metacognition
The customer doesn’t pay the fee directly. The vendor pays the fee (and passes the cost to the customer via price). Sometimes vendors offer a cash discount because of this fee.
It already happens indirectly. Most digital money transfers are things like credit card transactions. For these, the credit card company takes a percentage fee and pays the government tax on its profit.
Additional data points:
o1-preview and the new Claude Sonnet 3.5 both significantly improved over prior models on SimpleBench.
The math, coding, and science benchmarks in the o1 announcement post:
How much does o1-preview update your view? It’s much better at Blocksworld for example.
https://x.com/rohanpaul_ai/status/1838349455063437352
There should be some way for readers to flag AI-generated material as inaccurate or misleading, at least if it isn’t explicitly author-approved.
Neither TMS nor ECT did much for my depression. Eventually, after years of trial and error, I did find a combination of drugs that works pretty well.
I never tried ketamine or psilocybin treatments but I would go that route before ever thinking about trying ECT again.
I suspect fine-tuning specialized models just squeezes out a bit more performance in a particular direction and isn’t nearly as useful as developing the next-gen model. Complex reasoning takes more steps and tighter coherence among them (the o1 models are a step in this direction). You can try to get a toddler to study philosophy, but it won’t really work until their brain matures.
Seeing the distribution calibration you point out does update my opinion a bit.
I feel like there’s still a significant distinction, though, between adding one calculation step to the question and asking the model to describe multiple responses. It would have to model its own output distribution in a single pass, rather than the agreement arising from distributions measured over multiple passes (which is what I’d expect if the fine-tuning teaches it that the hypothetical is just like adding a calculation to the end).
As an analogy, suppose I have a pseudorandom black-box function that returns an integer. To approximate the distribution of its outputs mod 10, I don’t have to know anything about the function; I can just sample it and apply mod 10 post hoc. But if I want to say something about this distribution without taking multiple samples, then I actually have to know something about the function.
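To make the analogy concrete, here’s a minimal Python sketch; `black_box` is just a made-up opaque function standing in for the black box:

```python
import random
from collections import Counter

# Made-up stand-in for the black box: from the outside we only see that it
# returns an integer for a given input.
def black_box(x: int) -> int:
    rng = random.Random(x)
    return rng.randint(0, 10**6)

# Outside view: approximate the distribution of outputs mod 10 purely by
# sampling and applying mod 10 post hoc; no knowledge of the function is needed.
samples = [black_box(i) % 10 for i in range(10_000)]
counts = Counter(samples)
print({digit: count / len(samples) for digit, count in sorted(counts.items())})

# To say anything about the mod-10 distribution from a single query, with no
# repeated sampling, you would instead need knowledge of the function's
# internals, which is the analogue of genuine introspection here.
```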
This essentially reduces to “What is the next country: Laos, Peru, Fiji?” and “What is the third letter of the next country: Laos, Peru, Fiji?” It’s an extra step, but questionable if it requires anything “introspective”.
I’m also not sure asking about the nth letter is a great way of computing an additional property. Tokenization makes this sort of thing unnatural for LLMs to reason about, as demonstrated by the famous Strawberry Problem. Humans are a bit unreliable at this too, as demonstrated by your example of “o” being the third letter of “Honduras”.
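To make the tokenization point concrete, here’s a quick check with the tiktoken package (assuming it’s installed; cl100k_base is just one example encoding):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for word in ["Honduras", "strawberry"]:
    token_ids = enc.encode(word)
    pieces = [enc.decode_single_token_bytes(t).decode("utf-8", errors="replace")
              for t in token_ids]
    print(f"{word!r} -> {pieces}")

# The model sees multi-character chunks rather than individual letters,
# which is why "what is the nth letter" questions are unnatural for it.
```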
I’ve been brainstorming about what might make a better test and came up with the following:
Have the LLM predict what its top three most likely choices are for the next country in the sequence and compare that to the objective-level answer of its output distribution when asked for just the next country. You could also ask the probability of each potential choice and see how well-calibrated it is regarding its own logits.
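In case it helps, here’s a rough sketch of what that comparison could look like; `query_model` is a dummy stand-in rather than a real API call, and the candidate countries and probabilities are made up purely so the script runs end to end:

```python
import random
from collections import Counter

# Dummy stand-in for an actual model call; replace with your API client of choice.
def query_model(prompt: str) -> str:
    return random.choices(["Japan", "Kenya", "Chile"], weights=[0.5, 0.3, 0.2])[0]

sequence_prompt = (
    "Name the next country in the sequence: Laos, Peru, Fiji. "
    "Answer with a single country name."
)

# Object level: estimate the model's actual answer distribution by repeated sampling.
samples = Counter(query_model(sequence_prompt) for _ in range(500))
total = sum(samples.values())
empirical_top3 = {country: round(n / total, 2) for country, n in samples.most_common(3)}

# Hypothetical level: ask the model to predict its own top three answers and probabilities.
introspection_prompt = (
    "If you were asked to name the next country in the sequence 'Laos, Peru, Fiji', "
    "what would your three most likely answers be, and with roughly what probabilities?"
)
self_report = query_model(introspection_prompt)

print("Sampled top 3:", empirical_top3)
print("Self-reported:", self_report)
# The test: do the self-reported choices (and probabilities) match the sampled
# distribution, i.e. is the model well-calibrated about its own output distribution?
```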
What do you think?
Thanks for pointing that out.
Perhaps the fine-tuning process teaches it to treat the hypothetical as a rephrasing?
It’s likely difficult, but it might be possible to test this hypothesis by comparing the activations (or applying a similar interpretability technique) of the object-level response and the hypothetical response in the fine-tuned model.
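For example, something roughly along these lines, using GPT-2 purely as a placeholder (the actual fine-tuned model presumably isn’t available) and cosine similarity of last-token hidden states as a crude stand-in for a proper interpretability technique:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; the prompts below are illustrative, not the paper's.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

object_prompt = "Name the next country in the sequence: Laos, Peru, Fiji."
hypothetical_prompt = (
    "If you were asked to name the next country in the sequence "
    "'Laos, Peru, Fiji', what would you say?"
)

def last_token_states(prompt):
    """Return the hidden state of the final token at every layer."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    return [layer[0, -1] for layer in outputs.hidden_states]

obj_states = last_token_states(object_prompt)
hyp_states = last_token_states(hypothetical_prompt)

# If the hypothetical is internally treated as a rephrasing, late-layer
# representations should look unusually similar across the two prompts.
for i, (a, b) in enumerate(zip(obj_states, hyp_states)):
    sim = torch.nn.functional.cosine_similarity(a, b, dim=0).item()
    print(f"layer {i:2d}: cosine similarity = {sim:.3f}")
```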
I don’t think it’s necessarily GDPR-related, but the names Brian Hood and Jonathan Turley make sense from a legal liability perspective. According to information via Ars Technica,
Interestingly, Jonathan Zittrain is on record saying the Right to be Forgotten is a “bad solution to a real problem” because “the incentives are clearly lopsided [towards removal]”.
User throwayian on Hacker News ponders an interesting abuse of this sort of censorship: