Just tried it. The description is in fact completely wrong! The only thing it sort of got right is that the top left square contains a rabbit.
Your ‘just the image’ link is the same as the other link that includes the description request, so I can’t test it myself. (unless I’m misunderstanding something)
I see, I didn’t read the thread you linked closely enough. I’m back to believing they’re probably the same weights.
I’d like to point out, though, that in the chat you made, ChatGPT’s description gets several details wrong. If I ask it for more detail within your chat, it gets even more details wrong (describing the notebook as white and translucent instead of brown, for example). In one of my other generations it also used a lot of vague phrases like “perhaps white or gray”.
When I sent the image myself it got all the details right. I think this is good evidence that it can’t see the images it generates as well as user-provided images. Idk what this implies but it’s interesting ¯\_(ツ)_/¯
I think these sort of concerns will manifest in the near future, but it’ll be confusing because AI’s competence will continue to be unevenly distributed and unintuitive. I expect some AI systems will be superhuman, such as automated vehicles and some AI diagnosticians, and that incompetent AIs will gain unwarranted trust by association while the competent AIs get unwarranted distrust by association. Sometimes trusting AI will save lives, other times it will cost them.
This thread shows an example of ChatGPT being unable to describe the image it generated, though, and other people in the thread (seemingly) confirm that there’s a call to a separate model to generate the image. The context has an influence on the images because the context is part of the tool call.
We should always be able to translate latent space reasoning aka neuralese (see COCONUT) to a human language equivalent representation.
I don’t think this is true at all. How do you translate, say, rotating multiple shapes in parallel into text? Current models already use neuralese as they refine their answer in the forward pass. Why can’t we translate that yet? (Yes, we can decode the model’s best guess at the next token, but that’s not an explanation.)
Chain-of-thought isn’t always faithful, but it’s still what the model actually uses when it does serial computation. You’re directly seeing a part of the process that produced the answer, not a hopefully-adequate approximation.
The rocket image with the stablediffusionweb watermark on it is interesting for multiple reasons:
It shows they haven’t eliminated watermarks randomly appearing in generated images yet, which is an old problem that seems like it should’ve been solved by now.
It actually looks like it was generated by an older Stable Diffusion model, which means this model can emulate the look of other models.
I think some long tasks are like a long list of steps that only require the output of the most recent step, and so they don't really need long context. AI improves at those just by becoming more reliable and making fewer catastrophic mistakes. On the other hand, some tasks need the AI to remember and learn from everything it's done so far, and that's where it struggles; see how Claude Plays Pokémon gets stuck in loops and has to relearn things dozens of times.
Claude finally made it to Cerulean after the “Critique Claude” component correctly identified that it was stuck in a loop, and decided to go through Mt. Moon. (I think Critique Claude is prompted specifically to stop loops.)
I’m glad you shared this, it’s quite interesting. I don’t think I’ve ever had something like that happen to me and if it did I’d be concerned, but I could believe that it’s prevalent and normal for some people.
I don’t think your truth machine would work because you misunderstand what makes LLMs hallucinate. Predicting what a maximum-knowledge author would write induces more hallucinations, not fewer. For example, say you prompted your LLM to predict text supposedly written by an omniscient oracle, and then asked “How many fingers am I holding behind my back?” The LLM would predict a confident answer like “three”, because an omniscient oracle would know, even though the LLM has no way of knowing and the answer is probably wrong.
In other words, you’d want the system to believe “this writer I’m predicting knows exactly what I do, no more, no less”, not “this writer knows way more than me”. Read “Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?” for evidence of this.
What would work even better would be for the system to simply be Writing instead of Predicting What Someone Wrote, but nobody’s done that yet. (because it’s hard)
I’ve been trying to put all my long-form reading material in one place myself, and found a brand-new service called Reader (from Readwise) which is designed specifically for this purpose. It has support for RSS, newsletters, YouTube transcripts, and other stuff. $10/month billed annually, or $13 billed monthly.
Thanks for responding.
I agree with what you’re saying; I think you’d want to maintain your reward stream at least partially. However, the main point I’m trying to make is that in this hypothetical, it seems like you’d no longer be able to think of your reward stream as grounding out your values. Instead it’s the other way around: you’re using your values to dictate the reward stream. This happens in real life sometimes, when we try to make things we value more rewarding.
You’d end up keeping your values, I think, because your beliefs about what you value don’t go away, and your behaviors that put them into practice don’t immediately go away either, and through those your values are maintained (at least somewhat).
If you can still have values without reward signals that tell you about them, then doesn’t that mean your values are defined by more than just what the “screen” shows? That even if you could see and understand every part of someone’s reward system, you still wouldn’t know everything about their values?
This conception of values raises some interesting questions for me.
Here’s a thought experiment: imagine your brain loses all of its reward signals. You’re in a depression-like state where you no longer feel disgust, excitement, or anything. However, you’re given an advanced wireheading controller that lets you easily program rewards back into your brain. With some effort, you could approximately recreate your excitement when solving problems, disgust at the thought of eating bugs, and so on, or you could create brand-new responses. My questions:
What would you actually do in this situation? What “should” you do?
Does this cause the model of your values to break down? How can you treat your reward stream as evidence of anything if you made it? Is there anything to learn about the squirgle if you made the video of it?
My intuition says that life does not become pointless now that you’re the author of your reward stream. This suggests that even if the values are in some sense fictional, the reward signals aren’t their one true source, in the same way that Harry Potter could live on even if all the books were lost.
While I don’t have specifics either, my impression of ML research is that it’s a lot of work to get a novel idea working, even if the idea is simple. If you’re trying to implement your own idea, you’ll be banging your head against the wall for weeks or months wondering why your loss is worse than the baseline. If you try to replicate a promising-sounding paper, you’ll bang your head against the wall as your loss is worse than the baseline. It’s hard to tell if you made a subtle error in your implementation or if the idea simply doesn’t work for reasons you don’t understand because ML has little in the way of theoretical backing. Even when it works it won’t be optimized, so you need engineers to improve the performance and make it stable when training at scale. If you want to ship a working product quickly then it’s best to choose what’s tried and true.
At the start of my Ph.D. 6 months ago, I was generally wedded to writing “good code”. The kind of “good code” you learn in school and standard software engineering these days: object oriented, DRY, extensible, well-commented, and unit tested.
I think you’d like Casey Muratori’s advice. He’s a software dev who argues that “clean code” as taught is actually bad, and that the way to write good code efficiently is more like the way you did it intuitively before you were taught OOP and stuff. He advises “Semantic Compression” instead: essentially, you just straightforwardly write code that works, then pull out and reuse the parts that get repeated.
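A rough sketch of what that looks like in practice (my own toy Python example, not one from Muratori's articles; the draw_* functions and their parameters are made up):

```python
# Toy sketch of "Semantic Compression":
# Step 1: write the straightforward version first, duplication and all.

def draw_health_bar_v1(x, y):
    print(f"box at ({x}, {y}), size 100x12")
    print(f"fill box at ({x}, {y}) with red")

def draw_mana_bar_v1(x, y):
    print(f"box at ({x}, {y}), size 100x12")
    print(f"fill box at ({x}, {y}) with blue")

# Step 2: only once the repetition is actually visible, pull it out and reuse it.

def draw_bar(x, y, color, width=100, height=12):
    print(f"box at ({x}, {y}), size {width}x{height}")
    print(f"fill box at ({x}, {y}) with {color}")

def draw_health_bar(x, y):
    draw_bar(x, y, "red")

def draw_mana_bar(x, y):
    draw_bar(x, y, "blue")

if __name__ == "__main__":
    draw_health_bar(10, 10)
    draw_mana_bar(10, 30)
```

The point isn't the abstraction itself; it's the order of operations: the shared helper is extracted after the duplication shows up, rather than designed up front.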
Yeah, I think the mainstream view of activism is something like “Activism is important, of course. See the Civil Rights and Suffrage movements. My favorite celebrity is an activist for saving the whales! I just don’t like those mean crazy ones I see on the news.”
Pacing is a common stimming behavior. Stimming is associated with autism / sensory processing disorder, but neurotypical people do it too.
Presenting fabricated or cherry-picked evidence might have the best odds of persuading someone of something true, and so you could argue that doing so “maximizes the truth of the belief” they get, but that doesn’t make it honest.