I deeply value evidence, reason, and letting people draw their own conclusions. I dislike telling anyone what to think or do.
I believe you, yes YOU, are capable of reading and understanding everything you want to read and understand.
This is a good point! Typically I start from a clean commit in a fresh chat to keep this problem from happening too easily, proceeding through the project in the smallest steps I can get Claude to make. That’s what makes the situation feel so strange; it feels just like this problem, but it happens instantly, in Claude’s first responses.
I happened to be discussing this in the Discord today. I have a little hobby project that was suddenly making fast progress with 3.7 for the first few days, which was very exciting. But a few days ago it felt like something changed again, and suddenly even the old models are stuck in this weird pattern: failing to address the bug, and instead hyper-fixating on adding a bunch of surrounding extra code to handle special cases, or sometimes even simply rewriting the old code and claiming it fixes the bug. The project is suddenly at a complete standstill. Even if I eventually yell at it strongly enough to stop adding MORE buggy code instead of fixing the bug, it introduces a new bug, and the whole back-and-forth argument with Claude over whether this bug even exists starts all over. I cannot say this is rigorously tested or anything- it’s just one project, and surely the project itself influences the model’s behavior and quirks as it grows, but I dunno man, something just feels weird and I can’t put my finger on exactly what.
I do think it’s helpful that managers now have a reliable way to summarize large numbers of comments, instead of making some poor intern with Excel attempt “sentiment analysis” to “read” thousands of comments just to avoid paying for a proper data scientist, and I wonder if that’s already had some effects in the world.
Ah, what a fun idea! I wonder if coloring or marking the ropes and/or edges somehow would make it easier to assemble ad hoc- I think Veritasium’s video about non-periodic tilings included some sort of little markers on the edges that helped him orient new tiles, but that was on Penrose tiles and I’m not sure this shape has the same option.
This is absolutely a selfish request, so bear that in mind, but could you include screenshots and/or quotes of all X.com posts, and link to what the post links to when applicable? I have it blocked.
I thought these were pretty… let’s say “exciting”… reads, but I’d be interested to hear more people’s opinions of this as a trustworthy source.
Thank you.
I wonder what effect an all-edges pan would have; how did it taste near the edges?
It seems like if there is any non-determinism at all, there’s always going to be an unavoidable potential for naughty thoughts, so whatever you call the “AI” must address them as part of its function anyway- either that or there is a deterministic solution?
You can read the fanfiction this is for at https://www.fanfiction.net/s/14412246/1/Miss-Macross-My-Life-as-The-Star (I’ll get around to cross-posting it someday).
All of their work is great, but my favorite is ‘Ra’, which I highly recommend for similar reasons: that feeling of interrogating your own thoughts, senses, and reality itself.
Also this fun little story (Valuable Humans in Transit) about an AI: https://qntm.org/transi
I didn’t know what to expect, and this was an interesting read. What was the context for when and where it was delivered? EDIT: nm just saw the Fiction tag. Still interested in context though; I do not know who James Windrow is, except for what I can speculate on from this story.
In Anthropic’s support page for “I want to opt out of my prompts and results being used for training” they say:
We will not use your Inputs or Outputs to train our models, unless: (1) your conversations are flagged for Trust & Safety review (in which case we may use or analyze them to improve our ability to detect and enforce our Usage Policy, including training models for use by our Trust and Safety team, consistent with Anthropic’s safety mission), or (2) you’ve explicitly reported the materials to us (for example via our feedback mechanisms), or (3) by otherwise explicitly opting in to training.
Notably, this doesn’t provide an opt-out method, and the same messaging is repeated across similar articles/questions. The closest thing to an opt-out seems to be “you have the right to request a copy of your data, and object to our usage of it”.
I see people upvoting this, and I think I can see some good insights in this post, but MAN are glowfics obnoxious to read, and this feels hard to follow in a very similar way. I’m sad it isn’t more readable.
Something that may help build a better model/intuition is this video from Technology Connections: https://www.youtube.com/watch?v=CGAhWgkKlHI
I mentally visualize the cold air as a liquid when I open the door, or maybe picture it looking like the fog from dry ice.
Since it’s cold, it falls downward, “pouring” out onto the floor, and probably does not take more than a few seconds, though I would love to see someone capture it on video with a thermal camera.
After that, I figure it doesn’t really matter how long the door is open, until you start talking about leaving it open for 10+ minutes, at which point you can start to worry about the food’s temperature rising and the fridge wasting energy trying to cool the open space.
On the timescale of just a few moments while you grab stuff, the damage is already done once you open it the first time, and leaving it open or opening/closing it again doesn’t really affect anything.
This is also why grocery stores and restaurant kitchens often have top-opening coolers, like a chest freezer, instead of vertical doors (though that’s also for convenience).
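For a rough sense of scale, here’s a back-of-envelope sketch in Python (the fridge volume, temperatures, and so on are all round numbers I’m assuming, not measurements):

```python
# Rough estimate: heat gained when the cold air "pours out" of an open fridge.
# All constants below are assumed round numbers, not measurements.

air_volume = 0.4          # m^3 of free air space in a typical fridge (assumed)
air_density = 1.2         # kg/m^3 near room temperature
specific_heat = 1005.0    # J/(kg*K), air at constant pressure
delta_t = 17.0            # K: assumed ~4 C inside vs. ~21 C room

air_mass = air_volume * air_density                  # ~0.5 kg of air swapped out
heat_to_remove = air_mass * specific_heat * delta_t  # joules to re-cool it

print(f"air exchanged: {air_mass:.2f} kg")
print(f"heat to remove: {heat_to_remove / 1000:.1f} kJ")  # ~8 kJ
```

At roughly 8 kJ, that’s well under a minute of compressor runtime, and tiny next to the thermal mass of the food itself, which is why the full air swap on first opening is basically the whole “damage”.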
I don’t think it would be TOO long; I happily read through very long posts on here.
However, that said, I was curious enough to read that blog post, and that’s about the length and level of detail I expect in a normal short-to-medium size LW post, but it also stopped short of where I wanted it to. I hope that helps calibrate a little? I don’t know how “typical” I am as an example LW reader though.
Oh, and because I know it annoys me when people get distracted away from the main question by this sort of stuff: the question is “Can you share the experimental results with just enough explanation to understand the methodology?”, because I think everything else will flow naturally from questions about the experiment and the results.
I’ve been doing similar things with my day-to-day work like making stuff in CSS/Bootstrap or Excel, and my hobbies like mucking about in Twine or VCV Rack, and have noticed:
- a similar vibe: there seems to be a “goldilocks prompt narrowness” that gives really good results
- that goldilocks band is different for different topics
- plausible-sounding errors sneak in at all levels except the broadest, where it tends more towards very hedged “fluffy” statements like “be careful!”
However, if you treat it almost like a student and inform it of the errors/consequences of whatever it suggested, it’s often surprisingly good at correcting the error. This is where differences in how much it “understands” domains like “CSS” vs. “Twine’s Harlowe 3.3.4 macro format” become easier to see- it seems much more likely to make up functions and features of Harlowe that resemble things from more popular languages.
For whatever reason, it’s really fun to engage it on things you have expertise in and correct it and/or rubber duck off of it. It gives you a weird child of expertise and outsider art.
My first thought, and the first thought my wife had, was that this idea feels really good at first, and the reasoning seems sound, but it also feels a bit like what you would do if you wanted to intentionally exacerbate echo-chamber effects. Whether it would actually have that effect, I don’t know.
I’ve actually noticed this in a hobby project, where I have some agents running around a little MOO-like text world and talking to each other, using DeepSeek-R1 just because it’s fun to watch them “think” like little characters. I see this sort of thing a lot (maybe 1-in-5-ish, though there’s a LOT of other scaffolding and stuff going on around it which could also be causing weird problems):