Brendan Long

Karma: 2,139

Brendan Long Apr 23, 2025, 11:21 PM
9 points
0
on: o3 Is a Lying Liar
By penalizing the reward hacks you can identify, you’re training the AI to find reward hacks you can’t detect, and to only do them when you won’t detect them.
I wonder if it would be helpful to penalize deception only if the CoT doesn’t admit to it. It might be harder to generate test data for this since it’s less obvious, but hopefully you’d train the model to be honest in CoT?
I’m thinking of this like the parenting strategy of not punishing children for something bad if they admit unprompted that they did it. Blameless postmortems are also sort-of similar.
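To make the rule concrete, here is a minimal sketch of the penalty I have in mind. The function name, the boolean inputs, and the default penalty size are all hypothetical, and detecting deception or an admission in the CoT is the hard part that this does not show.

```python
def deception_penalty(deceived: bool, cot_admits: bool, penalty: float = 1.0) -> float:
    """Return the penalty to subtract from the reward for one episode.

    Hypothetical rule: deception is only penalized when the chain of
    thought does not admit to it. Honest episodes are never penalized.
    """
    if deceived and not cot_admits:
        return penalty
    return 0.0
```

Under this rule, a deceptive episode with an admission in the CoT costs nothing, so the only way to pick up the penalty is to deceive and hide it.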

Brendan Long Apr 23, 2025, 8:31 PM
2 points
−1
in reply to: Alexander Howell’s comment on: Why Should I Assume CCP AGI is Worse Than USG AGI?
Consider instead that Trump was elected with over 50% of the popular vote. Perhaps there are more fundamental cultural factors at play than the method used to count ballots.
Winning the popular vote in the current system doesn’t tell you what would happen in a different system. This is the same mistake people make when they talk about who would have won if we didn’t have an electoral college: If we had a different system, candidates would campaign differently and voters would vote differently.

Brendan Long Apr 23, 2025, 6:32 PM
2 points
0
in reply to: Mati_Roy’s comment on: Thoughts on the Double Impact Project
I doubt this organization could get 501(c) status since its only purpose is to make political donations (and it only matters if the organization you donate to is 501(c); it doesn’t matter if they then re-grant it to another charitable organization). I’m not an expert on this though.

Brendan Long Apr 18, 2025, 9:31 PM
3 points
0
in reply to: Gurkenglas’s comment on: What Makes an AI Startup “Net Positive” for Safety?
The value of the startup is only loosely correlated with being positive for AI safety (capabilities are valuable, but they’re not the only valuable thing). Ideally the startup would be worth billions if and only if AI safety was solved.

Brendan Long’s Shortform

Brendan Long Apr 18, 2025, 2:23 AM

6 points

1 comment LW link

Brendan Long Apr 18, 2025, 2:23 AM
2 points
0
on: Brendan Long’s Shortform
I’d like to learn more Spanish words but have trouble sitting down to actually do language lessons, so I recently set my Claude “personal preferences” to:
Try to teach a random Spanish word in every conversation.
(This is the whole thing)
This has worked surprisingly well, and Claude usually either drops one word in Spanish with a translation midway through a response:
For your specific situation, I recommend a calibración (calibration) approach:
2. Accounting for concurrency: Ensure you’re capturing all hilos (threads) involved in query execution, especially for parallel queries.
(From a conversation about benchmarking)
Or it ends the conversation with a fun fact:
¡Palabra en español! “Herramienta”—which means “tool” in Spanish, quite relevant to your search for tools to automate SSH known_hosts management.
La palabra española para hoy es “configurar”—which means “to configure” in English, fitting perfectly with our discussion about configurable thinking limits!
I don’t know if this is actually useful for learning, but it’s fun and worked better than I expected.
My wife tried a similar prompt (although her preferences are much longer) and it made Claude sometimes respond entirely in Spanish, so this could probably be made more specific. If you run into that, maybe “Respond in English but try to teach a random Spanish word in every conversation” would work better?

Brendan Long Apr 17, 2025, 6:51 PM
2 points
0
on: How worker co-ops can help restore social trust
It used to be that we had a two-tiered citizenry: one class owned and controlled the nation’s government (the nobility) and one class merely worked for said nation (the laborers). Then we decided that the laborers should also partially own and control the government. However, this practice was not extended to the workplace, which remains in that classic hierarchy to this day; with one class owning and controlling the firm, while the other class merely works for it.
This is not true? There are no legal restrictions on what class of people can own and control firms. Many worker-owned co-ops exist^[1], and even among public corporations, around 40% of stock is held by workers in retirement accounts^[2]. In some industries, it’s very common to receive stock as compensation too. A lot of small businesses are tautologically worker-owned since they only have one employee (the owner).
Just because we don’t legally mandate that every business is a co-op doesn’t mean they aren’t legal and don’t exist.
1. ^
  I suspect very large worker-owned co-ops are uncommon since the value of a slice of ownership goes down as the size increases, but there are no legal restrictions on the size of a co-op.
2. ^
  This is an underestimate of stock owned by workers since it doesn’t include taxable savings, but it would be hard to separate wage labor from the labor of creating and running companies in taxable accounts. Retirement accounts should be representative of ‘normal workers’ since there are low per-person caps and it’s hard to fund them with anything except wages.

Brendan Long Apr 17, 2025, 6:00 PM
2 points
0
on: Host Keys and SSHing to EC2
This post prompted me to look into more general-purpose solutions to this, since it seems like “SSH into an IP that’s known to be owned by a public cloud” should be fully automated at this point. We know which IPs are part of AWS and we can fetch the host keys securely using the AWS CLI (or helper tools like this). We should be able to do the same over HTTPS for GitHub, Azure, Google Cloud, etc.
It’s surprising to me that no one seems to have made a general-purpose CLI or SSH plugin (if that’s a thing) for this. Google Cloud has a custom CLI that does this but it obviously only works for their servers.
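As a rough sketch of the first step (deciding whether an IP belongs to AWS at all), something like this should work, assuming the JSON format AWS publishes at https://ip-ranges.amazonaws.com/ip-ranges.json, where each entry under `prefixes` has `ip_prefix`, `region`, and `service` fields. The function name and the sample data are hypothetical (the sample uses a documentation-only IP range), and actually fetching the host key is not shown.

```python
import ipaddress


def find_aws_prefix(ip, ip_ranges):
    """Return the first prefix entry in `ip_ranges` containing `ip`, or None.

    `ip_ranges` is a dict in the shape of AWS's ip-ranges.json.
    Only the IPv4 "prefixes" list is checked in this sketch.
    """
    addr = ipaddress.ip_address(ip)
    for entry in ip_ranges.get("prefixes", []):
        if addr in ipaddress.ip_network(entry["ip_prefix"]):
            return entry
    return None


# Hypothetical sample data, not real AWS ranges (203.0.113.0/24 is reserved for documentation).
SAMPLE_RANGES = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "region": "us-east-1", "service": "EC2"},
    ]
}
```

A real tool would download and cache the published file, then use the matched `region` to decide where to look up the instance’s host key before writing it to `known_hosts`.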

Brendan Long Apr 17, 2025, 5:53 PM
4 points
0
in reply to: Thane Ruthenis’s comment on: AI #112: Release the Everything
I think normal people sort files into folders (and understand filesystems) less than you’d expect. On second thought though, I think you’re proposing something less confusing than I initially thought. I think a general-purpose memory-category-tagging system would be way too confusing for users, but “you can create conversation categories and memory will only apply to other conversations in that category” is probably reasonable.

Brendan Long Apr 17, 2025, 5:32 PM
2 points
0
in reply to: Thane Ruthenis’s comment on: AI #112: Release the Everything
This sounds like the kind of thing power users would like but normal people would find confusing, like how Google+ was really cool for the nerds who were into it, but most people prefer to just have one list of friends on social networks.

Brendan Long Apr 14, 2025, 6:19 PM
4 points
0
on: Thoughts on the Double Impact Project
One downside of this is that charitable donations are typically tax deductible, but political donations aren’t. Whether this matters depends on tax brackets and whether the donor is going to itemize, but I imagine it would make it harder to convince people.

Brendan Long Apr 9, 2025, 4:19 AM
2 points
0
on: LessOnline 2025: Early Bird Tickets On Sale
I saw that there is a “friend” option for tickets, and kids are also allowed. How likely is a friend or spouse who doesn’t read these blogs to enjoy coming along? Did a lot of people bring friends/spouses last time?

Brendan Long Apr 6, 2025, 2:49 AM
2 points
0
on: Sleep peacefully: no hidden reasoning detected in LLMs. Well, at least in small ones.
I’m not sure if this is helpful (you might already know), but in Let’s Think Dot By Dot, they found that LLMs could use filler tokens to improve computation, but they had to be specially trained for it to work. By default the extra tokens didn’t help.

Brendan Long Apr 6, 2025, 2:12 AM
11 points
0
on: Most Questionable Details in ‘AI 2027’

We haven’t really established why OpenBrain’s market dominance is inevitable.

I think they gave OpenBrain a generic name to indicate that they don’t know which company this would be, so I think it’s tautologically defined that OpenBrain is dominant because the dominant company is the one we’re looking at.

Brendan Long Apr 6, 2025, 1:21 AM
11 points
8
on: Prediction Markets Are Mediocre
This market seems valuable, but it depends on what you’re using it for. “Experts can’t predict what this politician is going to do” is useful information. Also it seems like a lot of voters were assuming the chance of Trump implementing major tariffs was tiny, so updating toward 50% would have helped them.

[Question] LessWrong merch?

Brendan Long Apr 3, 2025, 9:51 PM

23 points

2 comments 1 min read LW link

Brendan Long Mar 31, 2025, 8:40 PM
2 points
0
on: On Downvotes, Cultural Fit, and Why I Won’t Be Posting Again
I’ve had emails ignored, responses that amount to “this didn’t come from the right person,” and the occasional reply like this one, from a very prominent member of AI safety:
“Without reading the paper, and just going on your brief description…”
That’s the level of seriousness these ideas are treated with.
I only had time to look at your first post, and then only skimmed it because it’s really long. Asking people you don’t know to read something of this length is more than you can really expect. People are busy and you’re not the only one with demands on their time.
I would advise trying to put something at the beginning to help people understand what you’re about to cover and why they should care about it. For the capitalism post, I agree with most of what you said (although some of your bullet points are unsupported assertions), but I still don’t know what I’m supposed to take out of this, since ending capitalism isn’t tractable, and (as you mention in regards to governments) non-capitalism doesn’t help.

Brendan Long Mar 31, 2025, 6:00 AM
3 points
3
on: OpenAI lost $5 billion in 2024 (and its losses are increasing)
This seems to explain a lot about both why Altman is trying so hard to make OpenAI for-profit (to more easily raise money with that burn rate) and why he wants much bigger data centers (to keep going on “just make it bigger”).

Brendan Long Mar 18, 2025, 11:41 PM
5 points
0
on: (The) Lightcone is nothing without its people: LW + Lighthaven’s big fundraiser

Due to an apparently ravenous hunger among our donor base for having benches with plaques dedicated to them, and us not actually having that many benches, the threshold for this is increased to $2,000.

Given the clear mandate from the community, when do you plan to expand Lighthaven with a new Hall of Benches, and how many benches do you think you can fit in it?

Brendan Long Mar 10, 2025, 3:04 AM
9 points
1
in reply to: azergante’s comment on: Lots of brief thoughts on Software Engineering
I think it’s more that learning to prioritize effectiveness over aesthetics will make you a more effective software engineer. Sometimes terrible languages are the right tool for the job, and I find it gives me satisfaction to pick the right tool even if I wish we lived in a world where the right tool was also the objectively best language (OCaml, obviously).
