> CoT monitoring seems like a great control method when available, but I think it’s reasonably likely that it won’t work on the AIs that we’d want to control, because those models will have access to some kind of “neuralese” that allows them to reason in ways we can’t observe.
Small point, but I think that “neuralese” is likely to be somewhat interpretable, still.
1. We might advance at regular LLM interpretability, in which case those lessons might apply.
2. We might pressure LLM systems to only use CoT neuralese that we can inspect.
There’s also a question of how much future LLM agents will rely on CoT vs. more regular formats for storage. For example, I believe that a lot of agents now are saving information in English into knowledge bases of different kinds. It’s far easier for software people working with complex LLM workflows to make sure a lot of the intermediate formats are in languages they can understand.
All that said, personally, I’m excited for a multi-layered approach, especially at this point when it seems fairly early.
There are a few questions here.
1. Do Jaime’s writings state that he cares about x-risk or not?
→ I think he fairly clearly states that he cares.
2. Does all the evidence, when put together, imply that actually, Jaime doesn’t care about x-risk?
→ This is a much more speculative question. We have to assess how honest he is in his writing. I’d bet money that Jaime at least believes that he cares and is taking corresponding actions. This of course doesn’t fully absolve him of responsibility; there are many people who believe they do things for good reasons but are causally driven by selfish ones. But now we’re getting to a particularly speculative area.
“I also think it should be our dominant prior that someone is not motivated by reducing x-risk unless they directly claim they do.” → Again, I regard him as basically claiming that he does care. I’d bet money that if we asked him to clarify, he’d say that he cares. (Happy to bet on this, if that would help.)
At the same time, I doubt that this is your actual crux. I’d expect that even if he claimed (more precisely) to care, you’d still be skeptical of some aspect of this.
---
Personally, I have both positive and skeptical feelings about Epoch, as I do other evals orgs. I think they’re doing some good work, but I really wish they’d lean a lot more on [clearly useful for x-risk] work. If I had a lot of money to donate, I could picture donating some to Epoch, but only if I could get a lot of assurances on which projects it would go to.
But while I have reservations about the org, I think some of the specific attacks against them (and defenses of them) are not accurate.
I did a bit of digging, because these quotes seemed narrow to me. Here’s the original tweet from that thread.
> Full state dump of my AI risk related beliefs:
> - I currently think that we will see ~full automation of society by Median 2045, with already very significant benefits by 2030
> - I am not very concerned about violent AI takeover. I am concerned about concentration of power and gradual disempowerment. I put the probability that ai ends up being net bad for humans at 15%.
> - I support treating ai as a general purpose tech and distributed development. I oppose stuff like export controls and treating AI like military tech. My sense is that AI goes better in worlds where we gradually adopt it and it’s seen as a beneficial general purpose tech, rather than a key strategic tech only controlled by a small group of people
> - I think alignment is unlikely to happen in a robust way, though companies could have a lot of sway on AI culture in the short term.
> - on net I support faster development of AI, so we can benefit earlier from it. It’s a hard problem, and I respect people trying their hardest to make it go well.
Then right after:
All said, this specific chain doesn’t give us a huge amount of information. It totals something like 10-20 sentences.
> He says it so plainly that it seems as straightforwardly of a rejection of AI x-risk concerns that I’ve heard:
This seems like a major oversimplification to me. He says “I am concerned about concentration of power and gradual disempowerment. I put the probability that ai ends up being net bad for humans at 15%.” There is a cluster in the rationalist/EA community that believes that “gradual disempowerment” is an x-risk. Perhaps you wouldn’t define “concentration of power and gradual disempowerment” as technically an x-risk, but if so, that seems a bit like a technicality to me. It can clearly be a very major deal.
It sounds to me like Jaime is very concerned about some aspects of AI risk but not others.
In the quote you reference, he clearly says, “Not that it should be my place to unilaterally make such a decision anyway.” I hear him saying, “I disagree with the x-risk community about the issue of slowing down AI, specifically. However, I don’t think this disagreement is a big concern, given that I also feel like it’s not right for me to personally push for AI to be sped up, and thus I won’t do it.”
> however there’s a cynical part of me that sounds like some combo of @ozziegooen and Robin Hanson which notes we have methods now (like significantly increased surveillance and auditing) which we could use for greater trust and which we don’t employ.
Quick note: I think Robin Hanson is more on the side of “we’re not doing this because we don’t actually care”. I’m more on the side of, “The technology and infrastructure just isn’t good enough.”
What I mean by that is that I think it’s possible to get many of the benefits of surveillance with minimal costs, using a combination of Structured Transparency and better institutions. This would be a software+governance challenge.
Happy to see thinking on this.
I like the idea of getting a lot of small examples of clever uses of LLMs in the wild, especially by particularly clever/experimental people.
I recently made this post to try to gather some of the techniques common around this community.
One issue that I have, though, is that I’m really unsure what it looks like to promote neat ideas like these, outside of writing long papers or making semi-viral (or at least [loved by a narrow community]) projects.
The most obvious way is via X/Twitter. But this often requires building an X audience, which few people are good at. Occasionally particularly neat images/clips by new authors go viral, but it’s tough.
I’d also flag:
- It’s getting cheaper to make web applications.
- I think EA has seen more success with blog posts and web apps than with things like [presenting neat ideas in videos/tweets].
- Often, [simple custom applications] are pretty crucial for actually testing out an idea. You can generate wireframes, but this only tells you a very small amount.
I guess what I’m getting at is that I think [web applications] are likely a major part of the solution—but that we should favor experimenting with many small ones, rather than going all-in on 2-4 ideas or so.
> I’m curious whether you know of any examples in history where humanity purposefully and successfully steered towards a significantly less competitive [economically, militarily,...] technology that was nonetheless safer.
This sounds much like a lot of the history of environmentalism and safety regulations? As in, there’s a long history of [corporations selling X, using a net-harmful technology], then governments regulating. Often this happens after the technology is sold, but sometimes before it’s completely popular around the world.
I’d expect that there’s similarly a lot of history of early product areas where some people realize that [popular trajectory X] will likely be bad and get regulated away, so they help further [safer version Y].
Going back to the previous quote: “steer the paradigm away from AI agents + modern generative AI paradigm to something else which is safer”
I agree it’s tough, but would expect some startups to exist in this space. Arguably there are already several claiming to be focusing on “Safe” AI. I’m not sure if people here would consider this technically part of the “modern generative AI paradigm” or not, but I’d imagine these groups would be taking some different avenues, using clear technical innovations.
There are worlds where the dangerous forms have disadvantages later on—for example, they are harder to control/oversee, or they get regulated. In those worlds, I’d expect there should/could be some efforts waiting to take advantage of that situation.
Nuanced Models for the Influence of Information
I’m sure they thought about it.
I think this is dramatically tougher than a lot of people think. I wrote more about it here.
https://www.facebook.com/ozzie.gooen/posts/pfbid0377Ga4W8eK89aPXDkEndGtKTgfR34QXxxNCtwvdPsMifSZBY8abLmhfybtMUkLd8Tl
I have a Quest 3. The setup is a fair bit better than the Quest 2, but it still has a long way to go.
I use it in waves. Recently I haven’t used it much, maybe a few hours a month or so.
Looking forward to future headsets. Right now things are progressing fairly slowly, but I’m hopeful there might be some large market moment, followed by a lot more success. Though at this point it seems possible that could happen post-TAI, so maybe it’s a bit of a lost cause.
All that said, there is a growing niche community of people working/living in VR, so it seems like it’s a good fit for some people.
Obvious point—I think a lot of this comes from the financial incentives. The more “out of the box” you go, the less sure you can be that there will be funding for your work.
Some of those that do this will be rewarded, but I suspect many won’t be.
As such, I think that funders could do more to encourage this sort of thing, if they want to.
“The missing step in the process you describe is figuring out when the research did produce surprising insights, which might be a class of novel problems (unless a general formulaic approach works and someone scaffolds that in).”
→ I feel optimistic about the ability to use prompts to get us fairly far with this. More powerful/agentic systems will help a lot with actually executing those prompts at scale, but the core technical challenge seems like it could be fairly straightforward. I’ve been experimenting with LLMs to try to detect what information they could come up with that would later surprise them. I think this is fairly measurable.
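To make “fairly measurable” a bit more concrete, here’s a minimal sketch of the kind of experiment I have in mind, assuming the OpenAI Python client. The model name, prompts, and the 1-10 scoring scheme are illustrative placeholders rather than the exact setup I’ve been using.

```python
# Minimal sketch: have a model produce an insight, then have a fresh call
# rate how surprising that insight is. Model name, prompts, and the 1-10
# scale are illustrative placeholders, not a tested protocol.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder model name


def ask(prompt: str) -> str:
    """Send a single-turn prompt and return the text reply."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def insight_with_surprise(topic: str) -> tuple[str, str]:
    # Step 1: ask the model to dig up a non-obvious claim about the topic.
    insight = ask(f"Give one non-obvious, checkable insight about: {topic}")
    # Step 2: a separate call rates how surprising the claim is, as a rough
    # proxy for "information the model would not have predicted in advance".
    rating = ask(
        "On a scale of 1-10, how surprising is the following claim to a "
        f"well-read generalist? Reply with just the number.\n\n{insight}"
    )
    return insight, rating


if __name__ == "__main__":
    insight, rating = insight_with_surprise("forecasting tournaments")
    print(rating, insight)
```

The two separate calls are a deliberate (if crude) choice, so the rater isn’t anchored by having just produced the insight itself. Running this over many topics and keeping the high-surprise items is the part where more agentic systems would help.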
Thanks for the clarification!
I think some of it is that I find the term “original seeing” to be off-putting. I’m not sure if I got the point of the corresponding blog post.
In general, going forward, I’d recommend people try to be very precise about what they mean here. I’m suspicious that “original seeing” will mean different things to different people. I’d expect that trying to more precisely clarify what tasks or skills are involved would make it easier to pinpoint which parts of it are good/bad for LLMs.
> By “aren’t catching” do you mean “can’t” or do you mean “wikipedia company/editors haven’t deployed an LLM to crawl wikipedia, read sources and edit the article for errors”?
Yep.
My guess is that this would take some substantial prompt engineering, and potentially a fair bit of money.
I imagine they’ll get to it eventually (as it becomes easier + cheaper), but it might be a while.
Some quick points:
1. I think there is an interesting question here and am happy to see it be discussed.
2. “This would, obviously, be a system capable of writing things that we deem worth reading.” → To me, LLMs produce tons of content worth reading. I chat with LLMs all the time. Often I prefer LLM responses to LessWrong summaries, where the two compete. I also use LLMs to come up with ideas, edit text, get feedback, and for a lot of other parts of writing.
3. Regarding (2), my guess is that “LessWrong Blog Posts” might become “Things we can’t easily get from LLMs”—in which case it’s a very high bar for LLMs!
4. There’s a question on Manifold about “When will AIs produce movies as well as humans?” I think you really need to specify a specific kind of movie here. As AIs improve, humans will use AI tools to produce better and better movies, so “completely AI movies” will have a higher and higher bar to meet. So instead of asking, “When will AI blog posts be as good as human blog posts?” I’d ask, “When will AI blog posts be as good as human blog posts from [2020]?” or similar. The point is to keep the level of AI assistance constant in one of these options.
5. We recently held the $300 Fermi challenge, where the results were largely generated with AIs. I think some of the top ones could make good blog posts.
6. As @habryka wrote recently, many readers will just stop reading something if it seems written by an LLM. I think this trend will last, and make it harder for useful LLM-generated content to be appreciated.
I feel like I’ve heard this before, and can sympathize, but I’m skeptical.
I feel like this ascribes an almost magical quality to how many blog posts are produced. The phrase “original seeing” sounds much more profound than I’m comfortable with for such a discussion.
Let’s go through some examples:
- Lots of Zvi’s posts are summaries of content, done in a way that’s fairly formulaic.
- A lot of Scott Alexander’s posts read to me like, “Here’s an interesting area that blog readers like but haven’t investigated much. I read a few things about it, and have some takes that make a lot of sense upon some level of reflection.”
- A lot of my own posts seem like things that some search process could be built to create without too much difficulty.
Broadly, I think that “coming up with bold new ideas” gets too much attention, while more basic things like “doing lengthy research” or “explaining to people the next incremental set of information that they would be comfortable with, in a way that’s very well expressed” get too little.
I expect that future AI systems will get good at going from a long list of [hypotheses about what might make for interesting topics] to [a few great areas where a bit of research provides surprising insights], and similar. We don’t really have this yet, but it seems doable to me.
(I similarly didn’t agree with the related post)
That seems like a good example of a clear math error.
I’m kind of surprised that LLMs aren’t catching things like that yet. I’m curious how far along such efforts are—it seems like an obvious thing to target.
If you’ve ever written or interacted with Squiggle code before, we at QURI would really appreciate it if you could fill out our Squiggle Survey!
https://docs.google.com/forms/d/e/1FAIpQLSfSnuKoUUQm4j3HEoqPmTYiWby9To8XXN5pDLlr95AiKa2srg/viewform
We don’t have many ways to gauge or evaluate how people interact with our tools. Responses here will go a long way toward deciding our future plans.
Also, if we get enough responses, we’d like to make a public post about ways that people are (and aren’t) using Squiggle.
> scaffolding would have to be invented separately for each task
Obvious point that we might soon be able to have LLMs code up this necessary scaffolding. This isn’t clearly very far off, from what I can tell.
Instead of “Goodharting”, I like the potential names “Positive Alignment” and “Negative Alignment.”
“Positive Alignment” means that the motivated party changes their actions in ways the incentive creator likes. “Negative Alignment” means the opposite.
Whenever there are incentives offered to certain people/agents, there are likely to be cases of both Positive Alignment and Negative Alignment. The net effect will likely be either positive or negative.
“Goodharting” is fairly vague and typically refers to just the “Negative Alignment” portion.
I’d expect this to make some discussion clearer.
“Will this new incentive be goodharted?” → “Will this incentive lead to Net-Negative Alignment?”
Other Name Options
Claude 3.7 recommended other naming ideas like:
- Intentional vs Perverse Responses
- Convergent vs Divergent Optimization
- True-Goal vs Proxy-Goal Alignment
- Productive vs Counterproductive Compliance
Part of me wants to create some automated process for this. Then part of me thinks it would be pretty great if someone could offer a free service (even paid could be fine) that has one person do this hunting work. I presume some of it can be delegated, though I realize the work probably requires more context than it first seems.