β-redex

Karma: 326

β-redex Jul 15, 2025, 9:04 PM
3 points
2
on: Do confident short timelines make sense?
During the live debate Tsvi linked to, TJ (an attendee of the event) referred to the modern LLM paradigm providing a way to take the deductive closure of human knowledge: LLMs can memorize all of existing human knowledge, and can leverage chain-of-thought reasoning to combine that knowledge iteratively, making new conclusions. RLVF might hit limits, here, but more innovative techniques might push past those limits to achieve something like the “deductive closure of human knowledge”: all conclusions which can be inferred by some combination of existing knowledge.
This “deductive closure” concept feels way too powerful to me. This is even hinted at later in the conversation talking about mathematical proofs, but I don’t think it’s taken to its full conclusion: such a deductive closure would contain all provable mathematical statements, which I am skeptical even an ASI could achieve.^[1]
To spell this out more precisely: the deductive closure of just “the set theory axioms” would be “all of mathematics”, including (dis)proofs for all our currently unproven conjectures^[2] (e.g. P ≠ NP), and all possible mathematical statements.^[2]
1. ^
  Well, as long as we want to stick to some reasonable algorithmic complexity. Otherwise the “just try all possible proofs in sequence” algorithm is something we already have and works perfectly.
2. ^
  Well, as long as they are not undecidable.

β-redex Jun 30, 2025, 4:17 PM
2 points
0
in reply to: johnswentworth’s comment on: johnswentworth’s Shortform
Conditional on LLMs scaling to AGI, I feel like it’s a contradiction to say that “LLMs offer little or negative utility AND it’s going to stay this way”. My model is that we are either dying in a couple years to LLMs getting us to AGI, and we are going to have a year or two or of AIs that can provide incredible utility, or we are not dying to LLMs and the timelines are longer.

I think I read somewhere that you don’t believe LLMs will get us to AGI, so this might already be implicit in your model? I personally am putting at least some credence on the ai-2027 model, which predicts superhuman coders in the near future. (Not saying that I believe this is the most probable future, just that I find it convincing enough that I want to be prepared for it.)

Up until recently I was in the “LLMs offer zero utility” camp (for coding), but now at work we have a Cursor plan (still would not pay for it for personal use probably), and with a lot of trial and error I feel like I am finding the kinds of tasks where AIs can offer a bit of utility, and I am slowly moving towards the “marginal utility” camp.

One kind of thing I like using it for is small scripts to automate bits of my workflow. E.g. I have an idea for a script, I know it would take me 30m-1h to implement it, but it’s not worth it because e.g. it would only save me a few seconds each time. But if I can reduce the time investment to only a few minutes by giving the task to the LLM, it can suddenly be worth it.

I would be interested in other people’s experiences with the negative side effects of LLM use. What are the symptoms/warning signs of “LLM brain rot”? I feel like with my current use I am relatively well equipped to avoid that:
- I only ask things from LLMs that I know I could solve in a few hours tops.
- I code review the result, tell it if it did something stupid.
- 90% of my job is stuff that is currently not close to being LLM automatable anyway.

β-redex Jun 10, 2025, 10:29 PM
1 point
0
in reply to: habryka’s comment on: β-redex’s Shortform
Out of curiosity, why do you post on Twitter? Is it network effects, or does it just have such a unique culture that’s not available anywhere else? (Or something else?) Do you not feel any kind of aversion towards the platform, which would potentially discourage you from interacting? (I don’t mean this to sound accusatory, if your position is that “yes Twitter is awesome and more rationalists should join” I would also like to hear about that.)

β-redex Jun 10, 2025, 3:55 PM
1 point
0
in reply to: CstineSublime’s comment on: β-redex’s Shortform
As you highlight, asking everyone to set up their own cross-posting solution is probably not viable. But if there was some service run by the LW team that had a simple guide for setting it up (e.g. go to your LW account, get your Twitter API key, copy it here, grant permission, done.) and it took ~5 minutes, that would lower the barrier to entry a lot and would be a huge step forward.

β-redex Jun 10, 2025, 3:42 PM
3 points
2
in reply to: quetzal_rainbow’s comment on: β-redex’s Shortform
Not even fitting for Quick Takes? We could have a “Quicker Quick Takes” or “LW Shitposts” section for all I care. (More seriously if you really wanted some separate category for this you could name it something like “LW Microblog”.)

Also a lot of Eliezer’s tweets are quite high effort, to the point where some of them get cross-posted as top level posts. (E.g. https://www.lesswrong.com/posts/fPvssZk3AoDzXwfwJ/universal-basic-income-and-poverty )

β-redex’s Shortform

β-redexJun 9, 2025, 9:20 PM

4 points

31 comments LW link

β-redex Jun 9, 2025, 9:20 PM
36 points
27
on: β-redex’s Shortform
Why is a significant amount of content by some rationality adjacent people only posted on X/Twitter?
I hope I don’t have to explain why some people would rather not go near X/Twitter with a ten foot pole.
The most obvious example is Eliezer, who is much more active on Twitter than LW. I “follow” some people on Twitter by reading their posts using Nitter (e.g. xcancel.com ).
What triggered me to post this today is that it seems @Aella set her account to followers-only (I assume due to some recent controversy), so now the only way for me to read her tweets would be to create a Twitter account.
Why can’t some of this content be mirrored e.g. as LW Quick Takes, or on basically any other platform outside Twitter’s walled garden? A lot of internet personalities cross-post their stuff to multiple platforms these days. One of the best examples I have seen is Molly White, she has her own self-hosted microblog feed which is pushed to Twitter, Bluesky, and Mastodon simultaneously: https://www.mollywhite.net/micro
I think even just having Eliezer’s content available somewhere else would be valuable enough to the community for the LW team to possibly assist with some technical solution here.

β-redex May 25, 2025, 9:16 AM
1 point
0
on: Dating Roundup #3: Third Time’s the Charm
A guide to taking the perfect dating app photo. This area of your life is important, so if you intend to take dating apps seriously then you should take photo optimization seriously, and of course you can then also use the photos for other things.

It has been ~1 year since this was posted, and photorealistic image generation has now gone mainstream with e.g. ChatGPT introducing it. People can now generate “improved” photos of themselves.
How has this affected dating apps? Could anyone actively using them weigh in on this?
I imagine the equilibrium to be everyone having extremely attractive AI generated photos of themselves. (If the person is attractive to begin with the AI version probably only has some minor tweaks compared to the reference photos, but it they are not so attractive originally the difference could be quite jarring.)
How far are we from that equilibrium right now and how fast are things changing? Does anything other than a race to the bottom with AI photos seem to be happening? (E.g. do people already have an aversion to AI photos in sufficient numbers that they penalize photos that look “too good”?)

β-redex May 12, 2025, 10:32 PM
10 points
0
in reply to: tcheasdfjkl’s comment on: How to title your blog post or whatever

What? I am telling you it is.

and preferentially read things that take less cognitive effort (all else equal, of course)

Sorry, no offense meant, I am just genuinely surprised. But I believe you now, I guess our experiences are just very different in this regard.

β-redex May 12, 2025, 9:40 PM
2 points
0
in reply to: tcheasdfjkl’s comment on: How to title your blog post or whatever
that sentence is semantically dense and grammatically complicated. I have to put in some work to break it down into noun phrases and such and figure out how it fits together. requiring cognitive work of potential readers before they’ve even decided if they want to read your thing is extremely anti-memetic

Sorry, but I call bullshit on this being a problem for you, or any other LW reader.

Now you are probably right that if you take the general population, for a significant number of people parsing anything but the simplest grammatical structures is going to impart noticeable extra cognitive load, lowering overall memetic fitness.

But as the post outlined, we are not optimizing for the number clicks, we are optimizing for something like P(loves article|clicked). See also https://www.lesswrong.com/posts/vidXh2DJtnqH5ysrZ/a-blog-post-is-a-very-long-and-complex-search-query-to-find

So if you are worried about someone bouncing off the title because of its grammatical complexity, you better also write the article with simple grammar (and simple content). Are there situations where your main goal is to reach as many people as possible? Sure, but for that you probably want to optimize both the title and content with that in mind. And at this point what you are doing is probably more like “political communication” than “writing something for like-minded people”.

the tone of it sounds like a dry academic paper. those are typically not very fun to read. it signals that this will also not be fun to read

For me it signals more positive things like seriousness and better epistemics^[1], but you probably have a point that there is space to signal the tone of the article in the title. Still, I don’t think reducing its information density is the right way to do it.
1. ↩︎
  Well, on blog posts at least. On actual academic papers everyone is expected to write in a serious sounding academic style, so there is much less signal there.

β-redex May 12, 2025, 7:41 PM
7 points
−3
on: How to title your blog post or whatever
I’m worried about Chicago

In what world is this a good title? It basically gives zero information about the topic of the article, it is exactly the kind of clickbait title that purposefully omits any relevant information from the title to try to make people click. I personally associate such clickbait titles with terrible content, and they make me much less likely to click.^[1]

What would be wrong with just using the subtitle as the actual title? It’s much more informative:

Slowing national population growth plus remote work spell big trouble for Midwestern cities
1. ↩︎
  Unfortunately various platforms actively push authors towards clickbait titles, and even if the author dislikes them they often don’t have a choice. Often good content is actually hiding behind terrible clickbait titles (and thumbnails) nowadays, so I sometimes have to make an active effort to click on clickbait because otherwise I could be missing out on discovering some good content.

β-redex May 8, 2025, 10:05 PM
9 points
2
on: Orienting Toward Wizard Power
Interesting. While the post resonates with me, I feel like I am trying to go in the opposite direction right now, trying to avoid getting nerd sniped by all the various fields I could be getting into, and instead strategically choosing the skills so that they are the most useful for solving the bottlenecks for my other goals that are not “learning cool technical things”.

Which is interesting, because so far based on your posts you struck me as the kind of person I am trying to be more like in this regard, being more strategic about my goals. So maybe the pendulum swings back, and eventually you find out that letting yourself get nerd sniped by random things does have some hidden benefits? I guess I will find out in a few years (if we are alive).

β-redex Apr 26, 2025, 2:39 PM
12 points
1
in reply to: faul_sname’s comment on: faul_sname’s Shortform
Wonder if correctness proofs (checked by some proof assistant) can help with this.^[1]

I think the main bottleneck in the past for correctness proofs was that it takes much more effort to write the proofs than it takes to write the programs themselves, and current automated theorem provers are nowhere near good enough.

Writing machine checked proofs is a prime RL target, since proof assistant kernels should be adversarially robust. We have already seen great results from stuff like AlphaProof.
1. ↩︎
  One counterargument I could see is that writing the correctness properties themselves could turn out to be a major bottleneck. It might be that for most real world systems you can’t write succinct correctness properties.

β-redex Apr 25, 2025, 7:17 PM
7 points
5
in reply to: Lucius Bushnaq’s comment on: Alexander Gietelink Oldenziel’s Shortform
Just to add another data point, as a software engineer, I also find it hard to extract utility from LLMs. (And this has not been for a lack of trying, e.g. at work we are being pushed to use LLM enabled IDEs.) I am constantly surprised to hear when people on the internet say that LLMs are a significant productivity boost for them.

My current model is that LLMs are better if you are working on some mainstream problem domain using a mainstream tech stack (language, library, etc.). This is approximately JavaScript React frontend development in my mind, and as you move away from that the less useful LLMs get. (The things I usually work on are using a non-mainstream tech stack and/or have a non-mainstream problem domain (but in my mind all interesting problems are non-mainstream in that sense), so this would explain my lack of success.)

β-redex Apr 17, 2025, 5:31 PM
5 points
0
in reply to: niplav’s comment on: niplav’s Shortform
I would definitely be interested if you found a way to self-review recordings of your social interactions for improvement. The main roadblock I see is that either you tell the other parties that you are recording, which will probably influence their behavior a lot and erase most of the signal you were looking for in the first place, or you don’t, which does feel a bit unethical.

β-redex Mar 31, 2025, 9:50 PM
1 point
0
in reply to: Yair Halberstadt’s comment on: Jan Betley’s Shortform
What does “obscure” mean here? (If you label the above “obscure”, I feel like every query I consider “non-trivial” could be labeled obscure.)

I don’t think Lean is obscure, it’s one of the most popular proof assistants nowadays. The whole Lean codebase should be in the AIs training corpus (in fact that’s why I deliberately made sure to specify an older version, since I happen to know that the olean header changed recently.) If you have access to the codebase, and you understand the object representation, the solution is not too hard.

Here is the solution I wrote just now:^[1]
```
import sys, struct
assert sys.maxsize > 2**32
f = sys.stdin.buffer.read()
def u(s, o, l): return struct.unpack(s, f[o:o+l])
b = u("Q", 48, 8)[0]
def c(p, i): return u("Q", p-b+8*(i+1), 8)[0]
def n(p):
    if p == 1: return []
    assert u("iHBB", p-b, 8)[3] == 1 # 2 is num, not implemented
    s = c(p, 1) - b
    return n(c(p, 0)) + [f[s+32:s+32+u("Q", s + 8, 8)[0]-1].decode('utf-8')]
a = c(u("Q", 56, 8)[0], 1) - b
for i in range(u("Q", a+8, 8)[0]):
    print('.'.join(n(u("Q", a+24+8*i, 8)[0])))
```
(It’s minified to both fit in the comment better and to make it less useful as future AI training data, hopefully causing this question to stay useful for testing AI skills.)
1. ↩︎
  I wrote this after I wrote the previous comment, my expectation that this should be a not-too-hard problem was not informed by me actually attempting it, only by rough knowledge of how Lean represents objects and that they are serialized pretty much as-is.

β-redex Mar 31, 2025, 9:41 PM
1 point
0
in reply to: Yair Halberstadt’s comment on: Tormenting Gemini 2.5 with the [[[]]][][[]] Puzzle
Yep, that “standard library” part sure seems problematic, I am not sure if an algorithm for listing primes is shorter than just the above lookup table.

β-redex Mar 31, 2025, 7:05 PM
1 point
0
in reply to: β-redex’s comment on: Jan Betley’s Shortform
Just to give an example, here is the kind of prompt I am thinking of. I am being very specific about what I want, I think there is very little room for misunderstanding about how I expect the program to behave:

Write a Python program that reads a .olean file (Lean v4.13.0), and outputs the names of the constants defined in the file. The program has to be standalone and only use modules from the python standard library, you cannot assume Lean to be available in the environment.

o3-mini gives pure garbage hallucination for me on this one, like it’s not even close.

β-redex Mar 31, 2025, 6:34 PM
1 point
0
in reply to: Jan Betley’s comment on: Jan Betley’s Shortform

If your answer to question A is “a specific thing,” and your answer to B is “yes, I’m very clear on what I want,” then just explain it thoroughly, and you’re likely to get satisfying results. Impressive examples like “rewrite this large complex thing that particular way” fall into this category.

Disagree. It sounds like by “being specific” you mean that you explain how you want the task to be done to the AI, which in my opinion can only be mildly useful.

When I am specific to an AI about what I want, I usually still get buggy results unless the solution is easy. (And asking the AI to debug is only sometimes successful, so if I want to fix it I have to put in a lot of work to understand the code the AI wrote carefully to debug it.)

β-redex Mar 30, 2025, 5:13 AM
3 points
0
in reply to: β-redex’s comment on: Tormenting Gemini 2.5 with the [[[]]][][[]] Puzzle
I guess the reasoning for why the solution given in the post is more “valid” than this one is “something something Occam’s razor” or that it is “more elegant” (whatever “elegant” means), but if someone can make a more precise argument I would be interested to hear. (In particular, in my mind Occam’s razor is something to do with empiricism, while what we are doing here is pure logic, so not sure how it exactly applies?)

β-redex

β-re­dex’s Shortform

β-redex’s Shortform