Yes, you are missing something.
Any DEADCODE that can be added to a 1kb program can also be added to a 2kb program. The net effect is a wash, and you will end up with the same ratio between their priors.
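To spell out the arithmetic, using the standard $2^{-\ell}$ length prior (where $\ell$ is a program's length in bits) and the same $d$ bits of padding added to both programs:

$$\frac{2^{-(\ell_1 + d)}}{2^{-(\ell_2 + d)}} = \frac{2^{-\ell_1}}{2^{-\ell_2}},$$

so the padding cancels and the ratio of the priors is unchanged.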
Thirder here (with acknowledgement that the real answer is to taboo ‘probability’ and figure out why we actually care)
The subjective indistinguishability of the two Tails wakeups is not a counterargument - it’s part of the basic premise of the problem. If the two wakeups were distinguishable, being a halfer would be the right answer (for the first wakeup).
Your simplified examples/analogies really depend on that fact of distinguishability. Since you didn’t specify whether your examples have it, the payoff structure is ambiguous.
I’ll also note you are being a little loose with your notion of ‘payoff’. You are calculating the payoff for the entire experiment, whereas I define the ‘payoff’ as the odds offered at each wakeup (since there’s no rule saying that Beauty has to bet the same way each time!).
To be concise, here’s my overall rationale:
Upon each (indistinguishable) wakeup, you are given the following offer:
If you bet H and win, you get $N.
If you bet T and win, you get $1.
If you believe T yields a higher EV, then you have a credence P(H) < 1/(1+N).
Betting T yields the higher EV for all N up to 2, which is consistent only with P(H) = 1/3. Thus you should be a thirder.
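If it helps, here’s a quick Monte Carlo sketch of that per-wakeup payoff (the setup and names are mine, just to illustrate the arithmetic):

```python
import random

def per_wakeup_returns(N, trials=100_000):
    """Average per-wakeup winnings for 'always bet H' vs 'always bet T'.
    Betting H pays $N on a Heads wakeup; betting T pays $1 on a Tails wakeup."""
    h_total = t_total = wakeups = 0
    for _ in range(trials):
        if random.random() < 0.5:   # Heads: one wakeup (Monday)
            wakeups += 1
            h_total += N
        else:                       # Tails: two wakeups (Monday and Tuesday)
            wakeups += 2
            t_total += 2 * 1
    return h_total / wakeups, t_total / wakeups

for N in (1.0, 1.9, 2.0, 2.1, 3.0):
    h, t = per_wakeup_returns(N)
    print(f"N={N}: bet-H earns {h:.2f}/wakeup, bet-T earns {t:.2f}/wakeup")
```

The crossover sits at N = 2, which is exactly the P(H) = 1/(1+N) = 1/3 threshold.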
Here’s a clarifying example where this interpretation becomes more useful than yours:
The experimenter flips a second coin. If the second coin is Heads (H2), then N = $1.50 on Monday and $2.50 on Tuesday. If the second coin is Tails, then the order is reversed.
I’ll maximize my EV if I bet T when N = $1.50, and H when N = $2.50. Both of these fall cleanly out of ‘thirder’ logic.
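Concretely, with thirder per-wakeup credences (which stay at $P(H) = 1/3$ whichever offer you happen to see):

$$EV(\text{bet H}\mid N) = \tfrac{1}{3}N, \qquad EV(\text{bet T}\mid N) = \tfrac{2}{3}\cdot 1,$$

so you bet H exactly when N > 2: heads at the $2.50 offer, tails at the $1.50 offer.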
What’s the ‘halfer’ story here? Your earlier logic doesn’t allow for separate bets on each awakening.
Thanks for sharing that study. It looks like your team is already well-versed in this subject!
You wouldn’t want something that’s too hard to extract, but I think restricting yourself to a single encoder layer is too conservative—LLMs don’t have to be able to fully extract the information from a layer in a single step.
I’d be curious to see how much closer a two-layer encoder would get to the ITO results.
Here’s my longer reply.
I’m extremely excited by the work on SAEs and their potential for interpretability. However, I think there is a subtle misalignment between the SAE architecture and loss function, and the actual desired objective function.
The SAE loss function is:
$$\mathcal{L}_{\text{SAE}}(x) = \|x - D(E(x))\|_2^2 + \lambda\,\|E(x)\|_1,$$
where $E$ is the single-layer encoder, $D$ is the decoder, and $\|\cdot\|_1$ is the $L_1$-norm (or $\|\cdot\|_0$, the $L_0$-norm, in the $L_0$ variants).
I would argue, however, that what you are actually trying to solve is the sparse coding problem:
$$\min_{D}\ \sum_{x}\ \min_{f}\ \Big(\|x - Df\|_2^2 + \lambda\,\|f\|_1\Big),$$
where, importantly, the inner optimization over $f$ is solved separately for each input (including at runtime).
Since $D$ is an overcomplete basis, finding the $f$ that minimizes the inner loop (also known as basis pursuit denoising[1]) is a notoriously challenging problem, one which a single-layer encoder is underpowered to compute. The SAE’s encoder thus introduces a significant error $\epsilon(x) = E(x) - f^*(x)$, which means that your actual loss function is:
$$\mathcal{L}_{\text{SAE}}(x) = \|x - D\big(f^*(x) + \epsilon(x)\big)\|_2^2 + \lambda\,\|f^*(x) + \epsilon(x)\|_1.$$
The magnitude of the errors would have to be determined empirically, but I suspect they are large enough to be a significant source of reconstruction error.
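For concreteness, here’s a minimal sketch of what solving the inner problem per input actually looks like (plain ISTA with numpy; the variable names are mine, and D stands in for the matrix of decoder directions):

```python
import numpy as np

def ista(x, D, lam, n_iters=200):
    """Approximately solve min_f ||x - D f||_2^2 + lam * ||f||_1 (basis pursuit denoising).

    x: (d,) activation vector; D: (d, m) overcomplete dictionary of decoder directions.
    """
    L = np.linalg.norm(D, 2) ** 2                 # Lipschitz constant (squared spectral norm of D)
    f = np.zeros(D.shape[1])
    for _ in range(n_iters):
        grad = D.T @ (D @ f - x)                  # gradient of the quadratic term (up to a factor of 2)
        z = f - grad / L                          # gradient step
        f = np.sign(z) * np.maximum(np.abs(z) - lam / (2 * L), 0.0)  # soft-threshold (prox of the L1 term)
    return f
```

Comparing the encoder’s one-shot output against something like this, per input, is one way to estimate how large $\epsilon$ really is.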
There are a few things you could do to reduce the error:
Ensuring that $D$ obeys the restricted isometry property[2] (i.e. a cap on the cosine similarity of decoder weights), or, barring that, adding a term to your loss function that at least penalizes high cosine similarities (see the sketch after this list).
Adding extra layers to your encoder, so it’s better at solving for $f^*$.
Empirical studies to see how large the feature error is / how much reconstruction error it is adding.
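Here’s a rough sketch of the loss-term idea from the first bullet (PyTorch; `W_dec` is a hypothetical `[n_features, d_model]` decoder weight matrix, and `mu` is a coefficient you’d have to tune):

```python
import torch

def cosine_penalty(W_dec: torch.Tensor) -> torch.Tensor:
    """Penalize high pairwise cosine similarity between decoder feature directions."""
    W = torch.nn.functional.normalize(W_dec, dim=1)   # unit-norm feature directions
    gram = W @ W.T                                    # pairwise cosine similarities
    gram = gram - torch.diag(torch.diag(gram))        # drop self-similarities
    return (gram ** 2).mean()

# total_loss = reconstruction_loss + l1_coeff * sparsity_loss + mu * cosine_penalty(W_dec)
```

For very large dictionaries you’d want to evaluate this on random subsets of features rather than the full Gram matrix.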
This is great work. My recommendation: add a term in your loss function that penalizes features with high cosine similarity.
I think there is a strong theoretical underpinning for the results you are seeing.
I might try to reach out directly—some of my own academic work is directly relevant here.
This is one of those cases where it might be useful to list out all the pros and cons of taking the 8 courses in question, and then thinking hard about which benefits could be achieved by other means.
Key benefits of taking a course (vs. Independent study) beyond the signaling effect might include:
precommitting to learning a certain body of knowledge
curation of that body of knowledge by an experienced third party
additional learning and insight from partnerships / teamwork / office hours
But these depend on the courses and your personality. The precommitment might be unnecessary given your personal work habits, the curation might be misaligned with what you are interested in learning, and the other students or TAs may not have useful insights that you can’t figure out on your own.
Hope that helps.
Instead of demanding orthogonal representations, just have them obey the restricted isometry property.
Basically, instead of requiring $\langle a_i, a_j \rangle = 0$ for all pairs of feature directions, we just require $|\langle a_i, a_j \rangle| \leq \epsilon$.
This would allow a polynomial number of sparse shards while still allowing full recovery.
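A quick numerical illustration of how much room the relaxed condition buys (the dimensions here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 512, 2000                                  # 2000 directions in a 512-dim space
A = rng.standard_normal((n, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)     # unit-norm directions

overlaps = np.abs(A @ A.T)                        # pairwise |cosine similarity|
np.fill_diagonal(overlaps, 0.0)
print(overlaps.max())                             # worst-case overlap is roughly 0.25
```

Exact orthogonality caps you at 512 directions; tolerating a small $\epsilon$ lets you pack far more while every pair stays nearly orthogonal.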
I think the success or failure of this model really depends on the nature and number of the factions. If interfactional competition gets too zero-sum (this might help us, but it helps them more, so we’ll oppose it) then this just turns into stasis.
During ordinary times, vetocracy might be tolerable, but it will slowly degrade state capacity. During a crisis it can be fatal.
Even in America, we only see this factional veto in play in a subset of scenarios—legislation under divided government. Plenty of action at the executive level or in state governments doesn’t have to worry about this.
You switch positions throughout the essay, sometimes in the same sentence!
“Completely remove efficacy testing requirements” (Motte) “… making the FDA a non-binding consumer protection and labeling agency” (Bailey)
“Restrict the FDA’s mandatory authority to labeling” logically implies they can’t regulate drug safety and can’t order recalls of dangerous products. Bailey! “… and make their efficacy testing completely non-binding” Back to the Motte again.
“Pharmaceutical manufacturers can go through the FDA testing process and get the official ‘approved’ label if insurers, doctors, or patients demand it, but it’s not necessary to sell their treatment.” Again implies the FDA has no safety regulatory powers.
“Scott’s proposal is reasonable and would be an improvement over the status quo, but it’s not better than the more hardline proposal to strip the FDA of its regulatory powers.” Bailey again!
This is a Motte and Bailey argument.
The Motte is ‘remove the FDA’s ability to regulate drugs for efficacy’.
The Bailey is ‘remove the FDA’s ability to regulate drugs at all’.
The FDA doesn’t just regulate drugs for efficacy, it regulates them for safety too. This undercuts your arguments about off-label prescriptions, which were still approved for use by the FDA as safe.
Relatedly, I’ll note you did not address Scott’s point on factory safety.
If you actually want to make the hardline position convincing, you need to clearly state and defend that the FDA should not regulate drugs for safety.
The differentiation between CDT as a decision theory and FDT as a policy theory is very helpful at dispelling confusion. Well done.
However, why do you consider EDT a policy theory? It’s just picking actions with the highest conditional utility. It does not model a ‘policy’ in the optimization equation.
Also, the ladder analogy here is unintuitive.
This doesn’t make sense to me. Why am I not allowed to update on still being in the game?
I noticed that in your problem setup you deliberately removed n=6 from being in the prior distribution. That feels like cheating to me—it seems like a perfectly valid hypothesis.
After seeing the first chamber come up empty, that should definitively update me away from n=6. Why can’t I update away from n=5 ?
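To make the update I have in mind concrete (assuming $n$ counts the loaded chambers out of 6; adjust if your setup differs):

$$P(n \mid \text{first chamber empty}) \;\propto\; P(\text{empty} \mid n)\,P(n) \;=\; \frac{6-n}{6}\,P(n),$$

which is exactly zero at n = 6 and discounts n = 5 by the same mechanism, just less sharply.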
Counterpoint, robotaxis already exist: https://www.nytimes.com/2023/08/10/technology/driverless-cars-san-francisco.html
You should probably update your priors.
Nope.
According to the CDC pulse survey you linked (https://www.cdc.gov/nchs/covid19/pulse/long-covid.htm), the metrics for long covid are trending down. This includes the ‘currently experiencing’, ‘any limitations’, and ‘significant limitations’ categories.
How is this in the wrong place?
Nice. This also matches my earlier observation that the epistemic failure is one of not anticipating one’s change in values. If you do anticipate it, you won’t agree to this money pump.
I agree that the type of rationalization you’ve described is often practically rational. And it’s at most a minor crime against epistemic rationality. If anything, the epistemic crime here is not anticipating that your preferences will change after you’ve made a choice.
However, I don’t think this case is what people have in mind when they critique rationalization.
The more central case is when we rationalize decisions that affect other people; for example, Alice might make a decision that maximizes her preferences and disregards Bob’s, but after the fact she’ll invent reasons that make her decision appear less callous: “I thought Bob would want me to do it!”
While this behavior might be practically rational from Alice’s selfish perspective, she’s being epistemically unvirtuous by lying to Bob, degrading his ability to predict her future behavior.
Maybe you can use specific terminology to differentiate your case from the more central one, e.g. ‘preference rationalization’?
I can use a laptop to hammer in a nail, but it’s probably not the fastest or most reliable way to do so.
I don’t see how this is more of a risk for a shutdown-seeking goal, than it is for any other utility function that depends on human behavior.
If anything, the right move here is for humans to commit to immediately complying with plausible threats from the shutdown-seeking AI (by shutting it down). Sure, this destroys the immediate utility of the AI, but on the other hand it drives a very beneficial higher level dynamic, pushing towards better and better alignment over time.
There are, but what does having a length below 10^90 have to do with the Solomonoff prior? There’s no upper bound on the length of programs.