I do think it implies something about what is happening behind the scenes when their new flagship model is smaller and less capable than what was released a year ago.
Daniel
I am surprised to hear this, especially “I don’t think it has lasting value”. In my opinion, this post has aged incredibly well. Reading it now, knowing that the EA criticism contest utterly failed to do one iota of good with regard to stopping the giant catastrophe on the horizon (FTX), and seeing that the top prizes all went to long, well-formatted essays offering incremental suggestions on heavily trodden topics while the one guy vaguely gesturing at the actual problem (https://forum.effectivealtruism.org/posts/T85NxgeZTTZZpqBq2/the-effective-altruism-movement-is-not-above-conflicts-of) was ignored, cements this as one of your more prophetic works.
Just to pull on some loose strings here, why was it okay for Ben Pace to unilaterally reveal the names of Kat Woods and Emerson Spartz, but not for Roko to unilaterally reveal the names of Alice and Chloe? Theoretically Ben could have titled his post, “Sharing Information About [Pseudonymous EA Organization]”, and requested the mods enforce anonymity of both parties, right? Is it because Ben’s post was first so we adopt his naming conventions as the default? Is it because Kat and Emerson are “public figures” in some sense? Is it because Alice and Chloe agreed to share information in exchange for anonymity? That was an agreement with Ben. Why do we assume that the agreement between Ben Pace and Alice/Chloe is binding upon LessWrong commenters in general? I agree that it feels wrong to reveal the identities of Alice and/or Chloe without concrete evidence of major wrongdoing, but I don’t think we have a good theoretical framework for why that is.
Wait, that link goes to an archive page from well after Chloe was hired. When I look back to the screen captures from the period of time that Chloe would have seen, there are no specific numbers given for compensation (would link them myself, but I’m on mobile at the moment).
If the ad that Chloe saw said $60,000 - $100,000 in compensation in big bold letters at the top, then that seems like a bait and switch, but the archives from late 2021 list travel as the first benefit, which seems accurate to what the compensation package actually was.
Maybe I’m projecting more economic literacy than I should, but any time I read something like “benefits package worth $X”, I mentally decompose it into its component parts. A benefits package nominally worth $X provides economic value of less than $X, because you lose the option value you would have had with liquid cash.
The way I would conceptualize the compensation offered (and the way it is presented in the Nonlinear screenshots) is $1000/month + all expenses paid while traveling around fancy destinations with the family. I kind of doubt that Chloe had a mental model of how $40,000/yr in fancy travel destinations differs from $70,000/yr in fancy travel destinations. There could potentially be unrecorded verbal conversations that would make me feel differently about this, but I don’t currently feel like Chloe got materially shafted other than that she probably didn’t enjoy the travel as much as she thought she would.
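To make that decomposition concrete, here is a minimal sketch of the mental arithmetic, using the round figures discussed above; the fungibility discount is a hypothetical illustration, not a claim about what Chloe actually valued:

```python
# Rough decomposition of the claimed compensation package.
# All figures are the approximate round numbers discussed in this thread,
# not actual accounting from either party.

cash_salary = 1_000 * 12          # $1k/month stipend -> $12k/year
claimed_total = 75_000            # verbally discussed total, per Ben's post

# The travel/room/board component has to carry the rest of the claimed value:
implied_travel_value = claimed_total - cash_salary
print(f"Implied in-kind travel value: ${implied_travel_value:,}/year")  # $63,000/year

# In-kind benefits are not fungible, so their value to the recipient is at
# most their sticker price. A hypothetical discount factor illustrates the
# option-value loss relative to liquid cash:
fungibility_discount = 0.7  # made-up; depends entirely on the recipient
effective_value = cash_salary + implied_travel_value * fungibility_discount
print(f"Effective value at a 30% discount: ${effective_value:,.0f}/year")  # $56,100/year
```

The exact discount is unknowable from the outside, but the qualitative point stands: employer-directed travel spending is worth strictly less to the recipient than the same amount of salary.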
I did notice these. I specifically used the word “loadbearing” because almost all of these either don’t matter much or their interpretation is entirely context-dependent. I focused on the salary bullet point because failing to pay an agreed-upon salary is both
1. A big deal, and
2. Bad in almost any context.
The other ones that I think are pretty bad are the Adderall smuggling and the driving without a license, but my prior on “what is the worst thing the median EA org has done” is somewhere between willful licensing noncompliance and illegal amphetamine distribution.
Yeah, I’ve been going back and checking things as they were stated in the original “Sharing Information About Nonlinear” post. Rereading it, I was surprised at how few specific loadbearing factual claims there were at all. Lots of “vibes-based reasoning” as they say. I think the most damning single paragraph with a concrete claim was:
> Chloe’s salary was verbally agreed to come out to around $75k/year. However, she was only paid $1k/month, and otherwise had many basic things compensated i.e. rent, groceries, travel. This was supposed to make traveling together easier, and supposed to come out to the same salary level. While Emerson did compensate Alice and Chloe with food and board and travel, Chloe does not believe that she was compensated to an amount equivalent to the salary discussed, and I believe no accounting was done for either Alice or Chloe to ensure that any salary matched up. (I’ve done some spot-checks of the costs of their AirBnbs and travel, and Alice/Chloe’s epistemic state seems pretty reasonable to me.)
I think this is just false. Nonlinear provided enough screenshot evidence to prove that Chloe agreed to exactly the arrangement that she ultimately got. Yes, it was a shitty job, but it was also a shitty job offer, and Chloe seems to have agreed to that shitty job offer.
Also, the more I read about and cross-reference the Alice material, the less sense it makes. Either Nonlinear is putting on a masterclass in the Chewbacca defense, or none of the Alice information provided by either party is evidence of anything.
I think what is bugging me about this whole situation is that there doesn’t seem to be any mechanism of accountability for the (allegedly) false and/or highly misleading claims made by Alice. You seem to be saying something like, “we didn’t make false and/or highly misleading claims, we just repeated the false and/or highly misleading claims that Alice told us, then we said that Alice was maybe unreliable,” as if this somehow makes the responsibility (legal, ethical, or otherwise) to tell the truth disappear.
Here is what Ben said in his post, “Closing Notes on Nonlinear Investigation”:
> Eventually, after getting to talk with Alice and Chloe, it seemed to me Alice and Chloe would be satisfied to share a post containing accusations that were received as credible. They expected that the default trajectory, if someone wrote up a post, was that the community wouldn’t take any serious action, that Nonlinear would be angry for “bad-mouthing” them, and quietly retaliate against them (by, for instance, reaching out to their employer and recommending firing them, and confidentially sharing very negative stories). They wanted to be confident that any accusations made would be strong enough that people wouldn’t just shrug and move on with their lives. If that happened, the main effect would be to hurt them further and drive them out of the ecosystem.
>
> It seemed to me that I could not personally vouch for any of the claims (at the time), but also that if I did vouch for them, then people would take them seriously. I didn’t know either Alice or Chloe before, and I didn’t know Nonlinear, so I needed to do a relatively effortful investigation to get a better picture of what Nonlinear was like, in order to share the accusations that I had heard.
It’s not 100% clear, but it seems like Ben is saying that he does (at the time he wrote that post) vouch for the claims of Alice that he included in his post. If Ben did vouch for those claims, and those claims were wrong, and those wrong claims caused large amounts of damage to Nonlinear, and Ben thinks that any retaliation against Alice is unacceptable, then that leaves Ben Pace and Lightcone ultimately responsible, does it not?
> Spencer sent us a screenshot about the vegan food stuff 2 hours before publication, which Ben didn’t get around to editing in before the post went live, but that’s all the evidence that I know about that you could somehow argue we had but didn’t include. It is not accurate that Nonlinear sent credible signals of having counterevidence before the post went live.
Uh, actually I do think that being sent screenshots showing that claims made in the post are false 2 hours before publication is a credible signal that Nonlinear has counterevidence.
I can’t believe I’m saying this, but I’m currently leaning towards the position that Lightcone deserves to be sued for defamation. Maybe not for “maximum damages permitted by law” (since those are truly excessive), but you probably owe them significant material compensation.
This is a better response than I was expecting. There are definitely a few non sequiturs (e.g., you can’t just add travel expenses onto a $1,000/month salary and call that $70,000-$75,000 in compensation; the whole point of money is that it’s fungible and can be spent however you like), but the major accusations appear refuted.
The tone is combative, but if the facts are what Nonlinear alleges then a combative tone seems… appropriate? I’m not sure how I feel about the “Sharing Information About Ben Pace” section, but I do think it was a good idea to mention the “elephant in the room” about Ben possibly white-knighting for Alice, since that’s the only way I can get this whole saga to make sense.
If the factions were Altman-Brockman-Sutskever vs. Toner-McCauley-D’Angelo, then even assuming Sutskever was an Altman loyalist, any vote to remove Toner would have been tied 3-3.
A 3-3 tie between the founder-CEO of the company, the founder-president of the company, and the chief scientist of the company on one side, and three people with completely separate day jobs who never interact with rank-and-file employees on the other, is not a stable equilibrium. There are ways to leverage that sort of soft power into breaking the formal deadlock, as we saw last week.
It reminds me of the loyalty successful generals like Caesar and Napoleon commanded from their men. The engineers building GPT-X weren’t loyal to The Charter, and they certainly weren’t loyal to the board. They were loyal to the projects they were building and to Sam, because he was the one providing them resources to build and pumping the value of their equity-based compensation.
I think it’s almost always fine for criticized authors to defend themselves in the comments, even if their defense isn’t very good.
> In my original answers I address why this is not the case (private communication serves this purpose more naturally).
This stood out to me as strange. Are you referring to this comment?
> And regardless of these resources you should of course visit a nutritionist (even if very sporadically, or even just once when you start being vegan) so that they can confirm the important bullet points, whether what you’re doing broadly works, and when you should worry about anything. (And again, anecdotally this has been strongly stressed and acknowledged as necessary by all vegans I’ve met, who are not few).
>
> The nutritionist might recommend yearly (or less frequent) blood testing, which does feel like a good failsafe. I’ve been taking them for ~6 years and all of them have turned out perfect (I only supplement B12, as the nutritionist recommended).
>
> I guess it’s not that much that there’s some resource that is the be-all end-all on vegan nutrition, but more that all of the vegans I’ve met have forwarded a really positive health-conscious attitude, and stressed the importance of these points.
It sounds like you’re saying that the nutritional requirements of veganism are so complex that they require individualized professional assistance, and that there is no one-page “do this and you will get all the nutrients you need” document that will work for the vast majority of vegans. You seem to dismiss this as a minor concern, but I don’t think it is.
> I have a lot of respect for Soto for doing the math and so clearly stating his position that “the damage to people who implement veganism badly is less important to me than the damage to animals caused by eating them”
As I mentioned many times in my answer, that’s not the (only) trade-off I’m making here. More concretely, I consider the effects of these interventions on community dynamics and epistemics possibly even worse (due to future actions the community might or might not take) than the suffering experienced by farmed animals murdered for members of our community to consume at present day.
After reading your post, I feel like you are making a distinction without a difference here. You mention community dynamics, but they are all community dynamics about the ethical implications of veganism in the community, not the epistemic implications. It seems perfectly fair for Elizabeth to summarize your position the way she does.
> The real reason why it’s enraging is that it rudely and dramatically implies that Eliezer’s time is much more valuable than the OP’s.
It does imply that, but it’s likely true that Eliezer’s time is more valuable (or at least in higher demand) than OP’s. I also don’t think Eliezer (or anyone else) should have to spend much effort worrying about whether what they’re about to say might come off as impolite or uncordial.
> If he actually wanted to ask OP what the strongest point was, he should have just DMed him instead of engineering this public spectacle.
I don’t agree here. Commenting publicly opens the floor up for anyone to summarize the post or to submit what they think is the strongest point. I think it’s actually less pressure on Quintin this way.
It seems you and Paul are correct. I still think this suggests that there is something deeply wrong with RLHF, but less in the “intentionally deceives humans” sense, and more in the “this process consistently writes false data to memory” sense.
Perhaps I am misunderstanding Figure 8? I was assuming that they asked the model for an answer, then asked the model what probability it assigns to that answer being correct. Under this assumption, it looks like the pre-trained model outputs well-calibrated probabilities, but the RLHF model gives exaggerated probabilities because it thinks that will trick you into giving it higher reward.
In some sense this is expected. The RLHF model isn’t optimized for helpfulness; it is optimized for perceived helpfulness. It is still disturbing that “alignment” has made the model objectively worse at giving correct information.
My guess is that RLHF is unwittingly training the model to lie.
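For concreteness, here is a minimal sketch of the calibration check I am imagining Figure 8 to represent; the elicitation procedure and all the numbers are my assumptions for illustration, not anything taken from the paper:

```python
# Sketch of a calibration check: ask the model a question, record whether its
# answer was correct and what confidence it stated, then compare accuracy
# against stated confidence within each confidence bin.
# The data below is made up purely for illustration.
from collections import defaultdict

# (stated_confidence, answer_was_correct) pairs from hypothetical model outputs
results = [(0.9, True), (0.9, False), (0.9, True), (0.9, True),
           (0.7, True), (0.7, False), (0.8, False), (0.6, True)]

bins = defaultdict(list)
for confidence, correct in results:
    bins[confidence].append(correct)

# A calibrated model's per-bin accuracy matches its stated confidence.
# The RLHF failure mode described above shows up as accuracy < confidence.
for confidence in sorted(bins):
    outcomes = bins[confidence]
    accuracy = sum(outcomes) / len(outcomes)
    print(f"stated {confidence:.1f} -> observed {accuracy:.2f} (n={len(outcomes)})")
```

Under this kind of tally, the pattern I described would show up as per-bin accuracy consistently below stated confidence for the RLHF model but not for the pre-trained one.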
There are reasonable and coherent forms of moral skepticism in which the statement “It is morally wrong to eat children and mentally disabled people” is false, or at least meaningless. The disgust reaction upon hearing the idea of eating children is better explained by the statement “I don’t want to live in a society where children are eaten,” which is much better grounded in physical reality.
What is disturbing about the example is that this seems to be a person who believes that objective morality exists, but that it wouldn’t entail that eating children is wrong. This is indeed a red flag that something in the argument has gone seriously wrong.
I don’t think we can engage in much “community-wide introspection” without discussing the object-level issues in question, and I can’t think of a single instance of an online discussion of that specific issue going particularly well.
That’s why I’m (mostly) okay tabooing these sorts of discussions. It’s better to deal with the epistemic uncertainty than to risk converging on a false belief.