Robert Kennedy

Karma: 105

Robert Kennedy Jul 28, 2023, 3:13 AM
8 points
1
on: AI #22: Into the Weeds
RE: GPT getting dumber, that paper is horrendous.

The code gen portion was completely thrown off because of Markdown syntax (the authors mistook back-ticks for single-quotes, afaict). I think the update to make there is that it is decent evidence that there was some RLHF on ChatGPT outputs. If you remember from that “a human being will die if you don’t reply with pure JSON” tweet, even that final JSON code was escaped with markdown. My modal guess is that markdown was inserted via kludge to make the ChatGPT UX better, and then RLHF was done on that kludged output. Code sections are often mislabeled for what language they contain. My secondary guess is that the authors used an API which had this kludge added on top of it, such that GPT just wouldn’t output plaintext code, tho that is hard to square with the fact that there were any passing examples at all.

In the math portion they say GPT-4-0613 only averaged 3.8 CHARACTERS per response. Note that “[NO]” and “[YES]” both contain more than 3.8 characters. Note that GPT-4 answers hardly any queries with a single word. Note that the paper’s example answer for the primality question included 1000 characters, so the remaining questions apparently averaged 3 characters flat. Even if you think they only fucked up that data analysis: I also replicated GPT-4 failing to solve “large” number primality, and am close to calling that a cherry-picked example. It is a legit difficult problem for GPT, and I agree that anyone who goes to ChatGPT to replicate will find the answer they get back is a coin flip at best. But we need to say it again for the kids in the back: the claim is that GPT-4 got 2% on yes/no questions. What do we call a process that gets 2% on coin flip questions?
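To put a number on the coin flip point, here is a minimal sketch of the binomial tail: the chance that pure guessing scores 2% or worse on yes/no questions. The quiz size `n_questions = 100` is an assumption picked for illustration, not the paper's actual question count, and `prob_at_most` is a helper name made up for this sketch.

```python
from math import comb


def prob_at_most(k: int, n: int, p: float = 0.5) -> float:
    """P(X <= k) for X ~ Binomial(n, p): chance of k or fewer correct answers."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical quiz size, chosen only for illustration (not taken from the paper).
n_questions = 100
max_correct = 2  # a score of 2%

p_chance = prob_at_most(max_correct, n_questions)
print(f"P(score <= 2% by pure guessing) = {p_chance:.3e}")
```

A guesser essentially never scores that low, so a 2% score looks like a process that knows something and is reporting it backwards, or a grading script that is misreading the answers.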

Robert Kennedy May 1, 2023, 8:05 PM
3 points
1
on: What Boston Can Teach Us About What a Woman Is
If you take the distance between the North and South pole and divide it by ten million: voilà, you have a meter!

NB: The circumference of the Earth is ~40k km—this definition of a meter should instead mention the distance from the North or South pole to the Equator.

Robert Kennedy Mar 17, 2023, 1:15 AM
2 points
0
on: On the Crisis at Silicon Valley Bank

The problem with this is that you get whatever giant risks you aren’t measuring properly. That’s what happened at SVB, they bought tons of ‘safe’ assets while taking on a giant unsafe bet on interest rates because the system didn’t check for that. Also they cheated on the accounting, because the system allowed that too.

A very good example of Goodhart’s Law/misalignment. Highlighting for the skimmers. Thanks for the write up Zvi!

Tidbit to make this comment useful: “duration” is the (negative) derivative of price with respect to yield, so a bond with a duration of 10 will lose about 5% of its value (relative to par) after a 50 bip (0.5%) rate hike. So why do they call it duration? Well, suppose you buy a 10 year bond that pays 2% interest, and then tomorrow someone offers you a 3% 10 year bond. How much money do you have to pay to trade in yesterday’s bond? Well, pretty much you have to pay an extra 1% for each year of the bond’s life!
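A rough sketch of that rule of thumb, assuming annual coupons, a par value of 100, and a flat yield (all simplifications for illustration). The function names are hypothetical. It compares the first-order duration estimate against simply repricing the 2% bond at a 3% yield.

```python
def approx_price_change(duration: float, yield_change: float) -> float:
    """First-order estimate: fractional price change = -duration * change in yield."""
    return -duration * yield_change


def bond_price(coupon_rate: float, market_yield: float, years: int, par: float = 100.0) -> float:
    """Present value of annual coupons plus par repaid at maturity."""
    coupons = sum(par * coupon_rate / (1 + market_yield) ** t for t in range(1, years + 1))
    principal = par / (1 + market_yield) ** years
    return coupons + principal


# A duration-10 bond after a 50 bip hike: roughly a 5% loss.
print(f"{approx_price_change(10, 0.005):+.1%}")

# The 2% 10 year bond, priced when issued and again once 3% bonds are on offer.
at_issue = bond_price(0.02, 0.02, 10)
after_hike = bond_price(0.02, 0.03, 10)
print(f"price at 2% yield: {at_issue:.2f}, price at 3% yield: {after_hike:.2f}")
```

The repriced bond comes out somewhat less than 10% below par, since the later cash flows are discounted, which is why “1% per year of life” is a rule of thumb and not an identity.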

This is probably dead obvious to everyone in finance, but I only got into finance by joining fintech after a math undergrad, and it took me years to figure out why they called it duration when they are nice enough to call the second derivative “convexity”.

Robert Kennedy Feb 15, 2023, 2:07 AM
1 point
0
on: Evaluating 2022 ACX Predictions

New U.S. sanctions on Russia (70%): Scott holds, I sell to 60%.

This seems like a better sale than the sale on Russia going to war, by a substantial amount. So if I was being consistent I should have sold more here. Given that I was wrong about the chances of the war, the sale would have been bad, but I didn’t know that at the time. Therefore this still counts as a mistake not to sell more.

This seems like a conjunction fallacy. “US sanctions Russia” is very possible outside “Russia goes to war”, even if “Russia goes to war” implies “US sanctions Russia”. You had 30% on “major flare up in Russia-Ukraine”. Perhaps you are anchoring your relative sells or something?

I obviously agree that you know these things, and am only noting a self-flagellation that seemed unearned. Thanks for writing Zvi!

Robert Kennedy Feb 7, 2023, 2:54 PM
1 point
0
on: SolidGoldMagikarp (plus, prompt generation)
What prompts maximize the chance of returning these tokens?

Idle speculation: cloneembedreportprint and similar end up encoding similar to /EOF.

Robert Kennedy Dec 2, 2022, 9:30 PM
3 points
3
in reply to: TW123’s comment on: Did ChatGPT just gaslight me?
I am sorry for insulting you. My experience in the rationality community is that many people choose abstinence from alcohol, which I can respect, but I forgot that likely in many social circles that choice leads to feelings of alienation. While I thought you were signaling in-group allegiance, I can see that you might not have that connection. I will attempt to model better in the future, since this seems generalizable.
I’m still interested in whether the beet margarita with OJ was good~

Robert Kennedy Dec 2, 2022, 4:42 AM
−8 points
−6
on: Did ChatGPT just gaslight me?
Did you try the beet margarita with orange juice? Was it good?

To be honest, this exchange seems completely normal for descriptions of alcohol. Tequila is canonically described as sweet. You are completely correct that when people say “tequila is sweet” they are not trying to compare it to super stimulants like orange juice and coke. GPT might not understand this fact. GPT knows that the canonical flavor profile for tequila includes “sweet”, and your friend knows that it’d be weird to call tequila a sweet drink.

I think the gaslighting angle is rather overblown. GPT knows that tequila is sweet. GPT knows that most of the sugar in tequila has been converted to alcohol. GPT may not know how to reconcile these facts.

Also, I get weird vibes from this post as generally performative about sobriety. You don’t know the flavor profiles of alcohol, and the AI isn’t communicating the flavor profiles of alcohol well. Why are you writing about the AI’s lack of knowledge about the difference between tequila’s sweetness and orange juice’s sweetness? You seem like an ill-informed person on the topic, and like you have no intention of becoming better informed. From where I stand, it seems like you understand alcohol taste less than GPT.

Robert Kennedy Nov 11, 2022, 3:22 AM
12 points
9
on: We must be very clear: fraud in the service of effective altruism is unacceptable
I wish this post talked about object level trade offs. It did that somewhat with the reference to the importance of “have a decision theory that makes it easier to be traded with”. However, the opening was extremely strong and was not supported:

I care deeply about the future of humanity—more so than I care about anything else in the world. And I believe that Sam and others at FTX shared that care for the world. Nevertheless, if some hypothetical person had come to me several years ago and asked “Is it worth it to engage in fraud to send billions of dollars to effective causes?”, I would have said unequivocally no.

What level of funding would make fraud worth it?

Edit to expand: I do not believe the answer is infinite. I believe the answer is possibly less than the amount I understand FTX has contributed (assuming they honor their commitments, which they maybe can’t). I think this post gestures at trading off sacred values, in a way that feels like it signals for applause, without actually examining the trade.

Robert Kennedy Sep 24, 2022, 10:51 PM
16 points
6
in reply to: janus’s comment on: Intelligence as a Platform
Thanks for the feedback, I am new to writing in this style and may have erred too much towards deleting sentences while editing. But, if you never cut too much you’re always too verbose, as they say. In particular, I appreciate the point that, when talking about how I am updating, I should make clear where I am updating from.

For instance, regarding human level intelligence, I was also describing relative to “me a year/month ago”. I relistened to the Sam Harris/Yudkowsky podcast yesterday, and they detour for a solid 10 minutes about how “human level” intelligence is a straw target. I think their arguments were persuasive, and that I would have endorsed them a year ago, but that they don’t really apply to GPT. I had pretty much concluded that the difference between a 150 IQ AI and a 350 IQ AI would be a matter of scale. GPT as a simulator/platform seems to me like an existence proof for a not-artificially-handicapped human level AI attractor state. Since I had previously thought the entire idea was a distraction, this is an update towards human level AI.

The impact on AI timelines mostly follows from diversion of investment. I will think on if I have anything additional to add on that front.

Intelligence as a Platform

Robert Kennedy Sep 23, 2022, 5:51 AM
10 points
5 comments · 3 min read

Robert Kennedy Sep 7, 2022, 8:32 PM
4 points
0
in reply to: lsusr’s comment on: Nine nines
Right, okay. I am trying to learn your ontology here, but the concepts are not close to my current inferential distance. I don’t understand what the 95% means. I don’t understand why the d100 has 99% chance to be fixed after one roll, while a d10 only has 90%. By the second roll I think I can start to stomach the logic here though, so maybe we can set that aside.

In my terms, when you say that a Bayesian wouldn’t bet $1bil:$1 that the sun will rise tomorrow, that doesn’t seem correct to me. It’s true that I wouldn’t actually make that nightly bet, because the risk free rate is like 3% per annum so it’d be a pretty terrible allocation of risk, plus it seems like it’d be an assassination market on the rotation of Earth and I don’t like incentivizing that as a matter of course. But does the math of likelihood ratios not work as well to bury bad theories under a mountain of evidence?

I think not assigning 1e-40 chance to an event is an epistemological choice separate from Bayesianism. The math seems quite capable of leading to that conclusion, and recovering from that state quickly enough.

I think maybe the crux is “There is no way for a Bayesian to be wrong. Everything is just an update. But a Frequentist who said the die was fair can be proven wrong to arbitrary precision.” You can, if the Bayesian announces their prior, know precisely how much of your arbitrary evidence they will require to believe the die is loaded.
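A minimal sketch of that last claim, under assumptions chosen purely for illustration: the only rival hypothesis is “this d20 is loaded to always roll 8”, so each observed 8 carries a 20:1 likelihood ratio against a fair die, and the announced prior on “loaded” is one in a million. The function names are made up for the sketch.

```python
def posterior(prior: float, likelihood_ratio: float, n_obs: int) -> float:
    """Posterior probability after n_obs independent observations, via odds form."""
    odds = prior / (1 - prior) * likelihood_ratio**n_obs
    return odds / (1 + odds)


def observations_needed(prior: float, likelihood_ratio: float, target: float) -> int:
    """Smallest number of observations that lifts the posterior to at least target."""
    if not (0 < prior < 1 and 0 < target < 1 and likelihood_ratio > 1):
        raise ValueError("need 0 < prior < 1, 0 < target < 1, likelihood_ratio > 1")
    n = 0
    while posterior(prior, likelihood_ratio, n) < target:
        n += 1
    return n


# Illustrative numbers: prior of 1e-6 on "loaded", 20:1 evidence per rolled 8.
print(observations_needed(prior=1e-6, likelihood_ratio=20.0, target=0.95))  # prints 6
```

So a Bayesian who announces a one-in-a-million prior has, under these assumptions, also announced that six straight 8s will move them past 95%.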

Again, I hope this is taken in the spirit I mean it, which is “you are the only self proclaimed Frequentist on this board I know of, so you are a very valuable source of epistemic variation that I should learn how to model”.

Robert Kennedy Sep 6, 2022, 8:34 PM
10 points
0
on: Nine nines
I am not sure I understand, probably because I am too preprogrammed by Bayesianism.

You roll a d20, it comes up with a number (let’s say 8). The Frequentist now believes there is a 95% chance the die is loaded to produce 8s? But they won’t bet 20:1 on the result, and instead they will do something else with that 95% number? Maybe use it to publish a journal article, I guess.

Robert Kennedy Sep 1, 2022, 12:08 AM
1 point
0
in reply to: Dennis Towne’s comment on: Grand Theft Education
I would like to note that the naive version of this is bad. First, the naive version falls prey to new grads (who generally have nothing) declaring bankruptcy immediately after graduation. Then, lenders are forced to ask for collateral, which gets rid of a GREAT quality our current system has—you can go to college even if your parents weren’t frugal, no matter their income. I think this criticism probably still lands with a 5 year time horizon, maybe less for a 10 year.

I like the concept that lenders would take an interest in which major you were getting, since that seems like something that could use an actuarial table. I think we would benefit from more directly incentivizing STEM (and other profitable) degrees, which IDR doesn’t seem to do. What if IDR left lenders holding the bag?

Robert Kennedy Jul 23, 2022, 4:03 AM
2 points
0
in reply to: Davis_Kingsley’s comment on: YouTubeTV and Spoilers
This was the UX I was going to mention—watching GSL (SC:BW) VoDs. There it is tricky, especially since individual games can vary so heavily.

Robert Kennedy May 23, 2022, 2:01 AM
2 points
on: Formula for a Shortage
This article was great! Please define WIC much earlier, that was how I felt reading it and the first feedback I got after sharing it. Thanks for writing this!

Robert Kennedy Apr 22, 2022, 12:45 AM
2 points
on: Covid 4/21/22: Variants Working Overtime
My understanding is that the math textbooks were banned in Florida for their use of the “Common Core” framework. I was a math educator, and my experience is that resistance to Common Core comes primarily from parents who hate math, and are confused why they can’t do their child’s math, and who somehow take this as a failure mode.

Robert Kennedy Apr 6, 2022, 10:35 PM
1 point
on: A Word to the Wise is Sufficient because the Wise Know So Many Words
I really appreciate this post. In Chinese, the vocal pronouns for “he” and “she” are the same (they are distinguished in writing). It is common for Chinese ESL students to mix the words “she” and “he” when speaking. I have been trying to understand this, and relate it to my (embarrassingly recent) understanding that probabilistic forecasts (which I now use ubiquitously) are a different “epistemology” than I used to have. This post is a very concrete exploration of the subject. Thank you!

Robert Kennedy Apr 1, 2022, 10:38 PM
1 point
in reply to: MondSemmel’s comment on: They Don’t Know About Second Booster
I think finding the correct link required a good heart. In the hope Zvi will see you, I am commenting to further boost visibility.

Robert Kennedy Apr 1, 2022, 10:09 PM
2 points
in reply to: Lukas Finnveden’s comment on: Replacing Karma with Good Heart Tokens (Worth $1!)
I think top level posts generate much more than 10x the value of the entire comments section combined, based off my impression that the majority of lurkers don’t get deep in the comments. I wonder if top level posts having a x^1.5 exponent would get closer to the ideal… That would also disincentivize post series...

Robert Kennedy Mar 15, 2022, 7:30 PM
2 points
in reply to: Gunnar_Zarncke’s comment on: Probabilistic Negotiation
No, since if I had rolled low I wouldn’t want to like, give them significantly more notice than necessary as I job hunted. I offered to do something like hash a seed to use on a RNG, they didn’t think that was necessary.
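For the curious, the “hash a seed” idea can be sketched as a commit-reveal scheme: publish the hash of a secret seed before the negotiation, reveal the seed afterwards, and let the other side recompute both the hash and the roll. This is only an illustration of the general technique, not what was actually proposed in detail. The d100 roll, the 32-byte seed, and the function names are all assumptions.

```python
import hashlib
import random
import secrets


def commit(seed: bytes) -> str:
    """Hex digest to publish up front; it binds the roller to this seed."""
    return hashlib.sha256(seed).hexdigest()


def verify(seed: bytes, commitment: str) -> bool:
    """Check that a revealed seed matches the earlier commitment."""
    return secrets.compare_digest(commit(seed), commitment)


def roll_d100(seed: bytes) -> int:
    """Deterministic roll in 1..100 derived from the seed (hypothetical die size)."""
    rng = random.Random(int.from_bytes(seed, "big"))
    return rng.randint(1, 100)


seed = secrets.token_bytes(32)  # kept private until the reveal
commitment = commit(seed)       # shared before the roll matters

# Later: reveal the seed, and the other party checks it and reproduces the roll.
print(verify(seed, commitment), roll_d100(seed))
```

The long random seed matters: with a short or guessable seed, the other party could brute-force the commitment and learn the roll early.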
