And English has it backwards. You can see the past, but not the future. The thing which just happened is most clear. The future comes at us from behind.
justinpombrio
Here’s the reasoning I intuitively want to apply:
where X = “you roll two 6s in a row by roll N”, Y = “you roll at least two 6s by roll N”, and Z = “the first N rolls are all even”.
This is valid, right? And not particularly relevant to the stated problem, due to the “by roll N” qualifiers mucking up the statements in complicated ways?
Where’s the pain?
Sure. For simplicity, say you play two rounds of Russian Roulette, each with a 60% chance of death, and you stop playing if you die. What’s the expected value of YouAreDead at the end?
With probability 0.6, you die on the first round
With probability 0.4*0.6 = 0.24, you die on the second round
With probability 0.4*0.4=0.16, you live through both rounds
So the expected value of the boolean YouAreDead random variable is 0.84.
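The arithmetic above can be checked directly (a sketch; the 0.6 death chance and two rounds are from the example, variable names are mine):

```python
# Expected value of the boolean YouAreDead after two rounds of
# Russian Roulette, each with a 0.6 chance of death; play stops on death.
p_die = 0.6
p_live = 1 - p_die

p_die_round1 = p_die              # 0.6
p_die_round2 = p_live * p_die     # 0.24
p_survive_both = p_live * p_live  # 0.16

expected_dead = p_die_round1 + p_die_round2  # E[YouAreDead]
print(round(expected_dead, 2))  # 0.84
```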
Now say you’re monogamous and go on two dates, each with a 60% chance to go well, and if they both go well then you pick one person and say “sorry” to the other. Then:
With probability 0.4*0.4=0.16, both dates go badly and you have no partner.
With probability 2*0.4*0.6 = 0.48, one date goes well and you have one partner.
With probability 0.6*0.6=0.36, both dates go well and you select one partner.
So the expected value of the HowManyPartnersDoYouHave random variable is 0.84, and the expected value of the HowManyDatesWentWell random variable is 0.48+2*0.36 = 1.2.
Now say you’re polyamorous and go on two dates with the same chance of success. Then:
With probability 0.4*0.4=0.16, both dates go badly and you have no partners.
With probability 2*0.4*0.6 = 0.48, one date goes well and you have one partner.
With probability 0.6*0.6=0.36, both dates go well and you have two partners.
So the expected value of both the HowManyPartnersDoYouHave random variable and the HowManyDatesWentWell random variable is 1.2.
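The monogamous and polyamorous cases can both be recovered by enumerating the four outcomes (a sketch of the outcome tables above; variable names are mine):

```python
# Enumerate the four outcomes of two independent dates (p = 0.6 each),
# tracking partners under monogamous vs. polyamorous rules.
from itertools import product

p = 0.6
e_partners_mono = e_partners_poly = e_dates_went_well = 0.0
for outcomes in product([True, False], repeat=2):
    prob = 1.0
    for went_well in outcomes:
        prob *= p if went_well else (1 - p)
    wins = sum(outcomes)
    e_dates_went_well += prob * wins
    e_partners_poly += prob * wins          # keep every partner
    e_partners_mono += prob * min(wins, 1)  # pick at most one

print(round(e_partners_mono, 2))  # 0.84
print(round(e_partners_poly, 2))  # 1.2
```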
Note that I’ve only ever made statements about expected value, never about utility.
Probability of at least two successes: ~26%
My point is that in some situations, “two successes” doesn’t make sense. I picked the dating example because it’s cute, but for something more clear cut imagine you’re playing Russian Roulette with 10 rounds each with a 10% chance of death. There’s no such thing as “two successes”; you stop playing once you’re dead. The “are you dead yet” random variable is a boolean, not an integer.
If you’re monogamous and go to multiple speed dating events and find two potential partners, you end up with one partner. If you’re polyamorous and do the same, you end up with two partners.
One way to think of it is whether you will stop trying after the first success. Though that isn’t always the distinguishing feature. For example, you might start 10 job interviews at the same time, even though you’ll take at most one job.
However it is true that doing something with a 10% success rate 10 times will net you an average of 1 success.
For the easier to work out case of doing something with a 50% success rate 2 times:
25% chance of 0 successes
50% chance of 1 success
25% chance of 2 successes
Gives an average of 1 success.
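The three-line distribution above can be enumerated mechanically (a sketch; variable names are mine):

```python
# Distribution of successes for two independent 50% trials.
from itertools import product

counts = {0: 0.0, 1: 0.0, 2: 0.0}
for trial in product([0, 1], repeat=2):
    counts[sum(trial)] += 0.25  # each of the four outcomes is equally likely

print(counts)   # {0: 0.25, 1: 0.5, 2: 0.25}
average = sum(k * v for k, v in counts.items())
print(average)  # 1.0
```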
Of course this only matters for the sort of thing where 2 successes is better than 1 success:
10% chance of finding a monogamous partner 10 times yields 1 − 0.9^10 ≈ 0.65 monogamous partners in expectation.
10% chance of finding a polyamorous partner 10 times yields 1.00 polyamorous partners in expectation.
EDIT: To clarify, a 10% chance of finding a monogamous partner 10 times yields 1.00 successful dates and ≈0.65 monogamous partners that you end up with, in expectation.
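The two expectations differ because the monogamous case collapses "at least one success" into a single partner; the exact value is 1 − 0.9¹⁰ ≈ 0.651 (a sketch; variable names are mine):

```python
# 10 independent tries, each with a 10% success chance.
p, n = 0.1, 10
expected_successes = n * p         # expected number of successful dates
p_at_least_one = 1 - (1 - p) ** n  # expected partners if monogamous
print(expected_successes)          # 1.0
print(round(p_at_least_one, 3))    # 0.651
```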
IQ over median does not correlate with creativity over median
That’s not what that paper says. It says that IQ over 110 or so (quite above median) correlates less strongly (but still positively) with creativity. In Chinese children, age 11-13.
And for a visceral description of a kind of bullying that’s plainly bad, read the beginning of Worm: https://parahumans.wordpress.com/2011/06/11/1-1/
I double-downvoted this post (my first ever double-downvote) because it crosses a red line by advocating for verbal and physical abuse of a specific group of people.
Alexej: this post gives me the impression that you started with a lot of hate and went looking for justifications for it. But if you have some real desire for truth seeking, here are some counterarguments:
Yeah, I think “computational irreducibility” is an intuitive term pointing to something which is true, important, and not-obvious-to-the-general-public. I would consider using that term even if it had been invented by Hitler and then plagiarized by Stalin :-P
Agreed!
OK, I no longer claim that. I still think it might be true.
No, Rice’s theorem is really not applicable. I have a PhD in programming languages, and feel confident saying so.
Let’s be specific. Say there’s a mouse named Crumbs (this is a real mouse), and we want to predict whether Crumbs will walk into the humane mouse trap (they did). What does Rice’s theorem say about this?
There are a couple ways we could try to apply it:
- We could instantiate the semantic property P with “the program will output the string ‘walks into trap’”. Then Rice’s theorem says that we can’t write a program Q that takes as input a program R and says whether R outputs ‘walks into trap’. For any Q we write, there will exist a program R that defeats it. However, this does not say anything about what the program R looks like! If R is simply `print('walks into trap')`, then it’s pretty easy to tell! And if R is the Crumbs algorithm running in Crumbs’s brain, Rice’s theorem likewise does not claim that we’re unable to tell whether it outputs ‘walks into trap’. All the theorem says is that there exists a program R that Q fails on. The proof of the theorem is constructive, and does give a specific program as a counterexample, but this program is unlikely to look anything like Crumbs’s algorithm. The counterexample program R runs Q on its own source code and then does the opposite of what Q predicts, while Crumbs does not know what we’ve written for Q and is probably not very good at emulating Python.
- We could try to instantiate the counterexample program R with Crumbs’s algorithm. But that’s illegal! It’s under an existential, not a forall. We don’t get to pick R; the theorem does.
Actually, even this kind of misses the point. When we’re talking about Crumbs’s behavior, we aren’t asking what Crumbs would do in a hypothetical universe in which they lived forever, which is the world that Rice’s theorem is talking about. We mean to ask what Crumbs (and other creatures) will do today (or perhaps this year). And that’s decidable! You can easily write a program Q that takes a program R and checks if R outputs ‘walks into trap’ within the first N steps! Rice’s theorem doesn’t stand in your way even a little bit, if all you care about is behavior within a fixed finite amount of time!
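A toy version of that step-bounded checker (a sketch, not a real model of a mouse: the generator and all names are hypothetical stand-ins for an arbitrary program R):

```python
# Sketch: checking behavior within a fixed number of steps is decidable.
# A creature's algorithm is modeled as a generator that yields one
# action per step.
from itertools import islice

def crumbs():  # hypothetical model of the mouse's decision process
    yield "sniff"
    yield "approach"
    yield "walks into trap"

def outputs_within(program, action, n_steps):
    """Run `program` for at most n_steps steps and report whether it
    performs `action`. This always terminates, so it's decidable."""
    return action in islice(program(), n_steps)

print(outputs_within(crumbs, "walks into trap", 100))  # True
```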
Here’s what Rice’s theorem does say. It says that if you want to know whether an arbitrary critter will walk into a trap after an arbitrarily long time, including long after the heat death of the universe, and you think you have a program that can check that for any creature in finite time, then you’re wrong. But creatures aren’t arbitrary (they don’t look like the very specific, very scattered counterexample programs that are constructed in the proof of Rice’s theorem), and the duration of time we care about is finite.
If you care to have a theorem, you should try looking at Algorithmic Information Theory. It’s able to make statements about “most programs” (or at least “most bitstrings”), in a way that Rice’s theorem cannot. Though I don’t think it’s important you have a theorem for this, and I’m not even sure that there is one.
Rice’s theorem (a.k.a. computational irreducibility) says that for most algorithms, the only way to figure out what they’ll do with certainty is to run them step-by-step and see.
Rice’s theorem says nothing of the sort. Rice’s theorem says:
For every non-trivial semantic property P,
for every program Q that purports to check whether an arbitrary program has property P,
there exists a program R such that Q(R) is incorrect: either P holds of R but Q(R) returns false, or P does not hold of R but Q(R) returns true.
Notice that the tricky program R that’s causing your property-checker Q to fail is under an existential. This isn’t saying anything about most programs, and it isn’t even saying that there’s a subset of programs that are tricky to analyze. It’s saying that after you fix a property P and a property checker Q, there exists a program R that’s tricky for Q. There might be a more relevant theorem from algorithmic information theory; I’m not sure.
Going back to the statement:
for most algorithms, the only way to figure out what they’ll do with certainty is to run them step-by-step and see
This is only sort of true? Optimizing compilers rewrite programs into equivalent programs before they’re run, and can be extremely clever about the sorts of rewrites that they do, including reducing away parts of the program without needing to run them first. We tend to think of the compiled output of a program as “the same” program, but that’s only because compilers are reliable at producing equivalent code, not because the equivalence is straightforward.
a.k.a. computational irreducibility
Rice’s theorem is not “also known as” computational irreducibility.
By the way, be wary of claims from Wolfram. He was a serious physicist, but is a bit of an egomaniac these days. He frequently takes credit for others’ ideas (I’ve seen multiple clear examples) and exaggerates the importance of the things he’s done (he’s written more than one obituary for someone famous, where he talks more about his own accomplishments than the deceased’s). I have a copy of A New Kind of Science, and I’m not sure there’s much of value in it. I don’t think this is a hot take.
for most algorithms, the only way to figure out what they’ll do with certainty is to run them step-by-step and see
I think the thing you mean to say is that for most of the sorts of complex algorithms you see in the wild, such as the algorithms run by brains, there’s no magic shortcut to determine the algorithm’s output that avoids having to run any of the algorithm’s steps. I agree!
I think we’re in agreement on everything.
Excellent. Sorry for thinking you were saying something you weren’t!
still not have an answer to whether it’s spinning clockwise or counterclockwise
More simply (and quite possibly true), Nobuyuki Kayahara rendered it spinning either clockwise or counterclockwise, lost the source, and has since forgotten which way it was going.
I like “veridical” mildly better for a few reasons, more about pedagogy than anything else.
That’s a fine set of reasons! I’ll continue to use “accurate” in my head, as I already fully feel that the accuracy of a map depends on which territory you’re choosing for it to represent. (And a map can accurately represent multiple territories, as happens a lot with mathematical maps.)
Another reason is I’m trying hard to push for a two-argument usage
Do you see the Spinning Dancer going clockwise? Sorry, that’s not a veridical model of the real-world thing you’re looking at.
My point is that:
The 3D spinning dancer in your intuitive model is a veridical map of something 3D. I’m confident that the 3D thing is a 3D graphical model which was silhouetted after the fact (see below), but even if it was drawn by hand, the 3D thing was a stunningly accurate 3D model of a dancer in the artist’s mind.
That 3D thing is the obvious territory for the map to represent.
It feels disingenuous to say “sorry, that’s not a veridical map of [something other than the territory map obviously represents]”.
So I guess it’s mostly the word “sorry” that I disagree with!
By “the real-world thing you’re looking at”, you mean the image on your monitor, right? There are some other ways one’s intuitive model doesn’t veridically represent that, such as the fact that, unlike other objects in the room, it’s flashing off and on 60 times per second, has a weirdly spiky color spectrum, and (assuming an LCD screen) consists entirely of circularly polarized light.
It was made by a graphic artist. I’m not sure their exact technique, but it seems at least plausible to me that they never actually created a 3D model.
This is a side track, but I’m very confident a 3D model was involved. Plenty of people can draw a photorealistic silhouette. The thing I think is difficult is drawing 100+ silhouettes that match each other perfectly and have consistent rotation. (The GIF only has 34 frames, but the original video is much smoother.) Even if technically possible, it would be much easier to make one 3D model and have the computer rotate it. Annnd, if you look at Nobuyuki Kayahara’s website, his talent seems more on the side of mathematics and visualization than photo-realistic drawing, so my guess is that he used an existing 3D model for the dancer (possibly hand-posed).
This is fantastic! I’ve tried reasoning along these directions, but never made any progress.
A couple comments/questions:
Why “veridical” instead of simply “accurate”? To me, the accuracy of a map is how well it corresponds to the territory it’s trying to map. I’ve been replacing “veridical” with “accurate” while reading, and it’s seemed appropriate everywhere.
Do you see the Spinning Dancer going clockwise? Sorry, that’s not a veridical model of the real-world thing you’re looking at. [...] after all, nothing in the real world of atoms is rotating in 3D.
I think you’re being unfair to our intuitive models here.
The GIF isn’t rotating, but the 3D model that produced the GIF was rotating, and that’s the thing our intuitive models are modeling. So exactly one of [spinning clockwise] and [spinning counterclockwise] is veridical, depending on whether the graphic artist had the dancer rotating clockwise or counterclockwise before turning her into a silhouette. (Though whether it happens to be veridical is entirely coincidental, as the silhouette is identical to the one that would have been produced had the dancer been spinning in the opposite direction.)
If you look at the photograph of Abe Lincoln from Feb 27, 1860, you see a 3D scene with a person in it. This is veridical! There was an actual room with an actual person in it, who dressed that way and touched that book. The map’s territory is 164 years older than the map, but so what.
(My favorite example of an intuitive model being wildly incorrect is Feynman’s story of learning to identify kinds of galaxies from images on slides. He asks his mentor “what kind of galaxy is this one, I can’t identify it”, and his mentor says it’s a smudge on the slide.)
Very curious what part of this people think is wrong.
Here’s a simple argument that simulating universes based on Turing machine number can give manipulated results.
Say we lived in a universe much like this one, except that:
The universe is deterministic
It’s simulated by a very short Turing machine
It has a center, and
That center is actually nearby! We can send a rocket to it.
So we send a rocket to the center of the universe and leave a plaque saying “the answer to all your questions is Spongebob”. Now any aliens in other universes that simulate our universe and ask “what’s in the center of that universe at time step 10^1000?” will see the plaque, search elsewhere in our universe for the reference, and watch Spongebob. We’ve managed to get aliens outside our universe to watch Spongebob.
I feel like it would be helpful to speak precisely about the universal prior. Here’s my understanding.
It’s a partial probability distribution over bit strings. It gives a non-zero probability to every bit string, but these probabilities add up to strictly less than 1. It’s defined as follows:
That is, describe Turing machines by a binary code, and assign each one a probability based on the length of its code, such that those probabilities add up to exactly 1. Then magically run all Turing machines “to completion”. For those that halt leaving a bitstring on their tape, attribute the probability of that Turing machine to that bitstring. Now we have a probability distribution over bitstrings, though the probabilities add up to less than one because not all of the Turing machines halted.
You cannot compute this probability distribution, but you can compute lower bounds on the probabilities of its bitstrings. (The Nth lower bound is the probability distribution you get from running the first N TMs for N steps.)
Call a TM that halts poisoned if its output is determined as follows:
The TM simulates a complex universe full of intelligent life, then selects a tiny portion of that universe to output, erasing the rest.
That intelligent life realizes this might happen, and writes messages in many places that could plausibly be selected.
It works, and the TM’s output is determined by what the intelligent life it simulated chose to leave behind.
If we approximate the universal prior, the probability contribution of poisoned TMs will be precisely zero, because we don’t have nearly enough compute to simulate a poisoned TM until it halts. However, if there’s an outer universe with dramatically more compute available, and it’s approximating the universal prior using enough computational power to actually run the poisoned TMs, they’ll affect the probability distribution of the bitstrings, making bitstrings with the messages they choose to leave behind more likely.
So I think Paul’s right, actually (not what I expected when I started writing this). If you approximate the UP well enough, the distribution you see will have been manipulated.
The feedback is from Lean, which can validate attempted formal proofs.
This is one of the bigger reasons why I really don’t like RLHF—because inevitably you’re going to have to use a whole bunch of Humans who know less-than-ideal amounts about philosophy, pertaining to Ai Alignment.
What would these humans do differently, if they knew about philosophy? Concretely, could you give a few examples of “Here’s a completion that should be positively reinforced because it demonstrates correct understanding of language, and here’s a completion of the same text that should be negatively reinforced because it demonstrates incorrect understanding of language”? (Bear in mind that the prompts shouldn’t be about language, as that would probably just teach the model what to say when it’s discussing language in particular.)
It’s impossible for the Utility function of the Ai to be amenable to humans if it doesn’t use language the same way
What makes you think that humans all use language the same way, if there’s more than one plausible option? People are extremely diverse in their perspectives.
The orthogonality thesis doesn’t say anything about intelligences that have no goals. It says that an intelligence can have any specific goal. So I’m not sure you’ve actually argued against the orthogonality thesis.