anonymousaisafety

Karma: 724

anonymousaisafety Jul 6, 2022, 7:42 PM
3 points
0
in reply to: Kenoubi’s comment on: Murphyjitsu: an Inner Simulator algorithm
It depends on what you mean by “didn’t work”. The study described is published in a paper only 16 pages long. We can just read it: http://web.mit.edu/curhan/www/docs/Articles/biases/67_J_Personality_and_Social_Psychology_366,_1994.pdf
First, consider the question of, “are these predictions totally useless?” This is an important question because I stand by my claim that the answer of “never” is actually totally useless due to how trivial it is.
Despite the optimistic bias, respondents’ best estimates were by no means devoid of information: The predicted completion times were highly correlated with actual completion times (r = .77, p < .001). Compared with others in the sample, respondents who predicted that they would take more time to finish actually did take more time. Predictions can be informative even in the presence of a marked prediction bias.
...
Respondents’ optimistic and pessimistic predictions were both strongly correlated with their actual completion times (rs = .73 and .72, respectively; ps < .01).
Yep. Matches my experience.
We know that only 11% of students met their optimistic targets, and only 30% of students met their “best guess” targets. What about the pessimistic target? It turns out, 50% of the students did finish by that target. That’s not just a quirk, because it’s actually related to the distribution itself.
However, the distribution of difference scores from the best-guess predictions were markedly skewed, with a long tail on the optimistic side of zero, a cluster of scores within 5 or 10 days of zero, and virtually no scores on the pessimistic side of zero. In contrast, the differences from the worst-case predictions were noticeably more symmetric around zero, with the number of markedly pessimistic predictions balancing the number of extremely optimistic predictions.
In other words, asking people for a best guess or an optimistic prediction results in a biased prediction that is almost always earlier than the real delivery date. On the other hand, while the pessimistic prediction is not more accurate (it has the same absolute error margins), it is unbiased. The study found that people asked for a pessimistic prediction were as likely to over-estimate their completion time as to under-estimate it. If you don’t think a question that gives you a distribution centered on the right answer is useful, I’m not sure what to tell you.
The paper actually did a number of experiments. That was just the first.
In the third experiment, the study tried to understand what people are thinking about when estimating.
Proportionally more responses concerned future scenarios (M = .74) than relevant past experiences (M = .07), t(66) = 13.80, p < .001. Furthermore, a much higher proportion of subjects’ thoughts involved planning for a project and imagining its likely progress (M = .71) rather than considering potential impediments (M = .03), t(66) = 18.03, p < .001.
This seems relevant considering that the idea of premortems or “worst case” questioning is to elicit impediments, and the project managers / engineering leads doing that questioning are intending to hear about impediments and will continue their questioning until they’ve been satisfied that the group is actually discussing that.
In the fourth experiment, the study tried to understand why people don’t think about their past experiences. The authors found that just prompting people to consider past experiences was insufficient; people needed additional prompting to make their past experience “relevant” to their current task.
Subsequent comparisons revealed that subjects in the recall-relevant condition predicted they would finish the assignment later than subjects in either the recall condition, t(79) = 1.99, p < .05, or the control condition, t(80) = 2.14, p < .04, which did not differ significantly from each other, t(81) < 1.
...
Further analyses were performed on the difference between subjects’ predicted and actual completion times. Subjects underestimated their completion times significantly in the control (M = −1.3 days), t(40) = 3.03, p < .01, and recall conditions (M = −1.0 day), t(41) = 2.10, p < .05, but not in the recall-relevant condition (M = −0.1 days), t(39) < 1. Moreover, a higher percentage of subjects finished the assignments in the predicted time in the recall-relevant condition (60.0%) than in the recall and control conditions (38.1% and 29.3%, respectively), χ²(G, N = 123) = 7.63, p < .01. The latter two conditions did not differ significantly from each other.
...
The absence of an effect in the recall condition is rather remarkable. In this condition, subjects first described their past performance with projects similar to the computer assignment and acknowledged that they typically finish only 1 day before deadlines. Following a suggestion to “keep in mind previous experiences with assignments,” they then predicted when they would finish the computer assignment. Despite this seemingly powerful manipulation, subjects continued to make overly optimistic forecasts. Apparently, subjects were able to acknowledge their past experiences but disassociate those episodes from their present predictions.
In contrast, the impact of the recall-relevant procedure was sufficiently robust to eliminate the optimistic bias in both deadline conditions.
How does this compare to the first experiment?
Interestingly, although the completion estimates were less biased in the recall-relevant condition than in the other conditions, they were not more strongly correlated with actual completion times, nor was the absolute prediction error any smaller. The optimistic bias was eliminated in the recall-relevant condition because subjects’ predictions were as likely to be too long as they were to be too short. The effects of this manipulation mirror those obtained with the instruction to provide pessimistic predictions in the first study: When students predicted the completion date for their honor’s thesis on the assumption that “everything went as poorly as it possibly could” they produced unbiased but no more accurate predictions than when they made their “best guesses.”
It’s common in engineering to perform group estimates. Does the study look at that? Yep, the fifth and last experiment asks individuals to estimate the performance of others.
As hypothesized, observers seemed more attuned to the actors’ base rates than did the actors themselves. Observers spontaneously used the past as a basis for predicting actors’ task completion times and produced estimates that were later than both the actors’ estimates and their completion times.
So observers are more pessimistic. Actually, observers are so pessimistic that you have to average it with the optimistic estimates to get an unbiased estimate.
One of the most consistent findings throughout our investigation was that manipulations that reduced the directional (optimistic) bias in completion estimates were ineffective in increasing absolute accuracy. This implies that our manipulations did not give subjects any greater insight into the particular predictions they were making, nor did they cause all subjects to become more pessimistic (see Footnote 2), but instead caused enough subjects to become overly pessimistic to counterbalance the subjects who remained overly optimistic. It remains for future research to identify those factors that lead people to make more accurate, as well as unbiased, predictions. In the real world, absolute accuracy is sometimes not as important as (a) the proportion of times that the task is completed by the “best-guess” date and (b) the proportion of dramatically optimistic, and therefore memorable, prediction failures. By both of these criteria, factors that decrease the optimistic bias “improve” the quality of intuitive prediction.
At the end of the day, there are certain things that are known about scheduling / prediction.
1. In general, individuals are as wrong as they are right for any given estimate.
2. In general, people are overly optimistic.
3. But estimates generally correlate well with actual duration: if an individual estimates that one task will take longer than another, it most likely will! This is why estimation in software is sometimes not done in units of time at all, but in a concept called “points”.
4. The larger and more nebulously scoped the task, the worse any estimates will be in absolute error.
5. The length of time a task can take follows a distribution with a very long right tail: a task that takes way longer than expected can take an arbitrary amount of time, but the fastest time to complete a task is limited.
6. The best way to actually schedule or predict a project is to break it down into as many small component tasks as possible, identify dependencies between those tasks, produce most likely, optimistic, and pessimistic estimates for each task, and then run a simulation over the chain of dependencies to see what the expected project completion looks like. Use a Gantt chart. This is a boring answer because it’s the “learn project management” answer, and people will hate on it because (gestures vaguely at all of the projects that overrun their schedule). There are many interesting reasons for why that happens and why I don’t think it’s a massive failure of rationality, but I’m not sure this comment is a good place to go into detail on that. The quick answer is that comical overrun of a schedule has less to do with an inability to create correct schedules from an engineering / evidence-based perspective, and much more to do with a bureaucratic or organizational refusal to accept an evidence-based schedule when a totally false but politically palatable “optimistic” schedule is preferred.
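To make the last item concrete, here is a minimal sketch of that kind of simulation. The task names, the three-point estimates, and the choice of a triangular distribution are all assumptions invented for illustration; real tools often use other distributions (e.g. PERT/beta) and richer dependency models.

```python
import random

# Hypothetical tasks: (optimistic, most likely, pessimistic) durations in days,
# plus the tasks that must finish first. All numbers are invented for illustration.
TASKS = {
    "design":    {"est": (2, 4, 10), "deps": []},
    "build":     {"est": (5, 8, 20), "deps": ["design"]},
    "test_rig":  {"est": (1, 3, 9),  "deps": ["design"]},
    "integrate": {"est": (2, 3, 12), "deps": ["build", "test_rig"]},
}

def simulate_once(tasks, rng):
    """Sample one duration per task and return the project finish time."""
    finish = {}

    def finish_time(name):
        if name not in finish:
            low, mode, high = tasks[name]["est"]
            # A task starts once its slowest dependency has finished.
            start = max((finish_time(d) for d in tasks[name]["deps"]), default=0.0)
            finish[name] = start + rng.triangular(low, high, mode)
        return finish[name]

    return max(finish_time(name) for name in tasks)

def simulate(tasks, runs=10_000, seed=0):
    """Run many sampled schedules and return the sorted finish times."""
    rng = random.Random(seed)
    return sorted(simulate_once(tasks, rng) for _ in range(runs))

def percentile(sorted_values, p):
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]

results = simulate(TASKS)
p50, p90 = percentile(results, 0.5), percentile(results, 0.9)
print(f"median finish: {p50:.1f} days, 90th percentile: {p90:.1f} days")
```

Because each task's estimate has a long right tail, the simulated median lands later than the sum of the "most likely" values along the critical path (15 days here), which is the optimistic bias from item 2 showing up in aggregate.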

anonymousaisafety Jul 5, 2022, 10:16 PM
9 points
7
in reply to: Duncan Sabien (Inactive)’s comment on: Murphyjitsu: an Inner Simulator algorithm
Right. I think I agree with everything you wrote here, but here it is again in my own words:
In communicating with people, the goal isn’t to ask a hypothetically “best” question and wonder why people don’t understand or don’t respond in the “correct” way. The goal is to be understood and to share information and acquire consensus or agree on some negotiation or otherwise accomplish some task.
This means that in real communication with real people, you often need to ask different questions to different people to arrive at the same information, or phrase some statement differently for it to be understood. There shouldn’t be any surprise or paradox here. When I am discussing an engineering problem with engineers, I phrase it in the terminology that engineers will understand. When I need to communicate that same problem to upper management, I do not use the same terminology that I use with my engineers.
Likewise, there’s a difference when I’m communicating with some engineering intern or new grad right out of college, vs a senior engineer with a decade of experience. I tailor my speech for my audience.
In particular, if I asked this question to Kenoubi (“what’s the worst case for how long this thesis could take you?”), and Kenoubi replied “It never finishes”, then I would immediately follow up with the question, “Ok, considering cases when it does finish, what’s the worst-case look like?” And if that got the reply “the day before it is required to be due”, I would then start poking at “What would cause that to occur?”.
The reason why I start with the first question is because it works for, I don’t know, 95% of people I’ve ever interacted with in my life? In my mind, it’s rational to start with a question that almost always elicits the information I care about, even if there’s some small subset of the population that will force me to choose my words as if they’re being interpreted by a Monkey’s paw.

anonymousaisafety Jul 5, 2022, 9:59 PM
3 points
2
in reply to: Howard Halim’s comment on: Decision theory and dynamic inconsistency
Isn’t this identical to the proof for why there’s no general algorithm for solving the Halting Problem?
The Halting Problem asks for an algorithm A(S, I) that when given the source code S and input I for another program will report whether S(I) halts (vs run forever).
There is a proof that says A does not exist. There is no general algorithm for determining whether an arbitrary program will halt. “General” and “arbitrary” are important keywords because it’s trivial to consider specific algorithms and specific programs and say, yes, we can determine that this specific program will halt via this specific algorithm.
That proof of the Halting Problem (for a general algorithm and arbitrary programs!) works by defining a pathological program S that inspects what the general algorithm A would predict and then does the opposite.
What you’re describing above seems almost word-for-word the same construction used for constructing the pathological program S, except the algorithm A for “will this program halt?” is replaced by the predictor “will this person one-box?”.
I’m not sure that this necessarily matters for the thought experiment. For example, perhaps we can pretend that the predictor works on all strategies except the pathological case described here, and other strategies isomorphic to it.
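For anyone who hasn't seen the diagonal construction, here is a toy sketch of it. The function names are hypothetical, and the "loop forever" branch returns a label instead of actually looping so the example stays runnable. Swap "halts" for "one-boxes" and the decider for the predictor, and you get the strategy described in the parent comment.

```python
def make_pathological(predicts_halt):
    """Build the program S that asks the claimed decider A about itself
    and then does the opposite of whatever A predicts."""
    def pathological():
        if predicts_halt(pathological):
            # A said "S halts", so the real S would loop forever here.
            return "loops forever"
        return "halts"
    return pathological

def is_refuted(predicts_halt):
    """True if the decider's prediction about S disagrees with what S does."""
    s = make_pathological(predicts_halt)
    predicted = "halts" if predicts_halt(s) else "loops forever"
    return predicted != s()

# Two toy deciders (hypothetical stand-ins for A).
always_yes = lambda program: True
always_no = lambda program: False
print(is_refuted(always_yes), is_refuted(always_no))
```

Any decider that gives a consistent answer about S fails in the same way, because S is defined as "whatever A says, do the other thing".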

anonymousaisafety Jul 4, 2022, 8:06 PM
6 points
0
in reply to: Kenoubi’s comment on: Murphyjitsu: an Inner Simulator algorithm
If we look at the student answers, they were off by ~7 days, or about a 13% error relative to the actual completion time.
The only way I can interpret your post is that you’re suggesting all of these students should have answered “never”.
I’m not convinced that “never” just didn’t occur to them because they were insufficiently motivated to give a correct answer.
How far off is “never” from the true answer of 55.5 days?
It’s about infinitely far off. It is an infinitely wrong answer. Even if a project ran 1000% over every worst-case pessimistic schedule, any finite prediction was still infinitely closer than “never”.
It’s a quirk of rationalist culture (and a few others — I’ve seen this from physicists too) to take the words literally and propose that “infinitely long” is a plausible answer, and be baffled as to how anyone could think otherwise.
That’s because “infinitely long” is a trivial answer for any task that isn’t literally impossible.^[1] It provides 0 information and takes 0 computational effort. It might as well be the answer from a non-entity, like asking a brick wall how long the thesis could take to complete.
Question: How long can it take to do X?
Brick wall: Forever. Just go do not-X instead.
It is much more difficult to give an answer for how long a task can take, assuming it gets done, while anticipating and predicting failure modes that would cause the schedule to explode. That same answer is actually useful, since you can now take preemptive actions to avoid those failure modes, which is the whole point of estimating and scheduling as a logical exercise.
The actual conversation that happens during planning is:
A: “What’s the worst case for this task?”
B: “6 months.”
A: “Why?”
B: “We don’t have enough supplies to get past 3 trial runs, so if any one of them is a failure, the lead time on new materials with our current vendor is 5 months.”
A: “Can we source a new vendor?”
B: “No, but… <some other idea>”
1. ^
  In cases when something is literally impossible, instead of saying “infinitely long”, or “never”, it’s more useful to say “that task is not possible” and then explain why. Communication isn’t about finding the “haha, gotcha” answer to a question when asked.

anonymousaisafety Jul 1, 2022, 5:49 AM
6 points
4
on: Murphyjitsu: an Inner Simulator algorithm
Is the concept of “murphyjitsu” supposed to be different from the common exercise known as a premortem in traditional project management? Or is this just the same idea, but rediscovered under a different name, exactly like how what this community calls a “double crux” is just the evaporating cloud, which was first described in the 90s?
If you’ve heard of a postmortem or possibly even a retrospective, then it’s easy to guess what a premortem is. I cannot say the same for “murphyjitsu”.
I see that premortem is even referenced in the “further resources” section, so I’m confused why you’d describe it under a different name that cannot be researched easily outside of this site, when there is tons of literature and examples of how to do premortems correctly under the established name.

anonymousaisafety Jul 1, 2022, 5:35 AM
1 point
in reply to: awenonian’s comment on: I No Longer Believe Intelligence to be “Magical”
The core problem remains computational complexity.
Statements like “does this image look reasonable” or saying “you pay attention to regularities in the data”, or “find the resolution by searching all possible resolutions” are all hiding high computational costs behind short English descriptions.
Let’s consider the case of a 1280x720 pixel image.
That’s the same as 921600 pixels.
How many bytes is that?
It depends. How many bytes per pixel?^[1] In my post, I explained there could be 1-byte-per-pixel grayscale, or perhaps 3-bytes-per-pixel RGB using [0, 255] values for each color channel, or maybe 6-bytes-per-pixel with [0, 65535] values for each color channel, or maybe something like 4-bytes-per-pixel because we have 1-byte RGB channels and a 1-byte alpha channel.
Let’s assume that a reasonable cutoff for how many bytes per pixel an encoding could be using is say 8 bytes per pixel, or a hypothetical 64-bit color depth.
How many ways can we divide this between channels?
If we assume 3 channels, it’s 1953.
If we assume 4 channels, it’s 39711.
Also if it turns out to be 5 channels, it’s 595665.
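The numbers above can be reproduced with a short sketch, under the assumption that a "way to divide" means an ordered split of all 64 bits in which every channel gets at least 1 bit. The helper name `channel_splits` is made up for this example.

```python
from math import comb

def channel_splits(total_bits, channels):
    """Ordered ways to give each of `channels` at least 1 bit while using
    all `total_bits` bits (compositions of total_bits into `channels` parts)."""
    return comb(total_bits - 1, channels - 1)

for channels in (3, 4, 5):
    print(channels, channel_splits(64, channels))

# Summed over every possible channel count, the total is 2**(total_bits - 1).
all_splits = sum(channel_splits(64, k) for k in range(1, 65))
print(all_splits == 2**63)
```

That last sum is why the count behaves exponentially in the number of bits once the channel count is also unknown.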
This is a pretty fast growing function. The following is a plot.

Note that the red line is O(2^N) and the black line barely visible at the bottom is O(N^2). N^2 is a notorious runtime complexity because it’s right on the threshold of what is generally unacceptable performance.^[2]
Let’s hope that this file isn’t actually a frame buffer from a graphics card with 32 bits per channel, or 128 bits per pixel / 16 bytes per pixel.
Unfortunately, we still need to repeat this calculation for all of the possibilities for how many bits per pixel this image could be. We need to add in the possibility that it is 63 bits per pixel, or 62 bits per pixel, or 61 bits per pixel.
In case anyone wants to claim this is unreasonable, it’s not impossible to have image formats that have RGBA data, but only 1 bit associated with the alpha data for each pixel.^[3]
And for each of these scenarios, we need to question how many channels of color data there are.
- 1? Grayscale.
- 2? Grayscale, with an alpha channel maybe?
- 3? RGB, probably, or something like HSV.
- 4? RGBA, or maybe it’s the RGBG layout I described for a RAW encoding of a Bayer filter, or maybe it’s CMYK for printing.
- 5? This is getting weird, but it’s not impossible. We could be encoding additional metadata into each pixel, e.g. distance from the camera.
Actually, this question of how many channels there are is very important, given the fast growing function above.
This one question, if we don’t know the right answer, is sufficient to make this algorithm pretty much impossible to run.
When we say we can try all of the options, that’s not actually possible.
What I think people mean is that we can use heuristics to pick the likely options first and try them, and then fall back to more esoteric options if the initial results don’t make sense.
That’s the difference between average run-time and worst case run-time.
The point that I am trying to make is that the worst case run-time for decoding an arbitrary binary file is pretty much unbounded, because there’s a ridiculous amount of choice possible.
Some examples of “image” formats that have large numbers of channels per “pixel” are things like RADAR / LIDAR sensors, e.g. it’s possible to have 5 channels per pixel for defining 3D coordinates (relative to the sensor), range, and intensity.
You actually ran into this problem yourself.
Similarly (though you’d likely do this first), you can tell the difference between RGB and RGBA. If you have (255, 0, 0, 255, 0, 0, 255, 0, 0, 255, 0, 0), this is probably 4 red pixels in RGB, and not a fully opaque red pixel, followed by a fully transparent green pixel, followed by a fully transparent blue pixel in RGBA. It could be 2 pixels that are mostly red and slightly green in 16 bit RGB, though. Not sure how you could piece that out.
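To see the ambiguity directly, here is a small sketch that decodes those same 12 bytes under four assumed layouts. The layouts and the byte orders tried for the 16-bit case are my choices for illustration; nothing in the data itself says which one is right.

```python
import struct

data = bytes([255, 0, 0, 255, 0, 0, 255, 0, 0, 255, 0, 0])

def chunks(buf, size):
    """Split buf into consecutive tuples of `size` bytes."""
    return [tuple(buf[i:i + size]) for i in range(0, len(buf), size)]

as_rgb8 = chunks(data, 3)    # four pixels, 1 byte per channel
as_rgba8 = chunks(data, 4)   # three pixels, 1 byte per channel
# 16-bit channels also need a byte order, which the data does not state.
as_rgb16_be = [struct.unpack(">3H", data[i:i + 6]) for i in range(0, len(data), 6)]
as_rgb16_le = [struct.unpack("<3H", data[i:i + 6]) for i in range(0, len(data), 6)]

for name, pixels in [("RGB8", as_rgb8), ("RGBA8", as_rgba8),
                     ("RGB16 big-endian", as_rgb16_be),
                     ("RGB16 little-endian", as_rgb16_le)]:
    print(f"{name}: {pixels}")
```

The 16-bit reading is "mostly red" only if the channels are big-endian; with little-endian channels the same bytes come out mostly green. Every one of these decodings is internally consistent.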
Summing up all of the possibilities above is left as an exercise for the reader, and we’ll call that sum K.
Without loss of generality, let’s say our image was encoded as 3 bytes per pixel divided between 3 RGB color channels of 1 byte each.
Our 1280x720 image is actually 2764800 bytes as a binary file.
But since we’re decoding it from the other side, and we don’t know it’s 1280x720, when we’re staring at this pile of 2764800 bytes, we need to first assume how many bytes per pixel it is, so that we can divide the total bytes by the bytes per pixel to calculate the number of pixels.
Then, we need to test each possible resolution as you’ve suggested.
The number of possible resolutions is the same as the number of divisors of the number of pixels. The equation for providing an upper bound is exp(log(N)/log(log(N)))^[4], but the average number of divisors is approximately log(N).
Oops, no it isn’t!
Files have headers! How large is the header? For a bitmap, it’s anywhere between 26 and 138 bytes. The JPEG header is at least 2 bytes. PNG uses 8 bytes. GIF uses at least 14 bytes.
Now we need to make the following choices:
1. Guess at how many bytes per pixel the data is.
2. Guess at the length of the header. (maybe it’s 0, there is no header!)
3. Calculate the factorization of the remaining bytes N for the different possible resolutions.
4. Hope that there isn’t a footer, checksum, or any type of other metadata hanging out in the sea of bytes. This is common too!
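The choices above can be sketched as a brute-force hypothesis generator. The function names, the header range of 0 to 138 bytes, and the 1 to 8 bytes-per-pixel range are assumptions for illustration, and the sketch ignores footers, checksums, row padding, and sub-byte pixel sizes entirely.

```python
def resolutions(pixel_count):
    """All (width, height) pairs with width * height == pixel_count."""
    pairs = []
    d = 1
    while d * d <= pixel_count:
        if pixel_count % d == 0:
            pairs.append((d, pixel_count // d))
            if d != pixel_count // d:
                pairs.append((pixel_count // d, d))
        d += 1
    return pairs

def hypotheses(file_size, header_lengths, bytes_per_pixel_options):
    """Yield (header_len, bytes_per_pixel, width, height) guesses that are
    consistent with the file size."""
    for header_len in header_lengths:
        payload = file_size - header_len
        for bpp in bytes_per_pixel_options:
            if payload > 0 and payload % bpp == 0:
                for width, height in resolutions(payload // bpp):
                    yield header_len, bpp, width, height

# The headerless 1280x720 RGB file from the text: 2764800 bytes.
guesses = list(hypotheses(2764800, header_lengths=range(0, 139),
                          bytes_per_pixel_options=range(1, 9)))
print(len(resolutions(1280 * 720)), len(guesses))
```

Even with the right header length and pixel size handed to us, 921600 pixels still factor into 117 candidate resolutions, and the true answer is only one entry in the much longer `guesses` list.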
Once we’ve made our choices above, then we multiply that by log(N) for the number of resolutions to test, and then we’ll apply the suggested metric. Remember that when considering the different pixel formats and ways the color channel data could be represented, the number was K, and that’s what we’re multiplying by log(N).
In most non-random images, pixels near to each other are similar. In an MxN image, the pixel below is a[i+M], whereas in an NxM image, it’s a[i+N]. If, across the whole image, the difference between a[i+M] is less than the difference between a[i+N], it’s more likely an MxN image. I expect you could find the resolution by searching all possible resolutions from 1x<length> to <length>x1, and finding which minimizes average distance of “adjacent” pixels.
What you’re describing here is actually similar to a common metric used in algorithms for automatically focusing cameras by calculating the contrast of an image, except for focusing you want to maximize contrast instead of minimize it.
The interesting problem with this metric is that it’s basically a one-way function. For a given image, you can compute this metric. However, minimizing this metric is not the same as knowing that you’ve decoded the image correctly. It says you’ve found a decoding, which did minimize the metric. It does not mean that is the correct decoding.
A trivial proof:
1. Consider an image and the reversal of that image along the horizontal axis.
2. These have the same metric.
3. So the same metric can yield two different images.
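Here is a small sketch of that trivial proof, under my reading of the suggested metric: compare every pixel with the one directly below it. It uses a sum instead of an average, which only differs by a constant factor for a fixed width, and it flips the image by reversing the order of its rows. The image values are invented.

```python
def adjacency_metric(pixels, width):
    """Sum of absolute differences between each pixel and the one directly
    below it, for a row-major grayscale image with the given width."""
    return sum(abs(pixels[i + width] - pixels[i])
               for i in range(len(pixels) - width))

def flip_rows(pixels, width):
    """Mirror the image top-to-bottom by reversing the order of its rows."""
    rows = [pixels[i:i + width] for i in range(0, len(pixels), width)]
    return [value for row in reversed(rows) for value in row]

# Tiny invented grayscale image, 4 pixels wide and 3 rows tall.
image = [10, 12, 11, 13,
         40, 42, 41, 43,
         90, 92, 91, 93]
flipped = flip_rows(image, 4)

print(adjacency_metric(image, 4), adjacency_metric(flipped, 4))
print(image == flipped)
```

The two images are different, but the metric cannot tell them apart.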
A slightly less trivial proof:
1. For a given “image” of N bytes of image data, there are 2^(N*8) possible bit patterns.
2. Assuming the metric is calculated as an 8-byte IEEE 754 double, there are only 2^(8*8) possible bit patterns.
3. When N > 8, there are more bit patterns than values allowed in a double, so multiple images need to map to the same metric.
The difference between our 2^(2764800*8) image space and the 2^64 metric is, uhhh, 10^(10^6.8). Imagine 10^(10^6.8) pigeons. What a mess.^[5]
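The arithmetic behind that figure, as a quick sketch: the number itself is far too large to print, so this works with its base-10 exponent instead.

```python
from math import log10

image_bits = 2764800 * 8    # bits in the raw 1280x720 RGB image
metric_bits = 64            # bits in an IEEE 754 double

# On average, at least 2**(image_bits - metric_bits) images share each metric value.
# Convert that power of two into a power of ten.
exponent = (image_bits - metric_bits) * log10(2)
print(f"about 10^{exponent:,.0f} images per metric value")
print(f"which is 10^(10^{log10(exponent):.1f})")
```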
The metric cannot work as described. There will be various arbitrary interpretations of the data possible to minimize this metric, and almost all of those will result in images that are definitely not the image that was actually encoded, but did minimize the metric. There is no reliable way to do this because it isn’t possible. When you have a pile of data, and you want to reverse-engineer meaning from it, there is not one “correct” message that you can divine from it.^[6] See also: numerology, for an example that doesn’t involve binary file encodings.
Even pretending that this metric did work, what’s the time complexity of it? We have to check each pixel, so it’s O(N). There’s a constant factor for each pixel computation. How large is that constant? Let’s pretend it’s small and ignore it.
So now we’ve got K*O(N*log(N)) which is the time complexity of lots of useful algorithms, but we’ve got that awkward constant K in the front. Remember that the constant K reflects the number of choices for different bits per pixel, bits per channel, and the number of channels of data per pixel. Unfortunately, that constant is the one that was growing at a rate best described as “absurd”. That constant is the actual definition of what it means to have no priors. When I said “you can generate arbitrarily many hypotheses, but if you don’t control what data you receive, and there’s no interaction possible, then you can’t rule out hypotheses”, what I’m describing is this constant.
I think it would be very weird, if we were trying to train an AI, to send it compressed video, and much more likely that we do, in fact, send it raw RGB values frame by frame.
What I care about is the difference between:
1. Things that are computable.
2. Things that are computable efficiently.
These sets are not the same.
Capabilities of a superintelligent AGI lie only in the second set, not the first.
It is important to understand that a superintelligent AGI is not brute forcing this in the way that has been repeatedly described in this thread. Instead the superintelligent AGI is going to use a bunch of heuristics or knowledge about the provenance of the binary file, combined with access to the internet so that it can just look up the various headers and features of common image formats, and it’ll go through and check all of those, and then if it isn’t any of the usual suspects, it’ll throw up metaphorical hands, and concede defeat. Or, to quote the title of this thread, intelligence isn’t magic.
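As a rough sketch of the "check the usual suspects" approach: the table below holds the well-known leading bytes of four common formats, and the function name `sniff_format` is made up for this example. A real tool would know far more formats and would validate more than the first few bytes.

```python
# Leading "magic" bytes for a few common image formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"BM", "BMP"),
]

def sniff_format(data):
    """Return the first matching known format, or None to concede defeat."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None

print(sniff_format(b"\x89PNG\r\n\x1a\n" + bytes(16)))
print(sniff_format(bytes(16)))
```

This runs in time proportional to the number of known formats, not the number of conceivable ones, which is the whole difference between the two sets above.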
1. ^
  This is often phrased as bits per pixel, because a variety of color depth formats use less than 8 bits per channel, or other non-byte divisions.
2. ^
  Refer to https://accidentallyquadratic.tumblr.com/ for examples.
3. ^
  A fun question to consider here becomes: where are the alpha bits stored? E.g. if we assume 3 bytes for RGB data, and then we have the 1 alpha bit, is each pixel taking up 25 bits, or are the pixels stored in runs of 8 pixels followed by a single “alpha” pixel with 8 bits describing the alpha channels of the previous 8 pixels?
4. ^
  https://terrytao.wordpress.com/2008/09/23/the-divisor-bound/
5. ^
  https://en.wikipedia.org/wiki/Pigeonhole_principle
6. ^
  The way this works for real reverse engineering is that we already have expectations of what the data should look like, and we are tweaking inputs and outputs until we get the data we expected. An example would be figuring out a camera’s RAW format by taking pictures of carefully chosen targets like an all white wall, or a checkerboard wall, or an all red wall, and using the knowledge of those targets to find patterns in the data that we can decode.

anonymousaisafety Jun 28, 2022, 4:33 PM
1 point
0
in reply to: Rafael Harth’s comment on: Contest: An Alien Message
Why do you say that Kolmogorov complexity isn’t the right measure?
most uniformly sampled programs of equal KC that produce a string of equal length.
...
“typical” program with this KC.
I am worried that you might have this backwards?
Kolmogorov complexity describes the output, not the program. The output file has low Kolmogorov complexity because there exists a short computer program to describe it.
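A rough way to get a feel for this, with the caveat that Kolmogorov complexity is not computable and compressed size is only a crude upper-bound proxy for it. The use of zlib and the two example strings are my choices for illustration.

```python
import random
import zlib

def compressed_size(data):
    """Length of the zlib-compressed data: a crude upper-bound proxy for
    Kolmogorov complexity."""
    return len(zlib.compress(data, 9))

n = 10_000
patterned = b"ab" * (n // 2)   # output of an obviously short program
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(n))  # pseudo-random bytes

print(compressed_size(patterned), compressed_size(noisy))
```

Note that `noisy` also comes from a very short program (a seeded generator), so its true Kolmogorov complexity is low even though zlib cannot shrink it. The complexity belongs to the output string and is witnessed by the shortest program that produces it, whether or not a particular compressor finds that program.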

anonymousaisafety Jun 28, 2022, 4:26 PM
12 points
15
in reply to: harfe’s comment on: Contest: An Alien Message
I have mixed thoughts on this.
I was delighted to see someone else put forth a challenge, and impressed with the number of people who took it up.
I’m disappointed though that the file used a trivial encoding. When I first saw the comments suggesting it was just all doubles, I was really hoping that it wouldn’t turn out to be that.
I think maybe where the disconnect is occurring is that in the original That Alien Message post, the story starts with aliens deliberately sending a message to humanity to decode, as this thread did here. It is explicitly described as such:
From the first 96 bits, then, it becomes clear that this pattern is not an optimal, compressed encoding of anything. The obvious thought is that the sequence is meant to convey instructions for decoding a compressed message to follow...
But when I argued against the capability of decoding binary files in the I No Longer Believe Intelligence To Be Magical thread, that argument was on a tangent: is it possible to decode an arbitrary binary file? I specifically ruled out trivial encodings in my reasoning. I listed the features that make a file difficult to decode. A huge issue is ambiguity because in almost all binary files, the first problem is just identifying when fields start or end.
I gave examples like
1. Camera RAW formats
2. Compressed image formats like PNG or JPG
3. Video codecs
4. Any binary protocol between applications
  1. Network traffic
  2. Serialization to or from disk
  3. Data in RAM
On the other hand, an array of doubles falls much more into this bucket:
data that is basically designed to be interpreted correctly, i.e. the data, even though it is in a binary format, is self-describing.
With all of the above said, the reason why I did not bother uploading an example file in the first thread is frankly because it would have taken me some number of hours to create and I didn’t think there would be any interest in actually decoding it by enough people to justify the time spent. That assumption seems wrong now! It seems like people really enjoyed the challenge. I will update accordingly, and I’ll likely post my example of a file later this week after I have an evening or day free to do so.

anonymousaisafety Jun 28, 2022, 3:44 PM
3 points
0
in reply to: Rafael Harth’s comment on: Contest: An Alien Message
https://en.wikipedia.org/wiki/Kolmogorov_complexity
The fact that the program is so short indicates that the solution is simple. A complex solution would require a much longer program to specify it.

anonymousaisafety Jun 27, 2022, 9:13 PM
2 points
2
in reply to: DirectedEvolution’s comment on: Air Conditioner Repair
I gave this post a strong disagree.

anonymousaisafety Jun 27, 2022, 7:22 PM
3 points
0
on: Contest: An Alien Message
Some thoughts for people looking at this:
- It’s common for binary schemas to distinguish between headers and data. There could be a single header at the start of the file, or there could be multiple headers throughout the file with data following each header.
- There are often checksums on the header, and sometimes on the data too. It’s common for the checksums to follow the respective thing being checksummed, i.e. the last bytes of the header are a checksum, or the last bytes after the data are a checksum. 16-bit and 32-bit CRCs are common.
- If the data represents a sequence of messages, e.g. from a sensor, there will often be a counter of some sort in the header on each message. E.g. a 1-, 2-, or 4-byte counter that provides ordering (“message 1”, “message 2”, “message N”) that wraps back to 0.
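To make those three patterns concrete, here is a sketch of a made-up message layout that uses all of them. The field sizes, byte order, and use of CRC-32 are assumptions for illustration only and say nothing about the actual contest file.

```python
import struct
import zlib

# Hypothetical layout, invented for illustration (big-endian):
#   2-byte counter | 2-byte payload length | 4-byte CRC32 of those 4 header bytes
#   payload bytes  | 4-byte CRC32 of the payload
HEADER = struct.Struct(">HH")
CRC = struct.Struct(">I")

def pack_message(counter, payload):
    head = HEADER.pack(counter % 65536, len(payload))  # counter wraps back to 0
    return head + CRC.pack(zlib.crc32(head)) + payload + CRC.pack(zlib.crc32(payload))

def unpack_messages(stream):
    """Return (counter, payload) pairs, raising ValueError on a checksum mismatch."""
    messages, offset = [], 0
    while offset < len(stream):
        head = stream[offset:offset + HEADER.size]
        counter, length = HEADER.unpack(head)
        offset += HEADER.size
        (head_crc,) = CRC.unpack_from(stream, offset)
        offset += CRC.size
        payload = stream[offset:offset + length]
        offset += length
        (data_crc,) = CRC.unpack_from(stream, offset)
        offset += CRC.size
        if zlib.crc32(head) != head_crc or zlib.crc32(payload) != data_crc:
            raise ValueError(f"checksum mismatch in message {counter}")
        messages.append((counter, payload))
    return messages

stream = b"".join(pack_message(65534 + i, bytes([i] * 3)) for i in range(4))
print(unpack_messages(stream))
```

When staring at an unknown file, the things to look for are the artifacts this layout leaves behind: a field that increments by one at a regular stride, and high-entropy 2- or 4-byte values sitting right after otherwise structured regions.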

anonymousaisafety Jun 26, 2022, 11:32 PM
1 point
in reply to: Alexander’s comment on: Alexander’s Shortform
You should add computational complexity.

anonymousaisafety Jun 25, 2022, 6:17 PM
1 point
in reply to: paulfchristiano’s comment on: Air Conditioner Test Results & Discussion
I’m not sure if your comment is disagreeing with any of this. It sounds like we’re on the same page about the fact that exact reasoning is prohibitively costly, and so you will be reasoning approximately, will often miss things, etc.
I agree. The term I’ve heard to describe this state is “violent agreement”.
so in practice wrong conclusions are almost always due to a combination of both “not knowing enough” and “not thinking hard enough” / “not being smart enough.”
The only thing I was trying to point out (maybe more so for everyone else reading the commentary than for you specifically) is that it is perfectly rational for an actor to “not think hard enough” about some problem and thus arrive at a wrong conclusion (or correct conclusion but for a wrong reason), because that actor has higher priority items requiring their attention, and that puts hard time constraints on how many cycles they can dedicate to lower priority items, e.g. debating AC efficiency. Rational actors will try to minimize the likelihood that they’ve reached a wrong conclusion, but they’ll also be forced to minimize or at least not exceed some limit on allowed computation cycles, and on most problems that means the computation cost + any type of hard time constraint is going to be the actual limiting factor.
Although even that, I think that’s more or less what you meant by
in some sense you’ve probably spent too long thinking about the question relative to doing something else
In engineering R&D we often do a bunch of upfront thinking at the start of a project, and the goal is to identify where we have uncertainty or risk in our proposed design. Then, rather than spend 2 more months in meetings debating back-and-forth who has done the napkin math correctly, we’ll take the things we’re uncertain about and design prototypes to burn down risk directly.

anonymousaisafety Jun 25, 2022, 5:31 PM
2 points
in reply to: Lone Pine’s comment on: Conor Sullivan’s Shortform
First, it only targeted Windows machines running Microsoft SQL Server that were reachable via the public internet. I would not be surprised if ~70% or more of theoretically reachable targets were not infected because they ran some other OS (e.g. Linux) or other server software (e.g. MySQL). This page makes me think the market share was actually more like 15%, so 85% of servers were not impacted. By “not impacted”, I mean “not actively contributing to the spread of the worm”. They were, however, impacted by the denial-of-service caused by traffic from infected servers.
Second, the UDP port (1434) that the worm used could be trivially blocked. I have discussed network hardening in many of my posts. The easiest way to prevent yourself from getting hacked is to not let the hacker send traffic to you. Blocking IP ranges, ports, unneeded Ethernet or IP protocols, and using the other options available in both network hardware (routers) and software firewalls provides a low-cost and highly effective way to do so. This contained the denial-of-service.
Third, the worm’s attack only persisted in RAM, so the only thing a host had to do was restart the infected application. Combined with the second point, this would prevent the machine from being reinfected.
This graph^[1] shows the result of widespread adoption of filter rules within hours of the attack being detected.
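As a toy illustration of how cheap that kind of filter rule is, here is a sketch of a drop rule for UDP port 1434. The `Packet` type and `should_drop` function are hypothetical names for this example; a real deployment would use the router's or firewall's own rule syntax rather than Python.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    protocol: str   # "udp" or "tcp"
    dst_port: int

# The only rule needed for this worm: drop UDP traffic destined for port 1434.
BLOCKED = {("udp", 1434)}

def should_drop(packet: Packet) -> bool:
    """Return True if the packet matches a blocked (protocol, port) pair."""
    return (packet.protocol.lower(), packet.dst_port) in BLOCKED
```

The rule is a single set lookup per packet, which is why it could be rolled out widely within hours.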
1. ^
  https://cseweb.ucsd.edu//~savage/papers/IEEESP03.pdf

anonymousaisafety Jun 24, 2022, 6:28 PM
4 points
in reply to: paulfchristiano’s comment on: Air Conditioner Test Results & Discussion
This was actually a kind of fun test case for a priori reasoning. I think that I should have been able to notice the consideration denkenberger raised, but I didn’t think of it. In fact when I started reading his comment my immediate reaction was “this methodology is so simple, how could the equilibrium infiltration rate end up being relevant?” My guess would be that my a priori reasoning about AI is wrong in tons of similar ways even in “simple” cases. (Though obviously the whole complexity scale is shifted up a lot, since I’ve spent hundreds of hours thinking about key questions.)
This idea—that you should have been able to notice the issue with infiltration rates—is what I’ve been questioning when I ask “what is the computational complexity of general intelligence” or “what does rational decision making look like in a world with computational costs for reasoning”.
There is a mindset that people are simply not rational enough, and if they were more rational, they wouldn’t fall to those traps. Instead, they would more accurately model the situation, correctly anticipate what will and won’t matter, and arrive at the right answer, just by exercising more careful, diligent thought.
My hypothesis is that whatever that optimal “general intelligence” algorithm^[1] is (the one where you reason a priori from first principles, then exhaustively check all of your assumptions for which one might be wrong, and then recursively use that checking to re-reason from first principles), it is computationally inefficient enough that for most interesting^[2] problems, it is not realistic to assume that it can run to completion in any reasonable^[3] time with realistic computation resources, e.g. a human brain, or a supercomputer.^[4]
I suspect that the human brain is implementing some type of randomized vaguely-Monte-Carlo-like algorithm when reasoning, which would explain why (1) people can often solve problems in a reasonable amount of time^[5], (2) people often miss factors during a priori reasoning but understand them easily after they’ve seen them confirmed experimentally, (3) different people miss different things, (4) someone who continues to think about a problem for an arbitrarily long period of time^[6] will often continue to generate insights, and (5) the insights generated that way are often only loosely correlated^[7].
In that world, while it is true that you should have been able to notice the problem, there is no guarantee on how much time it would have taken you to do so.
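A toy sketch of the contrast I have in mind, under heavy assumptions: the “factors” are just integers, `matters` is a made-up predicate for whether a factor is relevant, and `budget` is the number of checks you can afford. It is not a model of how brains actually work.

```python
import random

def ordered_check(factors, matters, budget):
    """Check factors in a fixed order; give up when the budget runs out."""
    for checks, factor in enumerate(factors[:budget], start=1):
        if matters(factor):
            return factor, checks
    return None, min(budget, len(factors))

def random_check(factors, matters, budget, rng):
    """Check a random sample of factors; different seeds miss different things."""
    sample = rng.sample(factors, min(budget, len(factors)))
    for checks, factor in enumerate(sample, start=1):
        if matters(factor):
            return factor, checks
    return None, len(sample)

factors = list(range(1000))          # hypothetical list of candidate factors
matters = lambda f: f == 900         # the one factor that actually matters

print(ordered_check(factors, matters, budget=50))
print(random_check(factors, matters, budget=50, rng=random.Random(1)))
```

With a small budget the ordered search never reaches the relevant factor, while the randomized one sometimes finds it and sometimes does not, depending on the seed. Both are guaranteed to find it only when the budget covers the whole list, which is the “no guarantee on how much time” point.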
1. ^
  The “God algorithm” for reasoning, to use a term that Jeff Atwood wrote about in this blog post. It describes the idea of an optimal algorithm that isn’t possible to actually use, but the value of thinking about that algorithm is that it gives you a target to aim towards.
2. ^
  The use of the word “interesting” is intended to describe the nature of problems in the real world, which require institutional knowledge, or context-dependent reasoning.
3. ^
  The use of the word “reasonable” is intended to describe the fact that if a building is on fire and you are inside of it, you need to calculate the optimal route out of that burning building in less than a few minutes in order to maximize your chance of survival. Likewise, if you are tasked to solve a problem at work, you have somewhere between weeks and months to show progress or be moved to a separate problem. For proving a theorem, it might be reasonable to spend 10+ years on it if there’s nothing necessitating a more immediate solution.
4. ^
  This is mostly based on an observation that for any scenario with, say, some fixed number of “obvious” factors influencing it, there are effectively arbitrarily many “other” factors that may influence the scenario. The process of deterministically ordering an arbitrarily long list, and then proceeding down the list from “most likely to impact the scenario” to “least likely to impact the scenario” to manually check whether each “other” factor actually does matter, has an arbitrarily high computational cost.
5. ^
  Feel free to put “solve” in quotes and read this as “halt in a reasonable time” instead. Getting the correct answer is optional.
6. ^
  Like mathematical proofs, or the thing where people take a walk and suddenly realize the answer to a question they’ve been considering.
7. ^
  It’s like the algorithm jumped from one part of solution space where it was stuck to a random, new part of the solution space and that’s where it made progress.

anonymousaisafety Jun 23, 2022, 3:32 AM
4 points
0
in reply to: alyssavance’s comment on: Let’s See You Write That Corrigibility Tag
I deliberately tried to focus on “external” safety features because I assumed everyone else was going to follow the task-as-directed and give a list of “internal” safety features. I figured that I would just wait until I could signal-boost my preferred list of “internal” safety features, and I’m happy to do so now—I think Lauro Langosco’s list here is excellent and captures my own intuition for what I’d expect from a minimally useful AGI, and that list does so in probably a clearer / easier to read manner than what I would have written. It’s very similar to some of the other highly upvoted lists, but I prefer it because it explicitly mentions various ways to avoid weird maximization pitfalls, like that the AGI should be allowed to fail at completing a task.

anonymousaisafety Jun 23, 2022, 2:16 AM
19 points
in reply to: DirectedEvolution’s comment on: Air Conditioner Test Results & Discussion
We can even consider the proposed plan (add a 2nd hose and increase the price by $20) in the context of an actual company.
The proposed plan does not actually redesign the AC unit around the fact that we now have 2 hoses. It is “just” adding an additional hose.
Let’s assume that the distribution of AC unit cooling effectively looks something like this graphic that I made in 3 seconds.

In this image, we are choosing to assume that yes, in fact, 2-hose units are more efficient on average than a 1-hose unit. We are also recognizing that perhaps there is some overlap. Perhaps there are especially bad 2-hose units, and especially good 1-hose units.
Based on all of the evidence, I’m going to say that the average 1-hose unit does represent the minimum efficiency needed for cooling in an average consumer’s use-case—i.e. it is sufficient for their needs.
When I consider what would make a 2-hose unit good or bad, I suspect it has a lot to do with how much of the design is built around the fact that there are 2-hoses.
In your proposal, we simply add a 2nd hose to a unit that was otherwise designed functionally as a 1-hose unit. Let’s consider where that might be plotted on this graph.
I’m going to claim based on vague engineering intuition / judgment / experience that it goes right here.

If I am right about where this proposal falls against the competition, then here’s what we’ve done:
1. This is not a 1-hose unit any more. Despite it being more efficient than the average 1-hose units, and only slightly more expensive, consumers looking at 1-hose units (because they are concerned about cost) will not see this model. The argument that it is “only $20 more expensive” is irrelevant. Their search results are filtered, they read online that they wanted a one-hose unit, this product has been removed from their consideration.
2. This is a bad 2-hose unit. It is at the bottom of the efficiency scale, because other 2-hose units were actually designed to take full advantage of the 2-hoses. They will beat you on efficiency, even if they cost more. Wirecutter will list this in the “also ran” when discussing 2-hose units, “So and so sells a 2-hose model, but it was barely more efficient than a 1-hose, we cannot recommend it”.
3. A consumer looking at 2-hose units is already selecting for efficiency over cost, so they will not buy the “just add another hose” 2-hose unit, since it is on the wrong end of the 2-hose distribution.
4. You will acquire a reputation as the company that sells “cheap” products—your unit is cheaper than other 2-hose units, but isn’t better because it wasn’t designed as a 2-hose unit, and it was torn apart by reviewers.
5. Fixing this inefficiency requires actually designing around 2-hoses, which likely results in something like this
“Minimum viable”, in the context of a “minimum viable product” or MVP, is a term in engineering that represents the minimal thing that a consumer will pay to acquire. This is a product that can actually be sold. It’s not the literal worst in its category, and it has a clear supremacy over cheaper categories. This is also called table stakes. Reviewers will consider it fairly, consumers will not rage review it, etc.
However, it’s probably also a lot more expensive than the hypothetical “only $20 more” that has been repeatedly stated.
Even in the scenario where a reviewer does consider the “just add another hose” model when viewing one-hose units, we’ve already established that the one-hose unit is cheaper (by $20! if it’s a $200 unit, that’s 10%), and that the average 1-hose unit is sufficient for some average use-case. Therefore the rational consumer choice is to buy the cheaper one-hose anyway, because it’s irrational to pay more for efficiency that isn’t needed!^[1]^[2]
1. ^
  The exception here is some hypothetical consumer who knows, for a fact, that their unique situation requires a two-hose unit, e.g. they tried a one-hose unit already and it was insufficient.
2. ^
  There’s also an argument here that a rational option is to buy a 1-hose unit, and then if you need slightly more efficiency, just buy & wrap the 1-hose with insulation, as described here. This allows the consumer to purchase at the lower price point and then add efficiency if needed for the cost of the insulation. It’s unclear to me that the “just add another hose” AC would still perform better than an insulated 1-hose.

anonymousaisafety Jun 23, 2022, 12:20 AM
1 point
in reply to: Said Achmiz’s comment on: Air Conditioner Test Results & Discussion
I didn’t even think to check this math, but now that I’ve gone and tried to calculate it myself, here’s what I got:
| | TEMP / Δ (°F) | INSIDE (°F) | ΔINSIDE (CONTROL) |
|---|---|---|---|
| AVERAGE OUTSIDE | 86.5 | | |
| AVERAGE ONE HOSE Δ | 19.65 | 66.85 | 6.55 |
| AVERAGE TWO HOSE Δ | 22.45 | 64.05 | 9.35 |
| CONTROL Δ | 13.1 | 73.4 | |
| ΔTWO/ΔONE | | | 1.42 |
EDIT: I see the issue. The parent post says that the control test was done in the evening, when the temperature was 82 F. So it’s not even comparable at all, imo.
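For anyone who wants to re-run the arithmetic, here is a short sketch that recomputes the derived columns from the three average deltas in my table. The variable names are mine, and the caveat from the edit still applies: the control was measured at a different outside temperature, so the last two quantities are shaky.

```python
OUTSIDE_F = 86.5  # average outside temperature used in the table

# Average temperature drops (outside minus inside), taken directly from the table.
deltas = {"one hose": 19.65, "two hose": 22.45, "control": 13.1}

# Inside temperature implied by each drop.
inside = {name: round(OUTSIDE_F - d, 2) for name, d in deltas.items()}

# Extra cooling beyond what the control achieved with no AC running.
vs_control = {
    name: round(d - deltas["control"], 2)
    for name, d in deltas.items()
    if name != "control"
}

ratio = vs_control["two hose"] / vs_control["one hose"]
print(inside, vs_control, ratio)
```

The ratio comes out to roughly 1.427, which is the 1.42 shown in the table.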

anonymousaisafety Jun 23, 2022, 12:03 AM
1 point
in reply to: Dustin’s comment on: Air Conditioner Test Results & Discussion
I’ll edit the range, and note that “uncomfortably hot” is my opinion. Rest of my analysis / rant still applies. In fact, in your case, you don’t need the AC unit at all, since you’d be fine with the control temperature.

anonymousaisafety Jun 22, 2022, 11:12 PM
83 points
0
on: Air Conditioner Test Results & Discussion
I take issue with your primary conclusion, for the same reasons I gave in the first thread:
1. You claim how little adding a 2nd hose would impact the system, without analyzing the actual constraints that apply to engineers building a product that must be shipped & distributed
2. You still neglect the existence of insulating wraps for the hose which do improve efficiency, but are also not sold with the single-hose AC system, which lends evidence to my first point—companies are aware of small cost items that improve AC system efficiency, but do not include them with the AC by default, suggesting that there is an actual price point / consumer market / confounding issue at play that prevents them doing so
The full posts, quoted here for convenience:
I think one reason that this error occurs is that there’s a mistaken assumption that the available literature captures all institutional knowledge on a topic, so if one simply spends enough time reading the literature, they’ll have all requisite knowledge needed for policy recommendations. I realize that this statement could apply equally to your own claims here, but in my experience I see it happen most often when someone reads a handful of the most recently released research papers and from just that small sample of work tries to draw conclusions that are broadly applicable to the entire field.
Engineering claims are particularly suspect because institutional knowledge (often in the form of proprietary or confidential information held by companies and their employees) is where the difference between what is theoretically efficient and what is practically more efficient is found. It doesn’t even need to be protected information though—it can also just be that due to manufacturing reasons, or marketing reasons, or some type of incredibly aggravating constraint like “two hoses require a larger box and the larger box pushes you into a shipping size with much higher per-volume / mass costs so the overall cost of the product needs to be non-linearly higher than what you’d expect would be needed for a single hose unit, and that final per-unit cost is outside of what people would like to pay for an AC unit, unless you then also make drastic improvements to the motor efficiency, thermal efficiency, and reduce the sound level, at which point the price is now even higher than before, but you have more competitive reasons to justify it which will be accepted by a large enough % of the market to make up for the increased costs elsewhere, except the remaining % of the market can’t afford that higher per-unit cost at all, so we’re back to still making and selling a one-hose unit for them”.
Concrete example while we’re on the AC unit debate—there’s a very simple way to increase efficiency of portable AC units, and it’s to wrap the hot exhaust hose with insulating duct wrap so that less of the heat on that very hot hose radiates directly back into the room you’re trying to cool. Why do companies not sell their units with that wrap? Probably for one of any of the following reasons—A.) takes up a lot of space, B.) requires a time investment to apply to the unit which would dissuade buyers who think they can’t handle that complexity, C.) would cost more money to sell and no longer be profitable at the market’s price point, D.) has to be applied once the AC unit is in place, and generally is thick enough that the unit is no longer “portable” which during market testing was viewed as a negative by a large % of surveyed people, or E.) some other equally trivial sounding reason that nonetheless means it’s more cost effective for companies to NOT sell insulating duct wrap in the same box as the portable AC unit.
Example of an AC company that does sell an insulating wrap as an optional add-on: https://www.amazon.com/DeLonghi-DLSA003-Conditioner-Insulated-Universal/dp/B07X85CTPX
EDIT: I want to make a meta point here, which is that I have not personally worked on ACs, but I have built & shipped multiple products to consumers, and the type of stupid examples I gave in the first AC post are not just made-up for fun. Engineers argue extensively in meetings about “how can we make product A better”, and ideas get shot down for seemingly trivial reasons that basically come down to—yes, in a vacuum, that would be better, but unfortunately, there’s a ton of existing context like how large a truck is or what parts can actually be bought off the shelf that kneecap those ideas before they leave the design room. The engineers who designed the AC were not idiots, or morons, or clowns who don’t understand thermodynamic efficiency. Engineering is about working around limitations. Those limitations do not have to be rooted in physics; society or infrastructure or consumer behavior around critical price points can all be just as real in terms of what it is feasible for a company to create. Just look at how many startups fail and the founder claims in a postmortem, “Yeah, our tech was way better, but unfortunately people wouldn’t pay 10% more for it, even though it was AMAZING compared to our competitor. We just couldn’t get them to switch.”
EDIT 2: I’m pretty annoyed that you doubled-down on your conclusion even after admitting the actual efficiency difference was significantly less than expected, and then chose a different analysis to let you defend your original point anyway, so these edits might keep coming. Regarding market pressures, two-hose AC units do exist. Companies do sell them, and if consumers want to buy a two-hose AC unit, they can do so. But the presence of both one-hose AC units and two-hose AC units in the market tells us it is not winner-take-all and there is consumer behavior, e.g. around price or complexity, that prevents two-hose units from acquiring literally all market share. So until that changes, it will always be more rational for companies to sell one-hose AC units in addition to their two-hose AC unit, because otherwise they’d be leaving money on the floor by only servicing part of the consumer market. (EDIT 5: see also this post, which was itself a reply to AllAmericanBreakfast’s reply on this thread here)
EDIT 3: Let’s look at your math. Outdoor temp is 85-88 F, let’s just take the average and call it 86.5 F. That’s pretty hot. I’d definitely be uncomfortable in that scenario. How cold did the AC cool the rooms? You say on low fan it was 20.6 F degrees with one hose, 22.7 F with two hoses, and then on high fan, 18.3 F with one hose, and 22.2 F with two hoses. The control was 13.1 F. Looking at the control, that gives a room temperature of ~73.4 F. That is uncomfortably hot in my opinion. I keep my room temperature around 68-70 F ish. The internet tells me that this is within the window of a “comfortable room temperature” defined as 67-75 F^[1], so I’m just a normal human, I guess. How well did the ACs accomplish that? With one hose, you got it down to ~66 F, and with two hoses, you had it down to about ~64 F. That is pretty cold in my mind—I would not set my AC that low if it actually reached that temperature. What does this mean? The one hose unit literally did the job it was designed to do. With an incredibly hot outside temperature, that resulted in an uncomfortable indoor “control” temperature, the one-hose AC was able to lower the temperature to a comfortable, ideal range, and then go below that, showing it even has margin left over. But now you’re saying that they should make the thing more expensive and optimize it for even greater efficiency because … why!? It works!
EDIT 4: I will die on this hill. This is the problem with how the rationalist community approaches the concept of what it means to “make a rational decision” perfectly demonstrated in a single debate. You do not make a “rational decision” in the real world by reasoning in a vacuum. That is how you arrive at a hypothetically good action, but it is not necessarily feasible or possible to perform, so you always need to check your analysis by looking at real world constraints and then pick the action that is 1.) actually possible in the real world, and 2.) still has the highest expected value. Failing to do that is not more clever or more rational, it is just a bad, broken model for how an ideal, optimal agent would behave. An optimal agent doesn’t ignore their surroundings—they play to them, exploit them, use them.
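The decision rule in EDIT 4 is simple enough to write down. This is a minimal sketch; the `Action` type, the example actions, and their expected values are all hypothetical numbers picked for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    expected_value: float
    feasible: bool  # does it survive real-world constraints (cost, shipping, market)?

def best_feasible(actions):
    """Filter to feasible actions first, then take the highest expected value."""
    candidates = [a for a in actions if a.feasible]
    if not candidates:
        return None
    return max(candidates, key=lambda a: a.expected_value)

# Hypothetical options; the "ideal" one scores highest in a vacuum but cannot be built.
options = [
    Action("ideal design, ignoring constraints", 10.0, feasible=False),
    Action("ship the one-hose unit", 6.0, feasible=True),
    Action("bolt on a second hose", 4.0, feasible=True),
]
print(best_feasible(options).name)
```

Reasoning in a vacuum amounts to skipping the filter and taking the maximum over everything, which picks the option that cannot actually be carried out.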
1. ^
  I averaged the following lower / upper temperatures.
  Wikipedia: 64-75
  www.cielowigle.com: 68-72
  www.vivint.com: 68-76
  www.provicincialheating.ca: 68-76
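The averaging in the footnote, plus the room temperatures implied by the figures quoted in EDIT 3, can be checked with a few lines. This is only a sketch of the arithmetic; the dictionary keys and variable names are mine.

```python
# Lower / upper comfortable temperatures (F) from the four sources in the footnote.
ranges = {
    "Wikipedia": (64, 75),
    "www.cielowigle.com": (68, 72),
    "www.vivint.com": (68, 76),
    "www.provicincialheating.ca": (68, 76),
}
low = sum(lo for lo, _ in ranges.values()) / len(ranges)
high = sum(hi for _, hi in ranges.values()) / len(ranges)

OUTSIDE_F = 86.5  # midpoint of the 85-88 F outdoor range

# Temperature drops (F) quoted in EDIT 3.
drops = {
    "one hose, low fan": 20.6,
    "two hoses, low fan": 22.7,
    "one hose, high fan": 18.3,
    "two hoses, high fan": 22.2,
    "control": 13.1,
}
room = {name: round(OUTSIDE_F - d, 1) for name, d in drops.items()}
print(low, high, room)
```

The averaged window is 67 to 74.75 F, which I rounded to 67-75 F. The one-hose low-fan figure gives the ~66 F quoted above, and both two-hose figures land at about 64 F.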
