Supply side: It approaches the minimum average total, not marginal, cost. Maybe if people accounted for it more finely (e.g., charging themselves “wages” and “rent”), cooking at home would be in the ballpark (assuming equal quality of inputs and outputs across venues), but that just illustrates how real costs can explain a lot of the differential without having to jump to regulation and barriers to entry (yes, those are nonzero too!).
Demand side: Complaints in the OP about the uninformativeness of ratings also highlight how far we are from perfect competition (also, e.g., heterogeneous products), so you can expect nonzero markups. We aren’t in equilibrium and in the long run we’re all dead, etc.
I’m a big proponent of starting with the textbook economic analysis, but I was surprised by the surprise. Let’s even assume perfect accounting and competition:
Draw a restaurant supply curve in the middle of the graph. In the upper right corner, draw a restaurant demand curve (high demand given all the benefits I listed). Equilibrium price is P_r*. Now draw a home supply curve to the far left, indicating an inefficient supply relative to restaurants (for the same quantity, restaurants do it “cheaper”). In the bottom left corner, draw a home demand curve (again the point is I demand eating out more than eating at home). Equilibrium price for those is P_h*. It’s very easy to draw where P_h* < P_r*.
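To make the drawing concrete, here is a minimal sketch with linear curves. Every number is made up for illustration (the intercepts, slopes, and the helper name `equilibrium` are my assumptions, not estimates from anywhere); the only point is that a home supply curve shifted left plus weaker home demand can easily give P_h* < P_r*.

```python
def equilibrium(demand_intercept, demand_slope, supply_intercept, supply_slope):
    """Find where linear demand and supply cross.

    Demand: Q = a - b*P, Supply: Q = c + d*P.
    Setting them equal gives P* = (a - c) / (b + d).
    Returns (price, quantity).
    """
    price = (demand_intercept - supply_intercept) / (demand_slope + supply_slope)
    quantity = demand_intercept - demand_slope * price
    return price, quantity


# Restaurants: high demand, efficient supply (hypothetical numbers).
p_r, q_r = equilibrium(100, 2, 10, 3)

# Home cooking: lower demand, and a supply curve shifted left
# (less quantity offered at any given price than restaurants).
p_h, q_h = equilibrium(40, 2, -20, 3)

print(p_r, q_r)  # 18.0 64.0
print(p_h, q_h)  # 12.0 16.0
```

With these toy parameters the home equilibrium price sits below the restaurant one even though home supply is the less efficient of the two, because the demand curves differ as well.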
Daniel V
Cooking at Home Being Cheaper is Weird
I like the argument that the scaling should make the average marginal cost per plate lower in restaurants than at home, but I find cooking at home being cheaper not weird at all. First, there are also real fixed costs to account for, not just regulatory costs.
More importantly, the average price per plate is not just a function of costs, it’s a function of the value that people receive. Cooking at home does give some nice benefits, but eating out gives some huge ones: essentially leisure, time savings (a lot of things get prepped before service), no dishes, and possibly lower search costs (“what’s for dinner tonight?”).
A classic that seemingly will have to be reargued til the end of time. Other allocation methods are not clearly more egalitarian and are less efficient (depends on the correlation matrix of WTP, need, time budget, etc., plus one’s own judgment of fairness, but money prices come out looking great a lot of the time). In some cases, even prices don’t perform great (addressed in some comments on this post), but they’re better than the alternatives.
For more reading: https://www.lesswrong.com/posts/gNodQGNoPDjztasbh/lies-damn-lies-and-fabricated-options?commentId=nG2X7x3n55cb3p7yB
To get Robin worried about AI doom, I’d need to convince him that there’s a different metric he needs to be tracking
That, or explain the factors/why Robin should update his timeline for AI/computer automation taking “most” of the jobs.
AI Doom Scenario
Robin’s take here strikes me both as an uncooperative thought-experiment participant and as a decently considered position. It’s like he hasn’t actually skimmed the top doom scenarios discussed in this space (and that’s coming from me...someone who has probably thought less about this space than Robin) (also see his equating corporations with superintelligence—he’s not keyed into the doomer use of the term and not paying attention to the range of values it could take).
On the other hand, I find some affinity with my own skepticism of AI doom; my vibe is that the affinity lies in the notion that authorization lines will be important.
On the other other hand, once the authorization bailey is under siege by the superhuman intelligence aspect of the scenario, Robin retreats to the motte that there will be billions of AIs and (I guess unlike humans?) they can’t coordinate. Sure, corporations haven’t taken over the government and there isn’t one world government, but in many cases, tens of millions of people coordinate to form a polity, so why would we assume all AI agents will counteract each other?
It was definitely a fun section and I appreciate Robin making these points, but I’m finding myself about as unassuaged by Robin’s thoughts here as I am by my own.
Robin: We have this abstract conception of what it might eventually become, but we can’t use that abstract conception to do very much now about the problems that might arise. We’ll need to wait until they are realized more.
When talking about doom, I think a pretty natural comparison is nuclear weapon development. And I believe that analogy highlights how much more right Robin is here than doomers might give him credit for. Obviously a lot of abstract thinking and scenario consideration went into developing the atomic bomb, but also a lot of safeguards were developed as they built prototypes and encountered snags. If Robin is so correct that no prototype or abstraction will allow us to address safety concerns, so we need to be dealing with the real thing to understand it, then I think a biosafety analogy still helps his point. If you’re dealing with GPT-10 before public release, train it, give it no authorization lines, and train people (plural) studying it to not follow its directions. In line with Robin’s competition views, use GPT-9 agents to help out on assessments if need be. But again, Robin’s perspective here falls flat and is of little assurance if it just devolves into “let it into the wild, then deal with it.”
A great debate and post, thanks!
Paper from the Federal Reserve Bank of Dallas estimates 150%-300% returns to government nondefense R&D over the postwar period on business sector productivity growth. They say this implies underfunding of nondefense R&D, but that is not right. One should assume decreasing marginal returns, so this is entirely compatible with the level of spending being too high. I also would not assume conditions are unchanged and spending remains similarly effective.
At low returns, you might question whether it’s good enough to invest more compared to other options (e.g., at 5%, maybe simply not incurring the added deficit to be financed at 5% is arguably preferable; at 7%, maybe your value function is such that simply not incurring the added deficit to be financed at 5% is arguably preferable), but at such high returns, unless you think the private sector is achieving a ballpark level of marginal returns, invest, baby, invest! The marginal returns would have to be insanely diminishing for it not to make sense to invest more, which implies we’re investing at just about the optimal level (if the marginal return of the next $1 were 0%, we shouldn’t invest more, but we shouldn’t invest less either because our current marginal return is 150%). Holding skepticism about the estimated return itself would be a different story.
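A rough back-of-the-envelope sketch of what “insanely diminishing” would have to mean. The functional form is purely my assumption for illustration: suppose the marginal return falls off as a power law in spending, m(x) = m0 * (x / x0)^(-k), where x0 is current spending, m0 is the current marginal return (150%, the low end of the estimate), and the hurdle rate is the 5% financing cost from the example above. The function names are hypothetical.

```python
import math


def optimal_scale(current_marginal_return, hurdle_rate, decay_exponent):
    """Multiple of current spending (x*/x0) at which the marginal return
    has fallen to the hurdle rate, assuming m(x) = m0 * (x / x0) ** (-k)."""
    return (current_marginal_return / hurdle_rate) ** (1.0 / decay_exponent)


def required_decay(current_marginal_return, hurdle_rate, scale):
    """Decay exponent k needed for the optimum to sit at `scale` times
    current spending under the same assumed power-law form."""
    return math.log(current_marginal_return / hurdle_rate) / math.log(scale)


m0 = 1.50  # current marginal return, taken as 150%
r = 0.05   # hurdle rate: cost of financing the added deficit

# If marginal returns halve every time spending doubles (k = 1),
# the optimum is roughly 30x current spending.
print(optimal_scale(m0, r, 1.0))

# How steep must the decay be for the optimum to be only 10% above today?
print(required_decay(m0, r, 1.10))  # about 35.7
```

Under this assumed form, spending being “just about optimal” already requires a decay exponent in the mid-30s, which is the sense in which the returns would have to fall off a cliff. A different functional form would give different numbers, so treat this as a sanity check rather than an estimate.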
That is an additional 15% of kids not sleeping seven hours
I was not aware of the concomitant huge drop in sleep (though it’s obvious in retrospect). Maybe it’s more important to limit screen time at night, when you’re alone in your room not sleeping. Being constantly lethargic as a result may also contribute to (and be one of the) depressive symptoms. It will be very important to figure out the mechanism(s) by which smartphone use hurts kids.
I agree, I was thinking more generally this isn’t a “poker” theory specifically, just one about rules and buy-in. But it’s about poker night, so I’ll let it slide. The main game rules, though, remain extraneous. Loved the post still!
Mira: You should be able to buy anything with a limit order.
“I don’t feel like paying $250 for an anime figurine, but I left an order up for $50”
If they saw 10,000 orders at a lower price rung …
As usual the answer is transaction costs
Agree and also perceptions. The idea here is to facilitate price discovery and price discrimination. If only we knew people’s WTP and could serve them lower prices acceptable to us when volume isn’t moving at the current price! We can adjust prices ad hoc, but maybe a little upfront market research would be better and an exchange might be smoother (subject to TCs). The flipside of this has the problem that consumers hate it [Reuters]. Also, hedging (see: futures markets) does happen in B2B, but with more sophisticated owners and larger businesses. The supply chain is constantly working to optimize inventory management (again, not mom-and-pops you see on save-my-business shows).
Why is turbulence worse on planes? The headlines blame it on ‘climate change.’ The actual answer is the FAA told airlines to prioritize saving fuel over passenger comfort, despite passengers having a strong revealed preference for spending the extra cost of fuel to have a more pleasant flight. This then became ‘because climate change.’ This kind of thing damages public trust in all such claims, making solving climate change (and everything else) that much harder.
There are benefits to optimized profile descents (fuel, time, reduced air traffic controller instructions, reduced noise over populated areas), which they did studies on to confirm since in high traffic airspace the stepwise approach can be easier for ATC. This change could conceivably increase turbulence on approach but would not explain the increase that “the narrative” is attributing to increased wind shear at higher altitudes.
I agree with Neil here: if you identify with your flaws, that is bad. By definition. If you are highly analytical and you identify with it, great, regardless of whether other people see it as a flaw. Like you said, and as Neil’s reply in the footnote says, if it’s a goal, then it is not a flaw. But if you say it is a personal flaw, then either you shouldn’t be adopting it into your identity (you don’t even have to try to fix it, noble as that would be, but you don’t get to say “I’m the bad-at-math-person, it’s so funny and quirky, and I just led my small business and partners into financial ruin with an arithmetic mistake,” life is not a sitcom) or maybe you don’t really see it as a flaw after all. Either way, something is wrong, either in your priorities or the reliability of your self-reports. And, yeah, this topic involves value judgments. If nothing has valence, then the notion of a flaw would not exist.
I quite appreciate the post’s laying things out, but it’s not convincing regarding Scott’s post (it’s not bad either, just not convincing!) because it doesn’t offer much more than “no, you’re wrong.” The crux of the argument presented here is taking the word disability, which to most speakers means X and implies Y, and breaking it into an impairment, which means X, and a disability, which is Y. Scott says this is wrong and explains why he thinks so. DirectedEvolution says Scott is wrong “because the definitions say...” but that’s exactly what Scott is complaining about.
For example, if you’re short-sighted, normally we’d say “you have a disability (or impairment or handicap, etc., they’re interchangeable) of your vision so that means you will struggle with reading road signs.” Instead, the social model entails saying “you have an impairment of your vision so that means, because of society, you will be disabled when it comes to reading road signs.”
We can debate which view is more useful (and for what purposes). Scott thinks the social model is useful to promote accommodations since it separates the physical condition from the consequences (whether it produces negative consequences depends on society). He thinks the Szasz-Caplan model is useful to deny accommodations since it separates the mental condition (i.e., preferences, in that model) from the consequences (whether it produces negative consequences depends on will). More importantly, he thinks the social model is “slightly wrong about some empirical facts” (what empirical facts? DirectedEvolution is correct that Scott’s argumentation is a bit soft...he benefits greatly from arguing the layperson side) in that in some cases it feels absurd to pin blame on society for the consequences of some impairments (e.g., Mt. Everest). And on that your layperson (and I) would agree with him. DirectedEvolution offers no counterpoint on that (which is the primary argument), but the post DOES provide a key benefit:
Adopting separate definitions for impairment and disability IS NOT strictly equivalent to adopting the social model. One could restate short-sightedness: “you have an impairment of your vision so that means you will be disabled when it comes to reading road signs.” This drops the blame game and allows for impairments to disable people outside of societies. In fact, Scott accidentally endorsed it [added by me]: “the blind person’s inability to drive [disability] remains due to their blindness [impairment], not society.” So perhaps the crux of Scott’s argument is not about using two definitions but about whether disability ought to be defined as stemming from society! And in fact that’s evident in Scott’s post. However, Scott’s post DID also, at times, imply that one definition would suffice.
This post made me update toward two definitions potentially being useful, but it did not make me update away from endorsing Scott’s main point, that disability ought not be defined as stemming from society.
As an aside: the two definitions are still debatable though. Suppose someone has an impairment that has not generated, nor ever will generate, a disability. How is this not the same as “there exists variability”? If someone has perfect vision and I am short-sighted but we live in a dome with a 5-foot diameter such that I can see just fine, and no one tells me my lived experience could be better, how could you even call that an impairment? Is it an impairment if I realize that my vision could be better? Is that other person impaired if they realize their vision could be improved above “normal”? “Impairment” could just refer to being low on the spectrum of natural human variability in some capability, but how low is low enough? “So low that it starts to interfere...” is bringing disability into the mix. What capabilities count? Certainly not “reading road signs” as that would be in the realm of disability, but what level of specificity is appropriate? Short-sightedness is not an impairment of seeing near objects, it’s an impairment of seeing far objects, so that is to say, not vision generally. But once you get specific enough, it’s back to sounding like a disability: “your far object vision is impaired so you are disabled at seeing far objects.”
It’s very interesting to see the intuitive approach here and there is a lot to like about how you identified something you didn’t like in some personality tests (though there are some concrete ones out there), probed content domains for item generation, and settled upon correlations to assess hanging-togetherness.
But you need to incorporate your knowledge from reading about scale development and factor analysis. Obviously you’ve read in that space. You know you want to test item-total correlations (trait impact), multi-dimensionality (factor model loss), and criterion validity (correlation with lexical notion). Are you trying to ease us in with a primer (with different vocabulary!) or reinvent the wheel?
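For readers without that vocabulary, a corrected item-total correlation can be sketched in a few lines. This is only an illustration under assumptions of mine: the responses are invented 1-5 Likert data (not the post’s data), the function names are hypothetical, and the post’s “trait impact” statistic may well be computed differently.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def corrected_item_total(responses, keys, scale_min=1, scale_max=5):
    """Correlate each item with the sum of the OTHER items.

    responses: one row per respondent, one column per item.
    keys: +1 for positively keyed items, -1 for reverse-keyed items.
    """
    # Reverse-score negatively keyed items so all items point the same way.
    scored = [
        [v if k > 0 else scale_min + scale_max - v for v, k in zip(row, keys)]
        for row in responses
    ]
    results = []
    for i in range(len(keys)):
        item = [row[i] for row in scored]
        rest = [sum(row) - row[i] for row in scored]  # total excluding item i
        results.append(pearson(item, rest))
    return results


# Invented data: items 1-3 track one trait (item 3 reverse-keyed),
# item 4 is unrelated to the others.
responses = [
    [1, 1, 5, 3],
    [2, 2, 4, 1],
    [3, 3, 3, 5],
    [4, 4, 2, 1],
    [5, 5, 1, 3],
]
keys = [+1, +1, -1, +1]

item_total = corrected_item_total(responses, keys)
print([round(r, 2) for r in item_total])
```

In this toy data the first three items correlate strongly with the rest of the scale while the fourth sits at essentially zero, which is the pattern you would expect from an item that does not tap the construct.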
Let’s start with the easy-goingness scale:
(+) In the evening I tend to relax and watch some videos/TV
(+) I don’t feel the need to arrange any elaborate events to go to in my free time
(+) I think it is best to take it easy about exams and interviews, rather than worrying a bunch about doing it right
(+) I think you’ve got to have low expectations of others, as otherwise they will let you down
(-) I get angry about politics
(-) I have a stressful job
(-) I don’t feel like I should have breaks at work unless I’ve “earned” them by finishing something productive
(-) I spent a lot of effort on parenting
The breadth of it is either a strength or a weakness. It’d be nice to have a construct definition or at least some gesturing at what easy-goingness actually is to gauge the face-validity of these items. Concrete items necessarily will have some domain-dependence, resulting in deficiency (e.g., someone who likes to relax and read a book will score low on item 1) or contamination (e.g., having low expectations of others might also be trait pessimism), but item 8 is really specific. It hampers the ability of this scale to capture easy-goingness among non-parents. The breadth would be good if it captured variations on easy-goingness, but instead it’d be bad if it just captures different things that don’t really relate to each other. That’s especially problematic because then the inference from low inter-correlations might not be that the construct is bad, but that the items just don’t tap into it. You can see where I’m going with this because...
This suggests to me that Easy-Goingness is not very “real”. While it might make sense to describe a person as doing something Easy-Going, for instance when they are watching TV, it is kind of arbitrary to talk about people as being more or less Easy-Going, because it depends a lot on context/what you mean.
...indeed, the items are mainly just capturing different things, not reflecting on easy-goingness in any way. From a scale-assessment standpoint, it’s great to see the results confirm my unease about the items based on simply reading them.
The fact that this is weak means that even the most Easy-Going people cannot necessarily be expected to be particularly Easy-Going in all contexts.
This statement presumes your measure reflects a higher-order easy-goingness and that context-specific easy-goingnesses are also being adequately measured.
With conservatism, on the other hand, you can see there is some context-specificity (e.g., dress vs. general social views vs. issue-based ideology), but the measure is facially better. And it hangs together better. Alternately, you might explore those contours and say you’ve come up with a multi-dimensional conservatism scale, just like you have a multi-dimensional creativity scale.
the “Correlation with lexical notion” was consistently close to 1, showing that the concrete and the abstract descriptors were getting at the same thing.
There’s an implicit “when the concrete descriptors actually had face validity” hidden here; low correlation with the lexical notion could indicate a problem with the lexical scale or a problem with the concrete scale, or both.
Overall, I am very impressed that you presented a scary chart to start, promised you’d explain it, and successfully did so. The general takeaway from it is that the lexical hypothesis could be pretty sound and a few of these might be multidimensional in nature (or it could be that some items are good and some are bad). For the low trait impact scales, it’s a question of whether the items are good and the construct isn’t “real,” or whether the items are just a bad measurement approach.
Who has an alternative hypothesis that explains this data? Anyone? Ooh ooh, pick me, pick me. Perhaps being depressed has something to do with your life being depressing, due to things like lack of human capital or job opportunities, life and career setbacks or alienation from one’s work. Income increases life satisfaction, as I assume does the prospect of future income.
It is amazing to see the ‘depression is purely a chemical imbalance unrelated to one’s physical circumstances’ attitude in this brazen a form. Mistaking correlation for causation here seems like a difficult mistake for a reasonable and reflecting person to make.
They measured depression at ages 27-35 in 1992 and outcomes at age 50. They control for “age, gender, race, for level of education by age 26, parental education, r marital status in 1992 survey, years of work experience accumulated by 1992 survey, the average percentage of weeks the person’s work history data is unaccounted for by 1992 survey, health status during childhood, a dummy for number of cigarettes consumed by 1992 survey, year indicators, local unemployment rate in 1992, 1998, 2004, and the year the person’s outcome variable is collected.”
So it’s not like they just correlated depression and wages from a cross-sectional survey and claimed causation. They did some work here.
It was a good post! To the extent that whatever I said was value-added or convincing to you, it was only because your quality post prompted me to lay it out.
And like you said, perhaps there is more here. Does a negative (vs. positive) frame make it harder to notice (or easier to forget) that there is a null hypothesis? Preliminary evidence in favor is that people who “own” the null will cede it in a negative frame, whereas they tend to retain it in a positive frame. More thinking/research may be needed though to feel confident about that (I say that as a scientist starting with the null effect of no difference, not as someone proponing the hypothesis of no difference).
“It’s not sufficient to be right in many contexts, you must also be rhetorically persuasive.” Spittin’ facts.
Going off localdeity’s comment, I think “arrogating the right to choose the null hypothesis” or as you said, “assuming the burden of proof” are more critical than whether the frame involves negations. If you want to win an argument, don’t argue, make the other person do the arguing by asking lots of questions, even questions phrased as statements, and then just say whatever claim they make isn’t convincing enough. Why should purple be better than green? An eminently reasonable question! But one whose answer will never have satisfactory support, unless you want it to. “I’m just asking questions.”
It’s good for you to point out that the true statement localdeity offered and your conclusion seem in contention. It is a weaker statement, so if you are being asked for your opinion, you may want to hedge with that negation. If you are actually trying to convince someone of something though (and this is why I think you rightly believe these are about subtly different things), that is not the way to do it. You could make the stronger claim, or alternately, you could phrase it as a question—“why shouldn’t we do anti-X?” (but notice it would also work without the negation: “why should we do X?”) and get them to do the arguing for you.
You’re not wrong, and I don’t disagree!
In the long run it seems pretty clear labor won’t have any real economic value
I’d love to see a full post on this. It’s one of those statements that rings true since it taps into the underlying trend (at least in the US) where the labor share of GDP has been declining. But *checks notes* that was from 65% to 60% and had some upstreaks in there. So it’s also one of those statements that, upon cognitive reflection, has a lot of ways to end up false: in an economy with labor crowded out by capital, what does the poor class have to offer the capitalists that would provide the basis for a positive return on their investment (or are they...benevolent butchers in the scenario)? Also, this dystopia just comes about without any attempts to regulate the business environment in a way that makes the use of labor more attractive? Like I said, I’d love to see the case for this spelled out in a way that allows for a meaningful debate.
As you can tell from my internal debate above, I agree with the other points—humans have a long history of voluntarily crippling our technology or at least adapting to/with it.
Thanks for writing this. I suppose the same could be said about any tool that you have suspicions might be inferior to another on the horizon in your lifetime. As quanticle said, some romance around self-crafting could support the psychological value of the labor. More importantly, I think there are in fact qualia pertinent to our quality evaluations that leave AI productions inferior in important ways to human work...currently. That gap will attenuate and we’ll hone our models to be better at producing in a wider spectrum of areas, too.
However, I don’t think it’s a foregone conclusion that no gap will remain. When the world of bits can’t quite recreate the world of atoms (efficiently), there will be a place for human labors (okay, even the boundaries for this are subject to change too but bear with me) - think of handwriting. What a pain! The tool has been replaced with word processing and printing for many written documents. But when I want to send a thank-you to a big client, printing just can’t recreate my ink-on-paper signature. An autopen could, but again it’s not at the level of efficiency where it is worth the widespread adoption that would snuff out human labor in that space.
By the way, I wonder if you took your inspiration and general plan for this essay, turned it into a prompt, and gave it to chatGPT, what it would produce (maybe there could be some honing of that by a prompt engineer, but whatever). To be fair, you could let chatGPT rewrite it a few times with edits like you would have done for yourself. I suspect it would not write as good a post, and that’s a good enough reason to bother doing it yourself.
(Also because the prompt to write with the style of a specific person only works when you have enough online content in the training data. So if you want a unique style, you need to write a lot before you can outsource. LOL)
Upvote for paragraph one, agree for paragraph two.
It’s a very narrow (but admittedly compelling) perspective to realize that in particularly bad situations, regulations can compound the badness. But there is plenty of room to debate regulations when it comes to typical cases, and it’s probably a better basis on which to evaluate them.
I’m here to say, this is not some property specific to p-values, just about the credibility of the communicator.
If scientists make a bunch of errors all the time, especially those that change their conclusions, indeed you can’t trust them. Turns out (BW11) that scientists published in better journals are more credible than scientists published in worse journals, the errors they make tend not to change the conclusions of the test (i.e., the chance of drawing a wrong conclusion from their data (“gross error” in BW11) was much lower than the headline rate), and (admittedly I’m going out on a limb here) it is very possible the errors that change the conclusion of a particular test do not change the overall conclusion about the general theory (e.g., if theory says X, Y, and Z should happen, and you find support for X and Y and marginal-support-now-not-significant-support-anymore for Z, the theory is still pretty intact unless you really care about using p-values in a binary fashion. If theory says X, Y, and Z should happen, and you find support for X and Y and now-not-significant-support-anymore for Z, that’s more of an issue. But given how many tests are in a paper, it’s also possible theory says X, Y, and Z should happen, and you find support for X and Y and Z, but turns out your conclusion about W reverses, which may or may not really have something to say about your theory).
I don’t think it is wise to throw the baby out with the bathwater.