How is it that someone like Graham could hold so strongly and express so eloquently the opinion that the most important lesson to unlearn is the desire to pass tests, and then create an institution like Y-Combinator?
An alternative hypothesis is that Graham has a bunch of tests/metrics that he uses to evaluate people, and those tests/metrics work much better when people do not try to optimize for doing well on them (cf. Goodhart’s law). He may not be consciously thinking this, but his incentives certainly point in the direction of writing an essay like the one he wrote.
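To make the Goodhart point concrete, here is a minimal toy simulation. Everything in it (the latent "founder quality", the noise levels, the funding cutoff) is invented for illustration, not drawn from anything Graham wrote; the one assumption doing the work is that effort spent gaming a test raises the score without raising the underlying quality.

```python
import random

random.seed(0)

def mean_quality_of_funded(gaming, n=10_000, fund_frac=0.1):
    """Toy model: each applicant has a latent quality q; the evaluator sees
    a noisy test score and funds the top fraction by score.  'Gaming' adds
    score that is unrelated to q."""
    applicants = []
    for _ in range(n):
        q = random.gauss(0, 1)               # true quality (unobserved)
        score = q + random.gauss(0, 1)       # honest, noisy test signal
        if gaming:
            score += random.gauss(1.5, 1.0)  # effort spent beating the test
        applicants.append((score, q))
    applicants.sort(reverse=True)            # rank by observed score
    funded = [q for _, q in applicants[: int(n * fund_frac)]]
    return sum(funded) / len(funded)

print("avg true quality of funded, nobody games the test:    %.2f"
      % mean_quality_of_funded(gaming=False))
print("avg true quality of funded, everybody games the test: %.2f"
      % mean_quality_of_funded(gaming=True))
```

The test still ranks people when it is being gamed, but the average true quality of the people it selects goes down, which is the sense in which the metric "works much better" when nobody optimizes for it.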
From my perspective, both you and Graham are failing to address or even mention the root cause of the problem, namely that value differences and asymmetric information imply that a utopia where people don’t give tests and people don’t try to pass or hack tests can’t exist; tests are part of a second-best solution, and therefore so is the desire to pass/hack tests.
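A toy numerical version of the "second-best" point, with made-up numbers: if a funder cannot distinguish founder types at all, it may be unable to fund anyone profitably, whereas even an imperfect and in principle gameable test lets some good founders get funded at all.

```python
# Made-up numbers, chosen only to illustrate the structure of the argument.
P_GOOD = 0.2                       # share of founders whose companies return 3x
RETURN_GOOD, RETURN_BAD = 3.0, 0.4

def expected_multiple(p_good):
    return p_good * RETURN_GOOD + (1 - p_good) * RETURN_BAD

# No test: the VC can't tell types apart and sees only the population average.
print("no test, expected multiple: %.2f -> fund nobody" % expected_multiple(P_GOOD))

# An imperfect (and in principle gameable) test: it passes 90% of good
# founders and 30% of bad ones.  Bayes' rule gives the quality of the
# pool that passes.
pass_good, pass_bad = 0.9, 0.3
p_good_given_pass = (P_GOOD * pass_good) / (
    P_GOOD * pass_good + (1 - P_GOOD) * pass_bad)
print("with the test, expected multiple among passers: %.2f -> fund passers"
      % expected_multiple(p_good_given_pass))
```

Under these assumptions the no-test world funds nobody, so the test, despite being hackable, is part of the best arrangement actually available.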
Your explanation is instead:
There’s a way to succeed (i.e. become a larger share of what exists) through production, and a way to succeed through purely adversarial (i.e. zero-sum) competition. These are incompatible strategies, so that productive people will do poorly in zero-sum systems, and vice versa. The productive strategy really is good, and in production-oriented contexts, a zero-sum attitude really is a disadvantage.
I don’t know how to make sense of this, because it’s really far from anything I’ve learned from economics and game theory. For example what are “zero-sum systems” and “production-oriented contexts”? What is a “zero-sum attitude” and what causes people to have a “zero-sum attitude” even in a “production-oriented context”? Why is “zero-sum attitude” a disadvantage in “production-oriented contexts”?
He then sets up an institution optimizing for “success” directly, rather than specifically for production-based strategies. But in the environment in which he’s operating, adversarial strategies can scale faster.
Given that “zero-sum attitude” is a disadvantage in “production-oriented contexts”, this seems to be saying that Graham failed to set up a “production-oriented context” and instead accidentally set up a “zero-sum system”. Is that right? How is one supposed to go about setting up a “production-oriented context” then? Is it possible to do that without giving tests or using metrics that could be gamed by others?
Y Combinator could very easily (if not for the apparent emotional difficulties mentioned in Black Swan farming) have instead been organized to screen for founders making a credible effort to create a great product, instead of screening for generalized responsiveness to tests.
Do you think other VCs / startup accelerators are doing this (“screen for founders making a credible effort to create a great product, instead of screening for generalized responsiveness to tests”), or doing it to a greater extent than Y Combinator?

No.
I believe Ben is distinguishing between tests-that-are-useful-because-of-what-the-test-administrators-can-give-you, vs honest assessments of the value of something.
For example, as a VC you could genuinely not care about what clothes people wear, or you could loudly announce that you don’t care while actually assessing applicants based on their visual resemblance to the ideal founder in your head. Ability to guess your clothing password is associated with ability to guess passwords in general, which is associated with success in general, so even an arbitrary clothing test is somewhat predictive of success. What it’s not predictive of is the value a product will create or the marginal difference this funding will make, or success in a world that’s based on these things instead of password guessing.
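A small sketch of that correlational structure. The key assumption, made explicit here, is that "password-guessing skill" and "product value" are independent; that independence is the claim being argued for, not something the simulation establishes.

```python
import random

random.seed(1)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

n = 10_000
# General ability to guess what evaluators want ("password guessing").
password_skill = [random.gauss(0, 1) for _ in range(n)]
# Assumed independent of password skill -- this is the substantive assumption.
product_value = [random.gauss(0, 1) for _ in range(n)]
# How closely the founder matches the VC's mental image (the clothing test).
clothing_match = [s + random.gauss(0, 1) for s in password_skill]
# Success in a world that mostly runs on password guessing.
success = [s + random.gauss(0, 0.5) for s in password_skill]

print("corr(clothing test, success):       %.2f" % pearson(clothing_match, success))
print("corr(clothing test, product value): %.2f" % pearson(clothing_match, product_value))
```

The arbitrary clothing test comes out clearly correlated with success and essentially uncorrelated with product value, which is the distinction being drawn.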
An alternative hypothesis is that Graham has a bunch of tests/metrics that he uses to evaluate people on, and those tests/metrics work much better when people do not try to optimize for doing well on them
Isn’t it a bit odd that PG’s secret filters have the exact same output as those of staid, old, non-disruptive industrialists?
I.e., strongly optimizing for passing the tests of <Whitebread Funding Group> doesn’t seem to hurt you on YC’s metrics.
I’m not sure whether that’s true. One way to optimize for the tests of <Whitebread Funding Group> might be to go to a convention where you get a chance to pitch yourself to them in person. There are also other ways you can spend a lot of time getting someone to endorse you to <Whitebread Funding Group>.
Spending time that way means time not spent getting clear about your vision, building the product, or talking to users, and that will count against you when applying to a VC.