This post uses “I can identify ways in which chairs are bad” as an example. But it’s easier for me to verify that I can sit in a chair and that it’s comfortable than to make a chair myself. So I don’t really know why this is a good example for “verification is easier than generation”.
More examples:
I can tell that my computer is a good typing machine, but cannot make one myself.
I can tell that a water bottle is watertight, but do not know how to make a water bottle.
I can tell that my pepper grinder grinds pepper, but do not know how to make a pepper grinder.
If the goal of this post is to discuss the crux https://www.lesswrong.com/posts/fYf9JAwa6BYMt8GBj/link-a-minimal-viable-product-for-alignment?commentId=mPgnTZYSRNJDwmr64:
“evaluation isn’t easier than generation, and that claim is true regardless of how good you are at evaluation until you get basically perfect at it”
then I think there is a large disconnect between the post above, which posits that for this claim to be false there has to be some “deep” sense in which delegation is viable, and the more mundane sense in which I think this crux is obviously false: all humans interface with the world and optimize over the products other people create, and are therefore more capable than they would have been if they had to make all products for themselves from scratch.
I assumed John was pointing at verifying whether, say, the chemicals used in the production of the chair might have some really bad impact on the environment, start causing a problem with the food chain ecosystem, and make food much scarcer for everyone (including the person who bought the chair) in the meaningfully near future. Something along those lines.
As you note, verifying that the chair functions as you want, as a place to sit that is comfortable, is pretty easy. Most of us probably do that without even really thinking about it. But whether this chair will “kill me” in the future is not so obvious or easy to assess.
I suspect that, at its core, this is a question about the difference between evaluating in a simple, non-complex world and evaluating in an inherently complex world that doesn’t allow true separability into simple and independent structures.
What I had in mind is more like: many times over the years I’ve been sitting at a desk and noticed my neck getting sore. Then when I move around a bit, I realize that the chair/desk/screen are positioned such that my neck is at an awkward angle when looking at the screen, which makes my neck sore when I hold that angle for a long time. The mispositioning isn’t very salient; I just reflexively adjust my neck to look at the screen and don’t notice that it’s at an awkward angle. Then later my neck hurts, and it’s nonobvious and takes some examination to figure out why my neck hurts.
That sort of thing, I claim, generalizes to most “ergonomics”. Chairs, keyboards, desks, mice… these are all often awkward in ways which make us uncomfortable when using them for a long time. But the awkwardness isn’t very salient or obvious (for most people), because we just automatically adjust position to handle it, and the discomfort only comes much later from holding that awkward position for a long time.
I agree ergonomics can be hard to verify. But some ergonomics are easy to verify, and chairs conform to those ergonomics (e.g. having a backrest is good, not having sharp stabby parts is good, etc.).
I mean, sure, for any given X there will be some desirable properties of X which are easy to verify, and it’s usually pretty easy to outsource the creation of an X which satisfies the easy-to-verify properties. The problem is that the easy-to-verify properties do not typically include all the properties which are important to us. Ergonomics is a very typical example.
Extending to AI: sure, there will be some desirable properties of AI which are easy to verify, or properties of alignment research which are easy to verify, or properties of plans which are easy to verify, etc. And it will be easy to outsource the creation of AI/research/plans which satisfy those easy-to-verify properties. Alas, the easy-to-verify properties do not include all the properties which are important to us, or even all the properties needed to not die.
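As a toy sketch of the point above (not anything from the original post), one can model a chair as having two easy-to-verify properties and one hard-to-verify property, then accept whatever passes the easy checks. The `Chair` class, its fields, and the candidate list are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chair:
    """Hypothetical chair model; every field here is an illustrative assumption."""
    name: str
    has_backrest: bool              # easy to verify at a glance
    has_sharp_parts: bool           # easy to verify at a glance
    strains_neck_over_hours: bool   # hard to verify: only shows up after long use


def passes_easy_checks(chair: Chair) -> bool:
    # What a buyer can check quickly in the showroom.
    return chair.has_backrest and not chair.has_sharp_parts


def is_actually_good(chair: Chair) -> bool:
    # What the buyer actually cares about, including the hidden property.
    return passes_easy_checks(chair) and not chair.strains_neck_over_hours


CANDIDATES = [
    Chair("stool with spike", False, True, False),
    Chair("showroom chair", True, False, True),
    Chair("well-fitted chair", True, False, False),
]

# Outsourcing with only the easy checks as a filter.
accepted = [c for c in CANDIDATES if passes_easy_checks(c)]
# Accepted chairs that still fail on the property nobody checked.
hidden_failures = [c for c in accepted if not is_actually_good(c)]

print([c.name for c in accepted])
print([c.name for c in hidden_failures])
```

The filter does real work (it rejects the stool), but it cannot distinguish the two remaining chairs, which is the gap between the easy-to-verify properties and the properties that matter.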
I think there are some easy-to-verify properties that would make us more likely to die if they were hard to verify. I therefore think “verification is easier than generation” is an important part of the overall landscape of AI risk.
That is certainly a more directly related, non-obvious aspect of verification. Thanks.
I agree that there are some properties of objects that are hard to verify. But that doesn’t mean generation is as hard as verification in general. The central property of a chair (that you can sit on it) is easy to verify.
This feels more like an argument that Wentworth’s model is low-resolution than that he’s actually misidentified where the disagreement is?