I think this “use computers to find a cure for cancer” example is misleading. The issue is confusion between optimization and hypothesis generation.
The “cure for cancer” kind of problems are not of the “we have a bunch of hypotheses, which is the best one?” kind. They are of the “we have no good hypotheses and we need to generate/invent/create some” kind. And optimizers, regardless of how powerful they are, are useless for that.
And optimizers, regardless of how powerful they are, are useless for that.
The (powerful) optimizer needs to have a model of how its optimizations impact that which is to be optimized. A model it adapts. Hypotheses, in other words. “Maximize my life span” would need to deal with cancer.
The (powerful) optimizer needs to have a model of how its optimizations impact that which is to be optimized.
Yes, that’s typically called a “fitness function” or a “loss function”, depending on the sign.
But the problem is defining the set out of which you pick your “optimizations” to be evaluated. Make it too narrow and your optimum will be outside of it; make it too wide and you’ll never find the optimum.
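That trade-off is easy to make concrete. A minimal sketch (the loss function and candidate sets here are invented for illustration): the same trivial optimizer succeeds or fails purely depending on how the candidate set is defined.

```python
# Target: minimize the loss f(x) = (x - 5)**2, whose optimum is at x = 5.

def loss(x):
    return (x - 5) ** 2

def best_candidate(candidates):
    # A "powerful optimizer" reduced to its essence: evaluate and pick the best.
    return min(candidates, key=loss)

narrow = [0, 1, 2, 3]                        # too narrow: the optimum lies outside the set
wide = [x / 10 for x in range(-1000, 1000)]  # contains the optimum, but 2000 evaluations

print(best_candidate(narrow))  # 3 -- the best available, still loss = 4
print(best_candidate(wide))    # 5.0 -- found, at the cost of a much larger search
```

The narrow set converges instantly to the wrong answer; the wide set finds the true optimum but only by brute-forcing a set 500 times larger.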
They are of the “we have no good hypotheses and we need to generate/invent/create some” kind
Not really. Firstly, we do have chemistry figured out very well; it’s just that cells are complicated and it is very difficult to find the consequences of our interventions, so we tend to throw fairly bad ideas at the wall and see what sticks. Secondly, generating most plausible hypotheses that fit the data is also an optimization problem.
And thirdly, observe that evolution, a rather messy optimization process, did decrease the per-cell cancer rate of a whale to an utterly minuscule fraction of that of a human, and that of a human to a fairly small fraction of that of a dog. (With advanced optimization, a whale’s cellular biochemistry may be of use too; a whale has far more cells.)
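The second point can be sketched in a few lines. This is a toy illustration (the data and the hypothesis family are invented): “find the most plausible hypothesis that fits the data” becomes “minimize an error score over a hypothesis space”.

```python
# Data secretly follows y = 3x + 1; the hypothesis space is a grid of
# linear models y = a*x + b, and "plausibility" is (negative) squared error.

data = [(x, 3 * x + 1) for x in range(10)]

def squared_error(a, b):
    return sum((y - (a * x + b)) ** 2 for x, y in data)

# Exhaustive search over the hypothesis space -- a crude but real optimizer.
best = min(
    ((a, b) for a in range(-5, 6) for b in range(-5, 6)),
    key=lambda h: squared_error(*h),
)
print(best)  # (3, 1): the hypothesis that best fits the data
```

Real hypothesis spaces are vastly larger, but the framing is the same: hypothesis selection is argmin over a candidate set.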
I don’t think that the statement
it is very difficult to find the consequences of our interventions
is consistent with
we do have chemistry figured out very well
Otherwise,
generating most plausible hypotheses that fit the data is also an optimization problem
is not true. In your example of evolution it’s sexual reproduction and mutation that “generate hypotheses”; neither is an optimizer.
Yes, I understand that you can treat hypothesis generation as a traversal of hypothesis space and so a search and so an optimization, but that doesn’t seem to be a helpful approach in this instance.
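The distinction being drawn here can be made explicit in code. A minimal evolutionary loop (all parameters here are arbitrary choices for illustration) keeps the two roles separate: mutation *generates* candidates with no knowledge of the target, while selection merely *ranks* them against a fitness function.

```python
import random

random.seed(0)
TARGET = [1] * 20  # the "environment" the population is adapting to

def fitness(genome):
    return sum(g == t for g, t in zip(genome, TARGET))

def mutate(genome, rate=0.05):
    # Generation step: blind variation, no knowledge of the target.
    return [1 - g if random.random() < rate else g for g in genome]

population = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
for _ in range(200):
    # Selection step: keep the best half, refill by mutating survivors.
    population.sort(key=fitness, reverse=True)
    survivors = population[:15]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(15)]

print(fitness(population[0]))  # best fitness so far; monotone under this elitist selection
```

Neither piece does much alone: mutation without selection is a random walk, and selection without mutation can only reshuffle the initial candidates. Whether one calls the combined loop “an optimizer” or reserves that word for the selection half is exactly the terminological question in this thread.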
We have chemistry figured out; we don’t have “making truly enormous computers to compute enough of that chemistry fast enough” figured out, or “computing chemistry a lot more efficiently” figured out. Does that make it clearer?
I am not entirely clear on how you imagine hypothesis generation happening on a computer, other than by either trying things to see what sticks, or analytically finding the best hypothesis by working backwards from the data.
Your position is clear, it’s just that I don’t agree with it. I don’t think that human biochemistry has been figured out (e.g. consider protein structure). I also think that modeling the human body at the chemistry level is not a problem of insufficient computing power. It’s a problem of insufficient knowledge.
Non-trivial hypothesis generation is very hard to do via software, which is one of the reasons why IBM’s Watson hasn’t produced a cure for cancer already. Humans are still useful in some roles :-/
The structure of a protein is determined by the known laws of physics, the other compounds in the solution, and the protein’s formula (which is a trivial translation of the genetic code for that protein). But it is very computationally expensive to simulate for a large, complicated protein. Watson is a very narrow machine that pretends to answer by drawing on a large database of answers. AFAIK it can’t even produce trivial new answers. (What is the velocity of a rock that fell from a height of 131.5 meters? Wolfram Alpha can answer this, but it is just triggered by the keywords ‘fell’ and ‘height’.)
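For what it’s worth, the rock question is exactly the kind of “trivial new answer” that falls out of first principles in a few lines (assuming free fall from rest, no air resistance, and g = 9.81 m/s²):

```python
import math

# A rock dropped from rest at height h hits the ground at v = sqrt(2 * g * h),
# from equating potential energy m*g*h with kinetic energy (1/2)*m*v**2.

g = 9.81   # m/s^2, standard gravity
h = 131.5  # m, the drop height from the example
v = math.sqrt(2 * g * h)
print(round(v, 1))  # ~50.8 m/s
```

No keyword triggers involved, just the model itself.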