The reason to be explicit is to be able to handle control flow.
The datasets aren’t dependent on each other, though some of them use the same input parameters.
If your jobs are independent, then they should be scheduled as such. This allows jobs to run in parallel.
Sure, there’s some benefit to breaking down jobs even further. There’s also overhead to spinning up workers. Each of these functions takes ~30s to run, so it ends up being more efficient to put them in one job instead of multiple.
Your errors would come out just as fast if you ran check_dataset_params() up front.
So then you have to maintain check_dataset_params, which gives you a level of indirection. I don’t think this is likely to be much less error-prone.
The benefit of the pass-through approach is that it uses language-level features to do the validation – you simply check whether the parameters dict has keywords for each argument the function is expecting.
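For illustration, a minimal sketch of the pass-through idea (the function and parameter names here are hypothetical, not from the original code):

    def build_sales_dataset(start_date, end_date, region):
        # ... build and write the dataset ...
        pass

    params = {"start_date": "2021-01-01", "end_date": "2021-02-01"}

    # Unpacking the dict lets the language do the check: because "region"
    # is missing, this call raises TypeError, with no separate validation
    # function to maintain.
    build_sales_dataset(**params)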
A good way to increase feedback rate is to write better tests.
I agree in general, but I don’t think there are particularly good ways to test this without introducing indirection.
Failure in production should be the exception, not the norm.
The failure you’re talking about here is tripping a try clause. I agree that exceptions aren’t the best control flow – I would prefer if the pattern I’m talking about could be implemented with if statements – but it’s not really a major failure, and (unfortunately) a pretty common pattern in Python.
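A sketch of the try-based pattern being described, reusing the same hypothetical names as above – the TypeError from a missing keyword is what ends up serving as control flow:

    import logging

    log = logging.getLogger(__name__)

    def build_sales_dataset(start_date, end_date, region):
        pass

    params = {"start_date": "2021-01-01", "end_date": "2021-02-01"}

    # Exceptions as control flow: a missing keyword surfaces as TypeError
    # and the job skips that dataset instead of failing outright.
    # (Caveat: this also swallows TypeErrors raised inside the function.)
    try:
        build_sales_dataset(**params)
    except TypeError as err:
        log.warning("Skipping sales dataset: %s", err)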
Each of these functions takes ~30s to run, so it ends up being more efficient to put them in one job instead of multiple.
This is a perfect example of the AWS Batch API ‘leaking’ into your code. The whole point of a compute resource pool is that you don’t have to think about how many jobs you create.
It sounds like you’re using the wrong tool for the job, or it’s misconfigured – e.g. limit the Batch job template to 1 vCPU.
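One way to read “limit the Batch template” is to cap the job definition at a single vCPU. A rough boto3 sketch – the job name, image, and memory value are illustrative assumptions, not from the original setup:

    import boto3

    batch = boto3.client("batch")

    # Register a job definition capped at 1 vCPU so small jobs don't
    # reserve more compute than they can actually use.
    batch.register_job_definition(
        jobDefinitionName="make-datasets",          # hypothetical name
        type="container",
        containerProperties={
            "image": "example.com/etl:latest",      # hypothetical image
            "command": ["python", "make_datasets.py"],
            "resourceRequirements": [
                {"type": "VCPU", "value": "1"},
                {"type": "MEMORY", "value": "2048"},
            ],
        },
    )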
The benefit of the pass-through approach is that it uses language-level features to do the validation
You get language-level validation either way. The assert statements are superfluous in that sense. What they do add is in effect check_dataset_params(), whose logic probably doesn’t belong in this file.
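A sketch of the distinction being drawn (parameter names hypothetical): the asserts restate what the call itself would already enforce, and what they add amounts to a check that could live at the call site instead:

    # In-function asserts: redundant with the TypeError the call raises anyway.
    def make_datasets(params):
        assert "start_date" in params
        assert "end_date" in params
        ...

    # The same intent, pulled out to where the job is launched:
    def check_dataset_params(params):
        required = {"start_date", "end_date", "region"}
        missing = required - params.keys()
        if missing:
            raise ValueError(f"Missing dataset parameters: {sorted(missing)}")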
The failure you’re talking about here is tripping a try clause.
No, I meant a developer introducing a runtime bug.
This is a perfect example of the AWS Batch API ‘leaking’ into your code. The whole point of a compute resource pool is that you don’t have to think about how many jobs you create.
This is true. We’re using AWS Batch because it’s the best tool we could find for other jobs that actually do need hundreds/thousands of spot instances, and this particular job goes in the middle of those. If most of our jobs looked like this one, using Batch wouldn’t make sense.
You get language-level validation either way. The assert statements are superfluous in that sense. What they do add is in effect check_dataset_params(), whose logic probably doesn’t belong in this file.
You’re right. In the explicit example, it makes more sense to have that sort of logic at the call site.
The failure you’re talking about here is tripping a try clause. I agree that exceptions aren’t the best control flow – I would prefer if the pattern I’m talking about could be implemented with if statements – but it’s not really a major failure, and (unfortunately) a pretty common pattern in Python.
This has its own problems, but you could use inspect.signature, I think?
I didn’t know about that, thanks!
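A sketch of how inspect.signature could support that up-front, if-based check (function and parameter names are hypothetical):

    import inspect

    def build_sales_dataset(start_date, end_date, region):
        pass

    params = {"start_date": "2021-01-01", "end_date": "2021-02-01"}

    # Compare the function's required parameters against the supplied dict,
    # so missing keys can be handled with a plain if statement rather than
    # a try/except around the call.
    sig = inspect.signature(build_sales_dataset)
    required = {
        name
        for name, p in sig.parameters.items()
        if p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)
    }
    missing = required - params.keys()
    if missing:
        print(f"Skipping sales dataset: missing parameters {sorted(missing)}")
    else:
        build_sales_dataset(**params)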