Prototyping in C
Lately I’ve been writing some new (“greenfield”) code in C. I feel a bit silly about this: isn’t Rust what you’re supposed to use if you want metal-level performance and are unconstrained by past engineering decisions? Or Go? Something on the JVM? Even C++? Why use a 50-year-old language that is notorious for memory unsafety and nasal demons?
There are a few things about this situation that are a pretty good fit for C:
I’m prototyping in a data-heavy situation, iterating on the right way to solve various problems. Normally I would use Python for prototyping, and I am using it a bunch here. In many cases, however, I need to run my code quickly over a large amount of data and see how it works in practice, and my Python implementations have generally been far too slow.
I’m implementing algorithms that are computationally straightforward: a trie to count unique fixed-length substrings (code), or approximating the previous algorithm with multiple processes writing to a big block of shared memory and accepting collisions (code). If I needed library support or had a large amount of tricky logic I’d use a different tool.
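To give a sense of how little logic the first of these needs, here is a minimal sketch of a trie that counts unique fixed-length substrings. It is not the linked code: the names (TrieNode, trie_insert, count_unique_kmers) are made up for illustration, it assumes the input is a NUL-terminated string, and it skips any window containing a byte outside A, C, G, T.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 4-ary trie node: one child slot per base (A, C, G, T). */
typedef struct TrieNode {
  struct TrieNode *child[4];
} TrieNode;

/* Map a base to 0..3, or -1 for any byte outside [ACGT]. */
static int base_index(char c) {
  switch (c) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
  }
}

/* Insert the k bases starting at s.
 * Returns 1 if this substring was not seen before, 0 if it was already
 * present or contains a non-ACGT byte, and -1 if allocation fails. */
static int trie_insert(TrieNode *root, const char *s, size_t k) {
  /* Validate the whole window first so we never create partial paths. */
  for (size_t i = 0; i < k; i++) {
    if (base_index(s[i]) < 0) return 0;
  }
  TrieNode *node = root;
  int is_new = 0;
  for (size_t i = 0; i < k; i++) {
    int b = base_index(s[i]);
    if (!node->child[b]) {
      node->child[b] = calloc(1, sizeof(TrieNode));
      if (!node->child[b]) return -1;
      is_new = 1; /* a node had to be created, so this path is new */
    }
    node = node->child[b];
  }
  return is_new;
}

/* Free a node and everything below it (recursion depth is at most k). */
static void trie_free(TrieNode *node) {
  if (!node) return;
  for (int i = 0; i < 4; i++) trie_free(node->child[i]);
  free(node);
}

/* Count distinct length-k substrings of seq; returns -1 on allocation failure. */
long count_unique_kmers(const char *seq, size_t k) {
  assert(seq != NULL);
  size_t n = strlen(seq);
  if (k == 0 || n < k) return 0;

  TrieNode *root = calloc(1, sizeof(TrieNode));
  if (!root) return -1;

  long unique = 0;
  for (size_t i = 0; i + k <= n; i++) {
    int r = trie_insert(root, seq + i, k);
    if (r < 0) {
      trie_free(root);
      return -1;
    }
    unique += r;
  }
  trie_free(root);
  return unique;
}
```

For example, count_unique_kmers("ACGTACGT", 4) sees the windows ACGT, CGTA, GTAC, TACG, and ACGT again, so it returns 4. One calloc per node is wasteful at scale; a version meant for large inputs would probably allocate nodes from one big block instead, but that is a choice this sketch doesn’t try to make.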
I’m processing very simple formats: essentially just long strings of [ACGT]*. This keeps the code rule of two compliant. I wouldn’t want to use C for any tricky parsing unless it was sandboxed and I was very confident in the sandboxing setup.
I’m working on this set of problems mostly by myself, so I should choose the tooling where I’ll be able to make progress most quickly. It doesn’t matter much right now if my code is a bit weird. Once I get a better handle on how we’re going to approach this problem computationally it will likely make sense to rewrite in a modern language, which will both be safer and more readable.
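To make the earlier point about simple formats concrete: when the input is just a string over A, C, G, and T, “parsing” can be a single table lookup per byte, which leaves very little room for memory-safety mistakes. The following is a rough sketch under that assumption, not code from the project; BASE_CODE, first_invalid_base, and pack_kmer are hypothetical names, and lowercase bases are treated as invalid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lookup table: 0 means "not a base"; valid bases map to 1..4 so that
 * the zero-initialized default marks every other byte as invalid. */
static const unsigned char BASE_CODE[256] = {
  ['A'] = 1, ['C'] = 2, ['G'] = 3, ['T'] = 4
};

/* Return the index of the first byte outside [ACGT], or len if the
 * whole buffer is valid. Never reads past buf[len - 1]. */
size_t first_invalid_base(const char *buf, size_t len) {
  for (size_t i = 0; i < len; i++) {
    if (BASE_CODE[(unsigned char)buf[i]] == 0) return i;
  }
  return len;
}

/* Pack k bases (1 <= k <= 32) into a 64-bit integer, 2 bits per base,
 * with A=0, C=1, G=2, T=3. Returns 0 on success, -1 on bad input. */
int pack_kmer(const char *s, size_t k, uint64_t *out) {
  if (k == 0 || k > 32) return -1;
  uint64_t v = 0;
  for (size_t i = 0; i < k; i++) {
    unsigned char code = BASE_CODE[(unsigned char)s[i]];
    if (code == 0) return -1;
    v = (v << 2) | (uint64_t)(code - 1);
  }
  *out = v;
  return 0;
}
```

With this encoding, pack_kmer("ACGT", 4, &v) stores 27 (binary 00 01 10 11) in v. The cast to unsigned char before indexing matters: plain char may be signed, and a negative index would read outside the table.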
The biggest risk here is that prototype code will become production code, there will always be something more urgent than a rewrite, and we’ll end up with a chunk of code at the core of our system in an unsafe language that’s poorly documented and confusing to read. This is probably the strongest argument against starting anything in C, even for prototyping. While I can’t completely commit to ensuring this doesn’t happen, I’m going into this with my eyes open. And I will at least commit to seriously documenting anything that’s becoming production code.
This is a bit of an unusual confluence of factors, and if you’d asked me a few years ago if I was ever going to write C professionally again, let alone choose to start something in C, I would have said no. Yet, in this case, I think it’s the right call.
(Large parts of my rhythm stage setup, including both the MIDI routing and whistle-controlled synthesizer, are also in C, though there it’s for minimizing latency instead of maximizing throughput. Since that’s something silly I’m doing for fun I feel less weird about it.)