I work in the software industry but am not a software developer. My job is to write about software development, and I’ve learned a whole bucketload of terms: stuff like ‘linked lists’, ‘CI/CD’, ‘performance optimization’, ‘deploy to AWS’, ‘dockerize’, ‘microservices’, ‘SQL injection’, ‘multithreaded program’, ‘vectorized code’, and on and on and on. However, a lot of the time I’m basically just Chinese-rooming – I can write about these things, but I don’t actually understand how any of them fit together. For example, I’ve had three people try to explain exactly what an API is to me, for more than two hours total, but I just can’t internalize it. I feel that there’s some impossible-to-articulate piece I’m missing, and none of the words people say to me about software stuff stick because I’m lacking a foundation on which to build up my understanding.
So my question is, are there any books (or other resources) that explain the field of software engineering as a cohesive whole? I’m not looking for books that will teach me to code, because I don’t think that’s the thing I want. Feel free to ask clarifying questions. Thanks!
-
EDIT: I realized I should include more context on my work and my background, so here it is:
I have an undergrad degree in physics, which gave me extremely minimal exposure to Python. I also took two quarters of intro CS, one in C and one in Racket. As a result I know how to write a for loop and a bit about very basic algorithms; that’s about it. I’ve been in my current job for nearly a year, and my primary task is to write about the skillsets of individual software engineers. This entails things like connecting someone’s verbal knowledge of back-end web development to their experience creating microservices; I can do this quite competently and don’t make many technical mistakes. I have also learned a bit on the job regarding a couple data structures, some web stuff, and smatterings of info about ML, data science, DevOps, front-end/UI, and mobile development.
It’s hard to find a book that explains software engineering as “a cohesive whole” because software engineering isn’t cohesive. It’s a grab-bag of various fields that have never been well-organized.
I’d sort the words you listed into the following classes:
Data structures and algorithms: linked lists
Libraries: APIs, microservices
Deployment: CI/CD, AWS, dockerize
Security: SQL injection
Optimization: multithreaded program, vectorized code
Of these five categories, the only one that could be called “cohesive” is data structures and algorithms. This is a branch of mathematics. There are dozens of introductory textbooks on the subject. Pick any one you like and skim it. You can skip the example code entirely.
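If a picture of what a linked list actually is would help, here is a minimal sketch in Python. The names (Node, to_list, head) are made up for illustration and do not come from any particular textbook. Each node holds a value and a pointer to the next node, and that is the whole idea.

```python
class Node:
    """One link in the chain: a value plus a reference to the next node."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def to_list(head):
    """Walk the chain from the head and collect the values in order."""
    items = []
    node = head
    while node is not None:
        items.append(node.value)
        node = node.next
    return items


# Build the chain 1 -> 2 -> 3 by nesting nodes.
head = Node(1, Node(2, Node(3)))

# Adding to the front is cheap: make a new node that points at the old head.
head = Node(0, head)

print(to_list(head))
```

The point to take away is the trade-off: putting something at the front takes one step no matter how long the list is, but finding the 500th item means walking past the first 499.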
Of these five categories, the only one that’s truly “universal” to software development is the concept of libraries. A software library is a bit of code someone else wrote. So instead of having to write software to do X you can issue a command to software someone else wrote that does X. Getting a feel for libraries does require writing real code, but you don’t have to write much because libraries are all about leveraging code other people wrote. The best way to learn how libraries work is to copy someone else’s Python or JavaScript script. All those import statements at the top pull in libraries. To learn about APIs or microservices you should write (read: copy) a script that interacts with one.
It’s hard to find a good book on deployment because the whole field was transformed recently with the launch of AWS and its clones Azure and Google Cloud. The field continues to change rapidly. Anything you learn about it will go rapidly out-of-date. If you want your knowledge to endure you should start with the broader history of servers and networking. In particular, you should find a book on how the Internet is architected (and maybe a little something on Unix systems, like Chapter 2 of The Art of Unix Programming). Both would give you a solid idea of how the Internet works without writing any code. This will put the jargon you know into context.
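Going back to libraries and APIs for a moment, here is a small sketch of what "issuing a command to software someone else wrote" looks like. It uses two libraries that ship with Python (json and statistics). The input text and the variable names are invented for the example.

```python
import json        # library for reading and writing JSON text
import statistics  # library for basic statistics

# Pretend this text arrived from some other program.
raw = '{"latencies_ms": [120, 95, 310, 101]}'

# json.loads is part of the json library's API: we hand it text and get
# back a Python dictionary, without knowing how the parsing works inside.
data = json.loads(raw)

# statistics.median is part of the statistics library's API.
median = statistics.median(data["latencies_ms"])

print(median)
```

The API is just the list of things you are allowed to ask the library to do (here, loads and median) and the shape of what you hand over and get back. A web API or a microservice is the same arrangement, except the code you are asking lives on another machine and the request travels over the network.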
Security is the least cohesive of any sub-field of software development. It is the ultimate grab-bag of exploits and counter-defenses. There is no foundation. It’s turtles all the way down. The best way to get a feel for security is to read some blogs like Schneier on Security and Krebs on Security. You can look up individual exploits like SQL injection when you need to know what they are.
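Since SQL injection came up, here is roughly what it looks like, as a sketch using Python's built-in sqlite3 library and a throwaway in-memory database. The table, the names, and the malicious input are all made up for illustration.

```python
import sqlite3

# A throwaway database that lives only in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Text typed in by an attacker instead of a real user name.
malicious = "nobody' OR '1'='1"

# Unsafe: the input is pasted straight into the SQL text, so the
# attacker's quote marks change what the query means.
unsafe_query = f"SELECT secret FROM users WHERE name = '{malicious}'"
leaked = conn.execute(unsafe_query).fetchall()

# Safe: the ? placeholder keeps the input as plain data, never as SQL.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(leaked), len(safe))
```

The unsafe query ends up asking for every row where the name is "nobody" or where 1 equals 1, which is every row. The safe version looks for a user literally named that odd string and finds nobody.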
Optimization is almost as incohesive as security. Basically, multithreading and distributed systems are hard. Optimization is a set of tricks to get around this difficulty—except this time they require a deeper understanding of computer science. In general, optimization is the hardest sub-field to understand without a foundation in computer science. The most important trick is “functional code”, something difficult to understand without writing code yourself. However, many bits such as caching and the application of GPUs can be understood without knowledge of how to code.
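Caching is the easiest of these to see in action. A minimal sketch, assuming a made-up function slow_square that stands in for any expensive computation, and using lru_cache from Python's standard functools library:

```python
from functools import lru_cache

call_count = 0  # how many times the real work actually ran


@lru_cache(maxsize=None)
def slow_square(n):
    """Stand-in for an expensive computation."""
    global call_count
    call_count += 1
    return n * n


# Ask the same question a thousand times.
results = [slow_square(4) for _ in range(1000)]

print(results[0], call_count)
```

The real work runs once. The other 999 calls are answered from the remembered result. That is all a cache is: keep the answer around so you do not have to compute (or fetch) it again.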
Don’t be too intimidated. Half of professional software engineers don’t understand half of these subjects half as well as they’d like.
These bits of humor, philosophy and documentation might help build an intuitive general understanding of software development better than any explicit book on the subject.
Paul Graham’s essays on software development.
Git From the Bottom Up
The Story of Mel
List of Free Programming Books
A minor detour here. I have a sense this is semi-done already with all the shared classes and foundation classes, but has the software industry ever attempted to replicate something along the lines of a Dewey Decimal System for software libraries?
Yes. The shared classes and foundation classes are called “standard libraries”. Collections of non-standard libraries are called “repositories”. Repositories are usually accessed via a “package manager”. Repositories tend to be system-specific or language-specific. Here are some of the more popular repositories.
Python Package Index
Linux: Ubuntu, Arch, Debian
Node.js: npm
Conda combines several of these specific repositories into a mega package manager like you describe.
Another book that might be useful is Peter Seibel’s Coders at Work: Reflections on the Craft of Programming. It is a collection of interviews with prominent software engineers (like Jamie Zawinski, Douglas Crockford, Joe Armstrong, Ken Thompson, etc.) in which they describe how they work and what it feels like (subjectively) for them to write code.
The benefit for practicing software engineers is to read the responses from other programmers in order to gain the perspectives of accomplished programmers on the act of programming. The benefit for you would be to look at how Seibel interviews programmers and how he can get them to speak about their accomplishments without necessarily getting too deep into the details of their work.
You might find Joel Spolsky’s books:
Joel on Software: And on Diverse and Occasionally Related Matters That Will Prove of Interest to Software Developers, Designers, and Managers, and to Those Who, Whether by Good Fortune or Ill Luck, Work with Them in Some Capacity
and
More Joel on Software: Further thoughts on Diverse and Occasionally Related Matters...
to be amusing and helpful. They’re selections from his popular blog. I read them when I was getting started as a software engineer and found them helpful.
(The same Joel Spolsky who, after his blog got popular, went on to co-create StackOverflow.com)
Oh this looks promising, thanks Rogs! Do you have a copy I could borrow?
If I still have it, it’s in storage in Seattle :P
Since nobody else seems to have mentioned it: Code Complete is probably part of the answer you’re looking for, even if it’s several years old by this point—the concepts you’re looking to learn aren’t as fleeting as the technical details that change all the time. (Although, I don’t remember if even the latest edition tackles Agile methodology, so you might need a separate resource for that if it doesn’t.)