Looking for books about software engineering as a field

post by mingyuan · 2020-02-03T21:49:05.926Z · LW · GW · 7 comments

This is a question post.

Contents

  Answers
    29 lsusr
    5 quanticle
    2 ESRogs
    1 digital_carver
7 comments

I work in the software industry but am not a software developer. My job is to write about software development, and I've learned a whole bucketload of terms: stuff like 'linked lists', 'CI/CD', 'performance optimization', 'deploy to AWS', 'dockerize', 'microservices', 'SQL injection', 'multithreaded program', 'vectorized code', and on and on and on. However, a lot of the time I'm basically just Chinese-rooming – I can write about these things, but I don't actually understand how any of them fit together. For example, I've had three people try to explain exactly what an API is to me, for more than two hours total, but I just can't internalize it. I feel that there's some impossible-to-articulate piece I'm missing, and none of the words people say to me about software stuff stick because I'm lacking a foundation on which to build up my understanding.

So my question is, are there any books (or other resources) that explain the field of software engineering as a cohesive whole? I'm not looking for books that will teach me to code, because I don't think that's the thing I want. Feel free to ask clarifying questions. Thanks!

-

EDIT: I realized I should include more context on my work and my background, so here it is:

I have an undergrad degree in physics, which gave me extremely minimal exposure to Python. I also took two quarters of intro CS, one in C and one in Racket. As a result I know how to write a for loop and a bit about very basic algorithms; that's about it. I've been in my current job for nearly a year, and my primary task is to write about the skillsets of individual software engineers. This entails things like connecting someone's verbal knowledge of back-end web development to their experience creating microservices; I can do this quite competently and don't make many technical mistakes. I have also learned a bit on the job regarding a couple data structures, some web stuff, and smatterings of info about ML, data science, DevOps, front-end/UI, and mobile development.

Answers

answer by lsusr · 2020-02-04T03:21:30.208Z · LW(p) · GW(p)

It's hard to find a book that explains software engineering as "a cohesive whole" because software engineering isn't cohesive. It's a grab-bag of various fields that have never been well-organized.

I'd sort the words you listed into the following classes:

  • Data structures and algorithms: linked lists
  • Libraries: APIs, microservices
  • Deployment: CI/CD, AWS, dockerize
  • Security: SQL injection
  • Optimization: multithreaded program, vectorized code

Of these five categories, the only one that could be called "cohesive" is data structures and algorithms. This is a branch of mathematics. There are dozens of introductory textbooks on the subject. Pick any one you like and skim it. You can skip the example code entirely.

Of these five categories, the only one that's truly "universal" to software development is the concept of libraries. A software library is a bit of code someone else wrote. So instead of having to write software to do X you can issue a command to software someone else wrote that does X. Getting a feel for libraries does require writing real code, but you don't have to write much because libraries are all about leveraging code other people wrote. The best way to learn how libraries work is to copy someone else's Python or JavaScript script. All those import statements at the top are libraries. To learn about APIs or microservices you should write (read: copy) someone else's script that interacts with one.
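To make the point about import statements concrete, here is a minimal sketch that uses only libraries shipped with Python. The response times are made-up numbers and the variable names are purely illustrative.

```python
# Each import pulls in a library: code that someone else wrote.
import json
import statistics

# Made-up data for illustration only.
response_times_ms = [120, 95, 130, 110]

# Instead of writing our own averaging code, we issue a command
# to the statistics library, which already knows how to do it.
average = statistics.mean(response_times_ms)

# Likewise, the json library turns Python data into text for us.
report = json.dumps({"average_ms": average})
print(report)
```

Neither averaging nor JSON formatting had to be written by hand; both were delegated to code that already existed.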

It's hard to find a good book on deployment because the whole field was transformed recently with the launch of AWS and its clones Azure and Google Cloud. The field continues to change rapidly. Anything you learn about it will go rapidly out-of-date. If you want your knowledge to endure [LW · GW] you should start with the broader history of servers and networking. In particular, you should find a book on how the Internet is architected (and maybe a little something on Unix Systems, like Chapter 2 of The Art of Unix Programming). Both would give you a solid idea of how the Internet works without writing any code. This will put the jargon you know into context.

Security is the least cohesive of any sub-field of software development. It is the ultimate grab-bag of exploits and counter-defenses. There is no foundation. It's turtles all the way down. The best way to get a feel for security is to read some blogs like Schneier on Security and Krebs on Security. You can look up individual exploits like SQL injection when you need to know what they are.
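As one example of looking up an individual exploit, here is a rough sketch of SQL injection using Python's built-in sqlite3 module. The table, the user names and the malicious input are all invented for illustration; real attacks and defenses vary a lot.

```python
import sqlite3

# A throwaway in-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Text an attacker might type into a "user name" box.
malicious_input = "nobody' OR '1'='1"

# Vulnerable: the input is pasted straight into the SQL text,
# so the attacker's quote marks change the meaning of the query.
unsafe_query = f"SELECT name FROM users WHERE name = '{malicious_input}'"
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Safer: a parameterized query treats the input as plain data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The vulnerable query matches every row in the table, while the parameterized one matches none, because nobody is literally named "nobody' OR '1'='1".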

Optimization is almost as incohesive as security. Basically, multithreading and distributed systems are hard. Optimization is a set of tricks to get around this difficulty—except this time they require a deeper understanding of computer science. In general, optimization is the hardest sub-field to understand without a foundation in computer science. The most important trick is "functional code", something difficult to understand without writing code yourself. However, many bits such as caching and the application of GPUs can be understood without knowledge of how to code.
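Caching, at least, is simple enough to sketch. The example below uses functools.lru_cache from Python's standard library; slow_square and call_count are hypothetical names standing in for any expensive computation.

```python
from functools import lru_cache

call_count = 0  # counts how often the real work is done

@lru_cache(maxsize=None)
def slow_square(n):
    """Stand-in for an expensive computation."""
    global call_count
    call_count += 1
    return n * n

# Ask the same question three times.
results = [slow_square(4), slow_square(4), slow_square(4)]

# The work ran once; the other two answers came from the cache.
print(results, call_count)
```

The trick is just "remember answers you have already computed", which is why it can be understood without much coding background.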

Don't be too intimidated. Half of professional software engineers don't understand half of these subjects half as well as they'd like.


These bits of humor, philosophy and documentation might help build an intuitive general understanding of software development better than any explicit book on the subject.

comment by jmh · 2020-02-05T01:22:56.321Z · LW(p) · GW(p)

A minor detour here. I have a sense this is semi done already with all the shared classes and foundation classes but has the software industry ever attempted to replicate something along the lines of a Dewey Decimal System for the software libraries?

Replies from: lsusr
comment by lsusr · 2020-02-05T03:37:19.608Z · LW(p) · GW(p)

Yes. The shared classes and foundation classes are called "standard libraries". Collections of non-standard libraries are called "repositories". Repositories are usually accessed via a "package manager". Repositories tend to be system-specific or language-specific. Here are some of the more popular repositories.

Conda combines several of these specific repositories into a mega package manager like you describe.

answer by quanticle · 2020-02-05T15:34:48.089Z · LW(p) · GW(p)

Another book that might be useful is Peter Seibel's Coders at Work: Reflections on the Craft of Programming. It is a collection of interviews with prominent software engineers (like Jamie Zawinski, Douglas Crockford, Joe Armstrong, Ken Thompson, etc.) in which they describe how they work and what it feels like (subjectively) for them to write code.

The benefit for practicing software engineers is to read the responses from other programmers in order to gain the perspectives of accomplished programmers on the act of programming. The benefit for you would be to look at how Seibel interviews programmers and how he can get them to speak about their accomplishments without necessarily getting too deep into the details of their work.

answer by ESRogs · 2020-02-05T01:46:47.951Z · LW(p) · GW(p)

You might find Joel Spolsky's books:

Joel on Software: And on Diverse and Occasionally Related Matters That Will Prove of Interest to Software Developers, Designers, and Managers, and to Those Who, Whether by Good Fortune or Ill Luck, Work with Them in Some Capacity

and

More Joel on Software: Further thoughts on Diverse and Occasionally Related Matters...

to be amusing and helpful. They're selections from his popular blog. I read them when I was getting started as a software engineer and found them helpful.

(The same Joel Spolsky who, after his blog got popular, went on to create StackOverflow.com)

comment by mingyuan · 2020-02-05T02:15:21.226Z · LW(p) · GW(p)

Oh this looks promising, thanks Rogs! Do you have a copy I could borrow?

Replies from: ESRogs
comment by ESRogs · 2020-02-05T02:18:54.770Z · LW(p) · GW(p)

If I still have it, it's in storage in Seattle :P

answer by digital_carver · 2020-02-17T12:22:09.452Z · LW(p) · GW(p)

Since nobody else seems to have mentioned it: Code Complete is probably part of the answer you're looking for, even if it's several years old by this point - the concepts you're looking to learn aren't as fleeting as the technical details that change all the time. (Although, I don't remember if even the latest edition tackles Agile methodology, so you might need a separate resource for that if it doesn't.)

7 comments

Comments sorted by top scores.

comment by philh · 2020-02-03T23:14:41.479Z · LW(p) · GW(p)

I have an intuition here that learning to code might be almost necessary for what you want. It's only an intuition, and it's not very strong. You may feel like your current understanding is higher than this intuition would predict, and I wouldn't contradict you. But it seemed worth sharing.

My feeling is that trying to understand these things without knowing how to code would be like trying to understand the classification of finite simple groups without having sat down to play with some examples of groups. One could probably get an intellectual understanding of a group without playing with them, but playing with them will give an intuitive understanding that will be super helpful for understanding a "simple group" on an intellectual level, let alone an intuitive one. And so on.

(The rough hierarchy here, as I see it: a "group" is a collection of objects closed under a binary operation satisfying certain properties (examples include integers with addition, nonzero real numbers with multiplication, states of a Rubik's cube with side-twists). A "finite group" is probably easy enough, a "simple group" is one with no "normal subgroups" except the ones considered trivial. To classify these groups requires us to understand "isomorphisms" between groups: the goal is to take a relatively small collection of groups, and say "any finite simple group will be isomorphic to some group in this collection".)
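For anyone who wants to "play with" a small group, here is a rough sketch that brute-force checks the usual group properties (closure, associativity, identity, inverses) on tiny finite examples. The helper name is_group and the examples are my own choices, not anything standard.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the group axioms on a small finite set."""
    elements = list(elements)
    # Closure: combining any two elements stays inside the set.
    if not all(op(a, b) in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: grouping of operations does not matter.
    if not all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    ):
        return False
    # Identity: some element leaves every other element unchanged.
    identities = [
        e for e in elements
        if all(op(e, a) == a and op(a, e) == a for a in elements)
    ]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every element can be "undone" back to the identity.
    return all(
        any(op(a, b) == e and op(b, a) == e for b in elements)
        for a in elements
    )

add_mod_5 = is_group(range(5), lambda a, b: (a + b) % 5)
mul_mod_5 = is_group(range(5), lambda a, b: (a * b) % 5)
mul_mod_5_nonzero = is_group(range(1, 5), lambda a, b: (a * b) % 5)
print(add_mod_5, mul_mod_5, mul_mod_5_nonzero)
```

Addition mod 5 passes. Multiplication mod 5 fails because 0 has no inverse, and passes again once 0 is removed.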

And so my worry is that the foundation you need will need to be intuitive, and not just intellectual; and the way to get an intuitive understanding would be to work with code. (Which is also more than just writing it. Like half the things you name are related to the problem of "running code on a computer different than the one it was written on".) Not necessarily to a high skill level, but to some extent.

Unfortunately, if this is true, it's not likely an easy road. I think I'd been programming for some time before I felt like I understood what an API was. (Not just programming, but actually using APIs.)

comment by ESRogs · 2020-02-05T02:30:17.117Z · LW(p) · GW(p)

For example, I've had three people try to explain exactly what an API is to me, for more than two hours total, but I just can't internalize it.

Perhaps because they rehearsed their understanding at you rather than being more Socratic?

How would you describe what an API is (given your current level of understanding)?

comment by Viliam · 2020-02-04T22:54:15.318Z · LW(p) · GW(p)

If the books won't satisfy you, you can still ask individual questions here. But, as philh said [LW(p) · GW(p)], there is a chance that fully understanding something requires one to actually use it. Otherwise, your understanding will only be powered by analogies, and it will stop at the moment when there is no convenient analogy for something complex (that is, complex for people who never used it, but kinda intuitive for people who use it regularly), or you may misuse the analogy beyond its purpose. Also, you will be unable to distinguish between correct answers and wrong answers. -- That said, I am curious how far you actually can get like this.

For example, I've had three people try to explain exactly what an API is to me, for more than two hours total, but I just can't internalize it.

An API is a list of functionality you are supposed to use (because the authors guarantee it will keep working tomorrow), as opposed to functionality that is either inaccessible to you (therefore you can't use it directly) or is technically accessible but you shouldn't touch it anyway, because the authors make no guarantee they won't change it tomorrow.

More generally, this idea of distinguishing between what is meant for use by others, and what are the internal details the others should not touch, is called "encapsulation".

Encapsulation can happen on multiple levels. You have small units of code, let's call them classes, which provide some functionality to the outside world, and keep some private details to themselves. Then you can compose a larger unit of code, let's call it a module, out of hundred such classes. Now there is a functionality that this module as a whole exposes to the outside world, and some details it wants to keep private. Then you compose the entire program or library out of dozen modules, and it provides some public functionality. API usually refers to the public functionality on the level of program or library.

Analogy time:

Imagine that I am a robot that offers some useful service. For example, I can remember numbers. The officially recommended way to use me is to come to my desk and say "Remember the number X" (some specific number, such as 42) and I will remember it; later you can come to my desk and say "Tell me the number I told you the last time" and I would tell you the number (42, in this case). The list of commands you should officially use with me is my API.

You can either use my services as a person, or you can send your own robots to interact with me. My behavior is the same in either case.

Now you are a curious person, and you notice that when you tell me a number, I will write it down on a piece of paper. When you ask me later, I will read the number from the paper. This inspires you to make an improvement to your process. You tell your robots that instead of asking me about the number, they should simply look at my desk and read the number on the paper. This is 40% faster, and that makes you happy!

Five weeks later, your factory stops producing stuff. It takes you a few hours to find out why. The robots that occasionally come to use my service stay frozen at my desk and never return. That's because there is no paper on my desk anymore.

You complain to my owner and threaten to sue them. But my owner shows you the original contract, which specifies that you (or your robots) are supposed to ask me about the number; and there is nothing there about a paper on the desk. When you ask me, I give you the right answer. It's because this morning a new memory was installed in me, so that I don't need to write things down anymore. (Which by the way makes me now 80% faster than before.) All other users of my services are happy about this change. Only you are mad, because now you have to reprogram all your robots that interact with me, otherwise your factory remains unproductive. It takes you three days to reprogram your robots, which means a great financial loss for you.

End of analogy.

The lesson is that when the user limits themselves to stuff they are supposed to use, it allows the service provider to make improvements, without breaking things. The only way to allow future improvements is to make some parts of the operation forbidden to use for the customers; otherwise you could not change them, and you can hardly improve things if you are not allowed to change anything.
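The robot story can be sketched in code. Everything here (the class names, remember, recall, _paper, _memory_chip) is invented to mirror the analogy; in Python a leading underscore is only a convention meaning "internal, please don't touch", not an enforced rule.

```python
class NumberRobot:
    """Original robot: writes the number on a piece of paper."""

    def __init__(self):
        self._paper = None  # internal detail, not part of the API

    def remember(self, number):  # part of the API
        self._paper = number

    def recall(self):  # part of the API
        return self._paper


class UpgradedNumberRobot:
    """Upgraded robot: same API, different internals."""

    def __init__(self):
        self._memory_chip = None  # the paper is gone

    def remember(self, number):
        self._memory_chip = number

    def recall(self):
        return self._memory_chip


def polite_client(robot):
    """Uses only the official commands."""
    robot.remember(42)
    return robot.recall()


def nosy_client(robot):
    """Peeks at the paper on the desk instead of asking."""
    robot.remember(42)
    return robot._paper


old_results = (polite_client(NumberRobot()), nosy_client(NumberRobot()))
new_polite = polite_client(UpgradedNumberRobot())

try:
    nosy_client(UpgradedNumberRobot())
    nosy_broke = False
except AttributeError:
    # There is no paper anymore, so the shortcut stops working.
    nosy_broke = True

print(old_results, new_polite, nosy_broke)
```

Both clients work with the original robot. After the upgrade, only the client that stuck to remember and recall keeps working.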

comment by jmh · 2020-02-04T00:35:04.219Z · LW(p) · GW(p)

If your background is not software related, what is it? It might also help if you shared a bit more about just what writing you will be doing.

I tend to think philh is correct, you probably need to have some understanding of coding to be able to understand the higher level aspects and how things might relate. Then again, you might be able to borrow concepts from other areas you do know about that can serve as metaphor for what you're trying to understand in software engineering related subjects you are writing about.

Not sure if you've already looked at things like:

https://en.wikipedia.org/wiki/Software_engineering

Also, you might take a look at the introduction to some CE textbooks, probably free looking at Amazon, and get something of an industry overview (but that seems like such a broad level it might be meaningless for your goal).

Replies from: mingyuan
comment by mingyuan · 2020-02-05T02:14:14.216Z · LW(p) · GW(p)

Fair point, I probably should have said more about my background. Will also add it to the OP.

I have an undergrad degree in physics, which gave me extremely minimal exposure to Python. I also took two quarters of intro CS, one in C and one in Racket. As a result I know how to write a for loop and a bit about very basic algorithms; that's about it. I've been in my current job for nearly a year, and my primary task is to write about the skillsets of individual software engineers. This entails things like connecting someone's verbal knowledge of back-end web development to their experience creating microservices; I can do this quite competently and don't make many technical mistakes. I have also learned a bit on the job regarding a couple data structures, some web stuff, and smatterings of info about ML, data science, DevOps, front-end/UI, and mobile development.

And thanks for the Wikipedia link; I hadn't looked at that yet and might end up pursuing that :)

Replies from: jmh
comment by jmh · 2020-02-05T13:13:41.497Z · LW(p) · GW(p)

Pretty tough job it seems, and I get a better understanding of your needs. I think you covered the need for understanding basic programming. I wonder if maybe looking into areas like system analysis and "system" architecture -- system quoted here because that should cover the server hardware, various software aspects, networking, security and... -- might identify some other books or online resources that software engineering might exclude. Those might help put all the parts and details into that overview you are looking for.

comment by Pattern · 2020-02-04T07:26:06.584Z · LW(p) · GW(p)

For example, I've had three people try to explain exactly what an API is to me, for more than two hours total, but I just can't internalize it.

Many websites 'serve' some 'purpose'. You access that service through a web browser (while you have internet access). But if you wanted a program (or an app) to navigate a website, that might be difficult. APIs are 'Application (App) Programming Interfaces'. If you don't want to teach a program how to navigate a website*, you can instead understand how the website's API (if it has one) works, and then write a program that 'talks to' the website in that language, which is more minimalist. The way you interact with websites, they are very expressive - beautiful, but sometimes slow to load. When a program talks to the website via an API ('instead of' a web browser), it can get a response faster, and in a form it finds easier to read. (Unless your internet is out, or the website is down.) Part of how this works is that a website gives you everything as an image, whereas an API only answers specific questions when your computer asks them.

If you've ever seen a website with a youtube video stuck in the middle of it, then the website was probably made using the youtube API, because that's easier to do than the alternatives, like creating a video website yourself.
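To illustrate the "form it finds easier to read" part, here is a small sketch comparing what a program has to do with a web page versus an API response. Both the HTML page and the JSON reply are made up for this example (no real website is contacted), and real APIs differ in their details.

```python
import json
import re

# Hypothetical page a browser might receive: lots of presentation,
# with the one fact we care about buried inside it.
html_page = """
<html><body>
  <div class="banner">Big sale!</div>
  <h1>Blue Widget</h1>
  <span class="price">$19.99</span>
</body></html>
"""

# Hypothetical reply from the same site's API: just the data.
api_response = '{"name": "Blue Widget", "price_usd": 19.99}'

# Reading the page: the program must know exactly how the page is
# laid out, and breaks if the site is redesigned.
match = re.search(r'<span class="price">\$([\d.]+)</span>', html_page)
scraped_price = float(match.group(1))

# Reading the API reply: the answer is already labeled.
api_price = json.loads(api_response)["price_usd"]

print(scraped_price, api_price)
```

Both routes arrive at the same price, but the API route does not depend on how the page happens to look.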

*This may seem do-able, but if you want to compare prices of products between websites, do

as a cohesive whole

I hope such a guide exists, but those things you mentioned may more easily fit together in smaller categories. 'The whole bucketload of terms you've learned' aren't all things that one person knows and handles by themself in every case.