Solve The Real Problem

Discussions about professional software development with a focus on building real, solid, performing, reliable, monitorable large-scale online services in asynchronous C++. You know, good solid breakfast food.

Monday, August 07, 2006

What goes into a good toolbox?

Let's start with this:

make HOST=i586-mingw32msvc

That's my kind of cross-platform portability. It's got two things going for it:

  1. It's cross-compilation, so you don't need to keep multiple development environments or a single complicated development environment that somehow supports multiple system types.
  2. It implies that your code base "just compiles" and "just works" on different platforms.

My development environment of choice is Linux, which makes sense given my development focus. But nowadays, it's an even more natural choice because you can create cross toolchains for dang-near everything, including Windows. This means I can use emacs, bash, GNU make, perl, and everything in their native environments to target any (C++) environment. There's a real benefit from this in terms of maintaining the toolchain and build system because even though the code you're building could be targeted to run wherever, you know that the build is running on Linux. So perl is there. And the slashes lean forward. And you can unlink files while they're open. And so on. So all the scripty-makefiley-configury stuff you bake into your build environment doesn't have to be cross-platform. Just the code does.

But what about the code?

The code is the trickier part, of course. So, as with any problem of this sort, you encapsulate the concept that varies: in this case, the operating system. You could go grab one of those Portable Runtime environments like NSPR or APR, but they always seem to me to look like some other obscure, deceptively familiar operating system API, and the baggage level tends to be high.

It turns out that every operating system I would ever dream of developing for is close enough to POSIX as to make no odds. Because seriously, you've got POSIX (UNIX including Mac) and then you've got Windows, which seems to be slowly becoming POSIX. So for my money, making things look as POSIX-y as possible is Plenty Good. For Dave and me, this is made easier by the fact that we never intend to use threads, and we fully intend to write our own asynchronous full-performance networking subsystems (for example). Over time, we've refined a suite of libraries full of objects, functions, and other types that wrap up the OS-level APIs, where necessary, as thinly as possible (but not too thinly).

For example, we register a signal handler via the Resource Acquisition Is Initialization idiom to make it simpler and safer to use. That doesn't add any real cost and offers oodles of benefit. An example of wrapping things more would be making a real class to represent IPv4:port address pairs instead of using raw struct ::sockaddr_in types. That's a bit harder to get right, but if you make the right choices about responsibility delineation, you get there over time.
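To make those two wrappers concrete, here is a minimal sketch of what they might look like. It is not our actual library code: the names ScopedSignalHandler and Ipv4Endpoint and their member functions are made up for illustration, and it assumes a POSIX system and a C++11 compiler.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <signal.h>
#include <sys/socket.h>

#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical RAII guard: installs a handler on construction and restores
// whatever disposition was there before on destruction.
class ScopedSignalHandler {
public:
    ScopedSignalHandler(int signo, void (*handler)(int)) : signo_(signo) {
        assert(handler != nullptr);
        struct sigaction sa {};
        sa.sa_handler = handler;
        sigemptyset(&sa.sa_mask);
        if (::sigaction(signo_, &sa, &previous_) != 0)
            throw std::runtime_error("sigaction failed");
    }
    ~ScopedSignalHandler() { ::sigaction(signo_, &previous_, nullptr); }

    ScopedSignalHandler(const ScopedSignalHandler&) = delete;
    ScopedSignalHandler& operator=(const ScopedSignalHandler&) = delete;

private:
    int signo_;
    struct sigaction previous_ {};
};

// Hypothetical value class for an IPv4:port pair. Byte-order conversion and
// the sockaddr cast live here so callers never touch them.
class Ipv4Endpoint {
public:
    Ipv4Endpoint(const std::string& dotted_quad, std::uint16_t port) {
        addr_.sin_family = AF_INET;
        addr_.sin_port = htons(port);
        if (::inet_pton(AF_INET, dotted_quad.c_str(), &addr_.sin_addr) != 1)
            throw std::invalid_argument("not an IPv4 address: " + dotted_quad);
    }

    std::uint16_t port() const { return ntohs(addr_.sin_port); }

    std::string host() const {
        char buf[INET_ADDRSTRLEN];
        const char* s = ::inet_ntop(AF_INET, &addr_.sin_addr, buf, sizeof buf);
        return s ? std::string(s) : std::string();
    }

    // What bind()/connect() want, without the caller writing the cast.
    const sockaddr* raw() const {
        return reinterpret_cast<const sockaddr*>(&addr_);
    }
    socklen_t raw_size() const { return sizeof addr_; }

private:
    sockaddr_in addr_ {};
};
```

The guard is the cheap kind of wrapper: a constructor and a destructor around one system call. The endpoint class is where the responsibility questions start (who validates, who converts byte order, who owns the cast), which is why it takes longer to get right.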

A much larger example would be wrapping up sockets. Ahh, yes. Every third-year C++ student taking a networking course has written a socket class. You can download or buy dozens, if not hundreds or thousands, of such things, some written by professional programmers. And virtually all of them suck, at least in one way that you really care about. It's because they were probably written, like most such software, to solve some problem quickly so that someone can get on to the real work they wanted to do. Then, once they are written, they are put in the "glad I don't have to think about that again" bucket and forgotten. And now, we are at the nub of the toolbox issue.

Building a software toolbox is its own "real work".

By that, I mean that it is a task unto itself to be savoured and enjoyed. We must see that building the software toolbox is valuable (and fun) because if done properly it can achieve that desired end state where we "don't have to think about that again (for a while)". If you're going to write an asynchronous, single-threaded C++ socket library that supports TLS (or other extra transport layers), full-speed write queue management (with memory relinquishment), asynchronous DNS, and a serialization interface, you have to know some of the issues up front. Namely, the networking ones.
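As one illustration of the kind of issue you have to know up front, here is a rough sketch of a write queue that copes with partial writes and gives memory back once it drains. The WriteQueue class, its Sink callback type, and the 4096-byte retention threshold are all assumptions made for this example, not a description of any real library. The sink stands in for a non-blocking write(2) or send(2) so the logic can be exercised without a socket.

```cpp
#include <sys/types.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical write queue for a non-blocking descriptor.
class WriteQueue {
public:
    // Mimics write(2): returns bytes accepted, or -1 with errno set.
    using Sink = std::function<ssize_t(const char*, std::size_t)>;

    void enqueue(const char* data, std::size_t len) { pending_.append(data, len); }

    std::size_t pending() const { return pending_.size() - offset_; }
    std::size_t capacity() const { return pending_.capacity(); }

    // Pushes as much as the sink will take right now. Returns false only on
    // a hard error; "would block" just leaves the remainder queued.
    bool flush(const Sink& sink) {
        while (offset_ < pending_.size()) {
            ssize_t n = sink(pending_.data() + offset_, pending_.size() - offset_);
            if (n > 0) {
                offset_ += static_cast<std::size_t>(n);
            } else if (n == 0) {
                break;                       // sink took nothing; try later
            } else if (errno == EINTR) {
                continue;                    // interrupted; retry immediately
            } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
                break;                       // kernel buffer full; try later
            } else {
                return false;                // real error
            }
        }
        assert(offset_ <= pending_.size());
        if (offset_ == pending_.size()) {
            pending_.clear();
            offset_ = 0;
            // Relinquish memory after a burst instead of holding the
            // high-water mark forever (threshold is arbitrary here).
            if (pending_.capacity() > kRetainBytes) std::string().swap(pending_);
        } else if (offset_ > pending_.size() / 2) {
            pending_.erase(0, offset_);      // compact the sent prefix
            offset_ = 0;
        }
        return true;
    }

private:
    static constexpr std::size_t kRetainBytes = 4096;
    std::string pending_;
    std::size_t offset_ = 0;                 // bytes of pending_ already sent
};
```

In real use the sink would be a lambda around ::write on the socket, and flush would be called whenever the event loop reports the descriptor writable. The swap with an empty string is used rather than shrink_to_fit because the latter is only a non-binding request.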

Nobody starts out wanting all those things. And I wouldn't argue that (without previous serious experience building them) you should aim to end up with them in any short order. I do argue that a properly designed small library will easily allow such features to be added with minimal API disruption and no performance loss, most of the time. It's a simple balance between two software development rules:

  1. You can't know everything up front.
  2. The more you know up front, the better decisions you can make.

These statements are obviously true. If I know all of the issues surrounding plain sockets, TLS, DNS, and write buffering (for example), I can probably make some good design decisions out of the gate and implement in a fairly straight line. But in order to know all of those issues, I've probably had to do it, or something like it, before. So, I should accept that usually, I can't hit all the targets all at once, and I should start out by aiming for something smaller. Most programmers I've worked with reach this conclusion, but I encourage us all to go one step further and embrace the principle of "Solve the whole problem".

To solve the whole problem, we must know the whole problem. So when we make the responsible decision to, say, not try to implement all of our complex socket features at once, and instead just focus on getting "the basics" working, let's make sure we really understand the basics. And let's make sure we really think about what kind of interface we want to offer to our users (the first three of whom will be us, us and us as we implement the other features on our TODO list). In this case, that means cracking the Stevens books and really getting to know all about sockets and POSIX blocking primitives. Then, use that knowledge and understanding to offer a powerful interface that keeps easy things easy and makes hard things less hard (or full-blown easy if you can manage it). Learn it all so your library's users don't have to. Deal with all the nasty issues so your users don't have to. Own the problem--the whole problem--and solve it. That's how you multiply productivity.
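A small taste of "deal with the nasty issues so your users don't have to", sketched around poll(2). The helper name wait_readable and the demo function are made up for this example. The nasty issues it absorbs are that a signal can interrupt the wait (EINTR) and that a bad descriptor shows up as an event rather than as an error return.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <stdexcept>

// Hypothetical helper: returns true if a read on fd would not block within
// timeout_ms, false on timeout. A timeout of -1 waits indefinitely.
// Simplification: after EINTR the full timeout restarts; a production
// version would track a deadline instead.
inline bool wait_readable(int fd, int timeout_ms) {
    assert(timeout_ms >= -1);
    pollfd p {};
    p.fd = fd;
    p.events = POLLIN;
    for (;;) {
        int rc = ::poll(&p, 1, timeout_ms);
        if (rc > 0) {
            if (p.revents & POLLNVAL)
                throw std::invalid_argument("descriptor is not open");
            return true;   // POLLIN, POLLHUP or POLLERR: read() will return
        }
        if (rc == 0) return false;               // timed out
        if (errno != EINTR) throw std::runtime_error("poll failed");
        // EINTR: a signal arrived while waiting; go around again.
    }
}

// Usage sketch: a pipe is unreadable until something is written to it.
inline bool demo_pipe_becomes_readable() {
    int fds[2];
    if (::pipe(fds) != 0) return false;
    bool before = wait_readable(fds[0], 0);      // nothing written yet
    char byte = 'x';
    bool wrote = ::write(fds[1], &byte, 1) == 1;
    bool after = wait_readable(fds[0], 1000);    // data is waiting
    ::close(fds[0]);
    ::close(fds[1]);
    return !before && wrote && after;
}
```

The caller gets one question with a yes-or-no answer, and the retry loop and event-flag decoding stay inside the toolbox where they only have to be gotten right once.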
