Real Progress
I've got nothing signficant to say at the moment, except that I happened across The Secret of Success: Suck Less, and yes, I think he's on to something. I've had similar experiences using software that used to be great but then just accrues a pile of new useless-to-me features that I never intend to use until it gets to that point where it just sucks. I think this happens a lot with the "shiny thing" method of product development. Too often, people can be easily distracted by something shiny and new. If those people are driving a software product forward, that can lead to a loss of focus on the core goodness that makes people want it. It's especially bad when shiny features distract effort away from quality improvements (as Maz says in the referenced post). Of course, some whizbang features are great additions to the software . . . especially when your users have told you it would suck a lot less if you'd implement them. :) Oh, and while you're there, give Simplicity and Security a read, too.
Embarrassing Code
I've started reading Beautiful Code and the first three chapters are just as I've expected. Verterans talking about some specific problems more to give you insight in to how they think than to help you fully understand the problem. They've been good.
Now, however, I've read the fourth chapter, by Tim Bray, and I'm embarrassed for him. The article reads as biased Ruby evangelism with religious reasons for why Ruby is great, in stark contrast to the previous chapters that focused on making good design and implementation choices. I don't dislike or know Tim, but his chapter in this book actually bothered me enough to write about it.
Tim's style of prose is a little too informal and the chapter feels like it is targeted at beginner programers. However, it's the content that really bothered me. For example, he states the following:
This example (and most of the other examples in this chapter) is in the Ruby programming language because I believe it to be, while far from perfect, the most readable of languages.
If you don't know Ruby, learning it will probably make you a better programmer. In Chapter 29, the creator of Ruby, Yukihiro Matsumoto (generally known as "Matz"), discusses some of the design choices that have attracted me and so many other programmers to the language.
He then follows that with his first example program, which he elaborates on later to explore the problem domain he has chosen: showing the ten most popular articles on his personal blog (which is of course referred to with its URL). The program he shows is:
1 ARGF.each_line do |line|
2 if line =~ %r{GET /ongoing/When/\d\d\dx/\d\d\d\d/\d\d/\d\d/[^ .]+ }
3 puts line
4 end
5 end
Tim puts himself in religous territory right away by declaring his belief that Ruby is not just a beautiful langauge or among beautiful languages, but is actually the most beautiful language. If this was an isolated statement, it would just be a poor choice of words, but much of the article is rife with Ruby praise. And, where Ruby lacks (such as the need for the Pascal-like verbose block terminator "end"), he acknowledges this, but in a dismissive way that makes you feel like he's waving his hands in front of the warts. I found this especially distracting since he had to hand-wave away two of the five lines of his first program example.
The start of the second paragraph in his program's preamble almost offended me. It seems very presumptuous for Tim to declare that I'm not as good a programmer as I could (should?) be because I don't know Ruby. The sentiment is compounded by the next sentence which smacks of "all the other kids are doing it". I'm prepared to be surprised and disturbed by facts and even anecdotes in a book like this, but not by judgements and peer pressure.
So, after presenting the reader with this five-line program in "the most readable of languages", Tim then takes an entire page to describe what it does, line by line. Clearly, he needs to explain some of the unique, custom syntax (both of those adjectives tend to fall outside of my personal beauty bucket, by the way) used by the program. I'm not sure why he's needed this line by line explanation if the language and program are so readable and beautiful. This isn't something any of the previous chapters' authors have felt the need to do, and one was performing an incremental complexity analysis of Quicksort. Ironically, the line numbers in the example are apparently added by Tim to aid readability.
Much of the praise in this early section of the chapter is devoted to regular expressions, and that is justified. There seems to be an implication that that praise is somehow attributable to Ruby, but this is probably more the fault of the mood set by Tim than it is that of the actual relevant text.
Notice also that this program is equivalent to:
egrep "GET /ongoing/When/\d\d\dx/\d\d\d\d/\d\d/\d\d/[^ .]+ "
Given the application domain of reporting on log files, grep seems a more suitable solution so far. Tim expands the example after an overly detailed and unnecessarily exotic explanation of associative data structures (which he refers to as "Content-Addressable Storage") to instead count each article reference. I think it's important to see his expanded example for context.
counts = {}
counts.default = 0
ARGF.each_line do |line|
if line =~ %r{GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+) }
counts[$1] += 1
end
end
keys_by_count = count.keys.sort { |a, b| counts[b] <=> counts[a] }
keys_by_count[0 .. 9].each do |key|
puts "#{count[key]}: #{key}"
end
From what I can tell, it becomes equivalent to:
egrep -o "/ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+) " \
|sort |uniq -c |sort -r -n |head -10
Approaches similar to the command line above seems acceptable to Tim because later when discussing more complex problems he talks about using multiple programs that produce intermediate files (although not in a pipeline) and do the processing as a series of separate Ruby programs.
Doing the same took another six lines of code in Ruby, and all of the useful syntax appears to be directly borrowed from Perl. In uncompressed first-draft Perl, you can write that as:
while(<>) {
++$counts{$1} if m!GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+) !;
}
@sorted = sort { $b->[1] <=> $a->[1] } map { [$_, $counts{$_}] } keys %counts;
for $i (0..9) {
print "$sorted[$i]->[0] $sorted[$i]->[1]\n";
}
For my logs on my system (with a different regex, of course), the command line runs in ~2.4 seconds and the perl runs in about ~0.76 seconds. Tim's Ruby version processes my logs in ~2.0 seconds. My Perl took me longer to write than the command line, of course, but it didn't take more than a couple of minutes.
Tim says he wondered if his Ruby program was slow, so he wrote a Perl version to compare it to. But he doesn't include his Perl version in the article, so he doesn't draw any comparisons between the two implementations. Given his assertions about how much more readable Ruby is, I would like to have seen Tim do a direct comparison, especially since an alternative implementation was prepared for performance comparisons anyway. Its omission leaves his claims looking unfounded and tenuous.
The chapter does raise good points about when to spend time optimizing, and there are a few paragraphs that explicitly credit the "scanning lines of text with regexes" approach to awk, but they are in a different typeface and I wondered if they'd been inserted by the editors (but they aren't; they're written by Tim himself as a sidebar). The discussion of binary search and its Java implementation are just textbook cases with a small amount of advice, but nothing you wouldn't guess or know already. The chapter ends with an out-of-place discussion about the large Internet search engines that has nothing to do with design or code that I could see.
The point is that there is no great improvement in programmer efficiency, performance or readibility demonstrated here with the use of Ruby, so why does Tim focus on the language instead of the application domain and how its problems can be solved elegantly? Why did he chose such a simplistic problem that any experienced programmer can solve with a one-liner UNIX command or a few lines of a common log-processing language (Perl)? I don't like being fed an evangelist message that comes without respect or substance, but that's what much of this chapter feels like. Given Tim Bray's experience and expertise in information systems, I was looking forward to something really interesting and worth exploring.
On Simplicity
I had a discussion today with someone about software, and when I was trying to describe some of the key qualities I seek when building systems, I listed off the usual set of descriptors: scalable, fault-tolerant, simple, monitorable, maintainable, and so on. What surprised me was that the other person picked right up on the word "simple". As soon as I said "simple", they understood exactly what I was getting at, and they could see the kinds of benefits that simplicity could have. The surprising parts were that they aren't a software expert by any means, and I almost didn't even include "simple" in the list.
Why was it an afterthought? Because, for me, it has become so obvious, so necessary that it almost literally goes without saying. Which is funny, because if I had to chose one word, and only one word, to describe what I aim for, "simple" would be it. I'm reminded of a quote I read in the official Bell Labs history:
"Cognitive engineering" is what [Joe] Condon called it, "...that the black box should be simple enough such that when you form the model of what's going on in the black box, that's in fact what is going on in the black box."
This is an interesting way to think about design. Through appropriate uses of abstraction, even well-designed complex systems can be see as a collection of simple "black boxes" that, at a high level, just do what you think they do. Their complexity comes out of the number of abstraction layers and the number of abstractions that interact throughout those layers, and not from within any of the individual abstractions. Any individual black box has a clear, simple function. It might need several inner black boxes in order to perform that function, but those again, are clear and simple. This self-similarity forms a very organized chaos, which, if you tried to view it all at once with all the walls of the boxes taken away, would be an incomprehensible mess. It is the power of abstraction, the power of the black box in turn enabled by the power of the well-defined interface that turns complexity into simplicity.
So, as a systems designers and builder, I will frequently be faced with a complex problem that needs solved. By identifying the individual "real problems" within the overall problem space, I am applying the abstraction tools to impose an order, piece by piece. Whether I do this top-down, bottom-up, or any which way, the point is that I start building the walls of the black boxes and start defining the interfaces between them at a variety of levels. I try to use as many existing boxes as I can (sometimes with renovations), and, in fact, my experience is such that this process itself exposes more opportunities to use the same kinds of boxes in different problem areas. If someone can look at the end result and see simplicity, then I have been successful in designing a flexible, trustable solution. That's when we know we've done well.
Responsibility and Optimization
I was reviewing an external large-scale system architecture the other day, and suffice it to say the documents made a point of avoiding referential integrity (i.e. "no foreign keys"), stored procedures, and so on, in their databases, all in the name of scale and performance. Now, from what my database-expert friends tell me, sometimes you can drop certain key constraints in production once you're confident you don't have many (any?) bugs in that area, and you can squeeze some more performance out that way, but that was not looking like the case here.
In fact, it sounds like the same misguided approach to scale I've seen even on occasion in large-scale organizations like the mothership. To me, being the C++ programmer I am, it's analogous to saying you're going to use C-strings instead of std::string because there's less overhead. And then you miss the point entirely of high-level optimizations. And, while clearly you need fast basics, we all know that high-level optimizations are where most of the win is. Lose one high-level op and you save billions of operations. Love one memcpy and you save hundreds, or even thousands?
Then take the fact that higher-level constructs (created by the imposition of the same constraints that cause things to be "slower") give you a better optimization framework. Sticking with the string example, just think about how much time and memory you can save if you take advantage of std::string's (usual) copy-on-write semantic. Try doing that while still letting every libc string function every
assembly-level C string "optimization" be legal and safe. You can't because you end up with the essentially same constraints imposed on you as are on the higher-level std::string construct . . . for the same reasons. You need rules so that you know what can't happen and thus you can make better assumptions about what can and thus better optimizations.
Constraints and interface limitations yield implementation freedom within that black box which equals optimization freedom. Without clean interfaces and clear lines of responsibility, you can't have a practically optimizable system because you don't get to change your mind when your requirements change or you learn new things: you're
stuck with your initial optimization "guess" 'cause it's baked in to your design by way of your lack of good interfaces. And then it just boils down to premature optimization, and we all know what that means.
On the flip side, constraints and interface limitations split the system into human-manageable pieces, which is incidentally one of the original goals of C++ (see A History of C++: 1979 - 1991). This simplifies the system from the designers' perspectives and often informs us of (very-)high-level optimizations that can be made through this clarity of understanding. Plus, we then take a system with clean interfaces and divide the work up amongst domain experts who further analyze and optimize within a clear area of responsibility. This saves us runtime effort and improves the efficiency of the production pipeline by parallelizing the development effort. Of course, one of the keys to doing this right is to avoid producing six thousand pages of useless documentation in the process, but to instead have tools that let you write your interfaces in human-readable form first, but can then still serve as the canonical form for actually implementing the interface. After all, if you must live without all but one kind of documentation, it is interface documentation that you keep. But try to live better than that!
For me, it is always structure and organization first towards a correct and maintainable system, and then optimization within that space. You break the rules only when you prove to yourself that you must.
A Little Deconstructive Criticism
So Tim sends me email yesterday:
Ok, I'm not trolling here, I really want to learn something.
Came across
http://scienceblogs.com/goodmath/2006/11/the_c_is_efficient_language_fa.php
which states among other things that C/C++ are no good for numerical
computing.
The quote that got me wondering was "In fact, the fundamental design
of them makes it pretty much impossible to make really good, efficient
code in C/C++".
He quotes an experiment he did where his OCaml implementation of some
algorithm beat his C implementation and his C++ implementation of the
same language. (Aside, the Java and Python implementations were
crushed by these first three).
Thoughts?
Clearly that's gibberish because C is basically a macro assembler and can do anything. This reminds me of when Keith Lea that said Java was faster than C++ . . . because he wrote all of his C++ naively with the most obvious approach instead of the most efficient. He was open to the having an email discussion, however, but IIRC he seemed determined that I was unfairly making the C++ faster or equal because I was a good programmer.
In this most recent example, Mark makes a similar error in his analysis. His sample code loop exploits limitations of the particular optimizer he was using and not the language. It sounds like this OCaml is just an array and matrix aware language that has high-level constructs for such. And obviously, OCaml is implemented in C. So clearly C, given the right optimizer (in this case OCaml), performs on an equal footing.
Nobody in their right mind writes C++ code with raw arrays as primitives. And this leads to The Eternal Flaw in language comparisons as I always see them. People are always saying, "Language X is better than Language Y because of [insert library feature here]." Just because a language's standard distribution does not come with a
library piece you need to do your thing well does not mean that language is no good. Get the right tools for the job, people. Some specialized domains can benefit from the expressive power of a specially-tuned language. Some languages (like C++) have the ability to be effectively extended by letting libraries overload their native notational forms and thus offer what looks like a specialized language
but actually isn't (Boost.Spirit being the most extreme example I know of).
As an aside, most of the problems I have with Java are to do with the fact that it is not as extensible, and, more importantly, tends to force a paradigm (classic OO, typically) instead of offer tools. To me, Java is a framework and C++ is a toolbox.
I was disappointed to see that the poster of the comment Tim refered to lists himself as a Computer Scientist, but doesn't seem to have applied the scientific method. It certainly looks like he's saying "I did one experiment on one case and it proves C sucks." I would say the weight of his argument is outweighed by the strength of his evidence (i.e. it has none).
Multicore, Singlethreaded, Megascale
The talk recently is all about how we're getting really close to the clock speed "barrier" and the best way to scale up our software is to go multicore. This is really scary, but not because we're nearing engineering limits, and not because we're going multicore, and not because we're going to have to write software a different way. It's scary because everyone is talking about this as if it is something truly new and we as a community have no tools to address it. This set me to thinking . . .
But first, a word about scale
At the mothership, we've had the problem of having to do way more than any single computer could do for ages. To support truly tens-of-millions of online users at once requires more than just a tricked-out computer. You start taking problems and breaking them down into pieces based not only on functionality, but also based on how they scale. And different pieces of a system scale in different ways. Some pieces are true workhorses, and you can farm them off in the ideal manner, as completely independent units each of which is equally suited to being used for some processing. Other pieces need to work together in huge dynamically-hashed clusters that can each handle a certain partition of the problem space. Others can be organized around large-scale redundant databases of various forms. And still other pieces scale in other ways.
These forms of scalability are another dimension of building block that you need when building large-scale high-performance systems. You don't just decide that you need a video codec, a Session class and a user database. You decide that you need a farmed array of video codecs controlled by a farmed array of session-processors which interfaces with a dynamically-hashed cache cluster in front of a redundant database cluster or partition set. The task of design goes beyond just picking the algorithms or objects and includes consideration of how each needs to scale and how you can address those scale problems over the lifetime of the running system.
Systems do not grow linearly
Because scale needs change, you need to build so that you can add hardware and/or software (and thus capital and operating cost) to beef up one part of your system without having to force everything else to scale together. With real systems, you find that one part (user login, perhaps) takes a bigger hit than others. You want to build software that lets you add only the hardware (and software) you need. Sticking with the example, if your login piece takes 10 or 100 times the hits that each other individual pieces do, why would you design something that requires you to scale all the other pieces up 10 or 100 times just to meet the login load? You'll end up running way too much hardware and software, and that's just a waste of money and human time.
Same Problem, Smaller Datacenter
So back to the talk about the multicore "paradigm shift". I agree that it is an interesting research topic to investigate how to optimally take advantage of massively parallel hardware with a shared memory and hundreds or thousands of processors. But news reports all carry the same message about programming such a machine: "No one is ready for it." And by no one, computing science researchers are including themselves.
So does this mean that for the next ten or so years software speeds must freeze while this research happens and the best and brightest minds apply themselves to this newly-important research area? Definitely not. Keep in mind the lessons of scale from those who have already had to go beyond this single computer (let alone single CPU) to solve their problems. Sun's mantra of "the network is the computer" is how this kind of stuff is done. You create many semi-independent pieces of software that run together on a network to create a single large processing entity, or, really, "a computer". This meta-computer is the one you use to get the real work done. In the same way that you don't write a program using only libc, you don't design a massive system using only a normal computer. The components of this large system are the pieces of software running all over the network, and together they are your meta-program.
Through much experience and much thought, my colleagues and I have come to the simple conclusion that virtually all the programs the average person needs to write need only be single-threaded to be simple, scalable, correct, understandable, high-performing, highly-available, and maintainable1. So, we write all2 of our software as single threaded applications. If we need to take advantage of multiple CPUs, we run multiple instances. Sometimes they are on the same multi-CPU box, and sometimes they aren't. Since we're dealing with a meta-computer anyway, who cares?3
1 I'll save elaboration on this huge topic for future blog entries. This isn't really the focus of this posting, so if you must, pretend I said "multi-threaded" and step away from the guillotine. In the mean time, find me a real, large scale piece of multi-threaded software or system that has no concurrency issues, even under extreme, crippling load. :)
2 Yes, I said "all". Including the ones that do "the math".
3 And, since we're writing proper single-threaded applications, most of them use little CPU under heavy load anyway, but again, I digress.
So, if we need to multicore because we're stuck at a certain clock speed, let's not loose sight of the fact that the certain clock speed we're stuck at is around 4 GHz! We can write very coarse-grained large-problem software that doesn't even come close to chewing up that kind of power. Surely we're not going to go to a hundred or a thousand 33 MHz processors in these new massively parallel machines? And even if we do drop the clock speed by a factor of 2 or 5 or 10, we're still talking about pretty powerful CPUs. So the only real difference between this parallel machine and today's meta-computers is that the former share a bunch of memory chips. My experience teaches me that just because you have a shared memory doesn't mean you need to use it all the time. It's a classic case of premature optimization being the root of all evil. Sure, we might be able to do better once we know more about the research area. But we don't know how much better, and we shouldn't exclude alternative approaches to proposed massively parallel programming languages that even the experts can't fully envision today if for no other reason than that we shouldn't let perfect stand in the way of very good.
It is definitely the same problem in a smaller datacenter. Instead of it taking two football fields to house 10,000 CPUs, maybe it all fits in a couple of racks of massively parallel multi-core machines. But the programming problems of dividing the system up into areas of responsibility based on functionality and scale are the same. And we don't need brand-new computing science to deal with it or programming languages that inherently support uber-sophisticated implicit parallel algorithms that look just like normal code. That's the same folly that network file systems and threads suffer from: they try to make "new" higher-level problems look like old lower-level ones.
We already have the tools that let processors communicate with each other: networks. A single piece of single-threaded network software that receives, processes, and sends events is a very generic concept that can be (and is, ultimately) used to model a traditional single processor system, threads, a network of nodes, or a parallel machine. We all know this to be true as professional programmers and computing scientists. Why do we continually fight it and try to use less general models when there's a simple one that just works, even for the machines of tomorrow? Then, when tomorrow's tomorrow has come and we're learned even more about these machines by experience, we can do as we always should do and encapsulate the fully-understood problem in an elegent, abstract solution, whatever that turns out to be.
Desirable System Qualities
This is a brief discussion of general desirable system qualities that we seek
when designing systems. These ideas took this solid form in September 2003 when I was aiming to document how I think about design as part of an effort to spread new ideas within the mothership. Reading over it today, it's interesting to note how if you're in something for the long haul and you want to keep people interested past the hype, you need quality. For online services (especially, but not exclusively), the following are definitely the kinds of qualities you need to aim for and achieve to earn long-term success.
Systems have different jobs to do and different requirements. However, there
is a set of qualities that all systems can seek to offer.
Scalability
Because it is often impossible to know the number of clients or
eventual load requirements of a system, we must design in a way the allows
system to grow to meet future needs.
Different pieces of any system scale differently. Some areas of
responsibility are more costly than others or are used more often and these
will need to "scale more" than other pieces. Key to dealing with this is a
system design allows for this growth by decoupling the system's components via
well-described interfaces across various hardware platforms and instances.
Said another way, well-defined division of responsibility is essential to
offering systemic scalability.
Performance
All systems have performance requirements, and many systems need to perform
in realtime or near realtime. Meeting performance requirements
often translates to a need for scalability: by distributing responsibility,
more computing power can be applied to a given task. Meeting realtime
performance requirements means distributing areas of responsibility in order
to control the time need to perform those responsibilities.
Performance can be thought of as being achieved via algorithmic and systemic
means. Components need to choose good algorithms to perform work within their
areas of responsibility. Systems need to connect the right components using the
right interfaces to offer the right information at the right time.
High Availability
All systems have components that fail; designing systems that do not fail as
a consequence of a few component failures is key to offering high
availability. This provides another reason for clear divisions and
distributions of responsibility, as any responsibility that is not
distributed, and thus resides in one component, leaves the system susceptible
to failure of that component.
When parts of a system fail, replacements must be pressed into use manually
or automatically. Manual replacement is acceptable in some circumstances,
but it must be supported by good detection and reporting of problems within
contexts that quickly and easily identify failed components. Automatic
replacement is something that needs to be part of the design: components must
expect other components to fail, and they must deal with that failure
realistically.
Some systems require near zero information loss. Other systems can lose some
information provided they "cut over" to backup components. Real systems
require both to some degree within various areas of responsibility. Knowing
when loss of availability is acceptable is just as important as knowing when
it is not. Knowing which information is "system-critical" and which
information can be lost is also important. Systems need to preserve the
information that matters and they need to recover from failure to continue to
provide service as much as is necessary.
Levels of Abstraction
To actually carry out its duties, the system must know the intricate details
of the systems with which it interfaces and the tasks it performs in its
lowest-level components. To be useful and successful, it must know how to
connect components to meet the system's overall goals and carry out
high-level tasks (and implement high-level interfaces).
Through well-defined interfaces, components must offer to other components
functionality that falls into their area of responsibility. By taking
responsibility for an area, the component agrees to do "real work" in that
area, thereby simplifying it for the rest of the system via an interface. A
component must not expose unnecessary detail to other components; doing so
would limit the system's ability to have these qualities. A component must
communicate in terms of its area of responsibility as seen from the
outside. Well-defined interfaces designed to match the external view of
that area of responsibility support this requirement.
Aggregates of components may form subsystems (as seen from the bottom) and
super-components (as seen from the top). The role of any particular entity
within the system is matter of perspective. That is, all entities are both
components and systems at the same time.
This highlights that there is no real difference between the algorithmic and
systemic approaches to meeting performance needs. Systemic approaches in a
subsystem are merely algorithmic approaches in a super-component, and vice
versa.
Simplicity
Systems must battle complexity. No single component of the system should be
so big that it can not be understood by its builders or diagnosed by its
operators. Said another way, each component should have a clear area of
responsibility made concrete by the well-defined interfaces it implements.
Maintaining clear areas of responsibility through well-defined
interfaces and levels of abstraction not only supports the other system
qualities described here, it also limits the complexity of the system, and in
turn, its design and implementation. One can drill down from systems to
components, then view those components as systems and drill down again.
If the system offers the other
qualities described here, simplicity may not be automatic, but it will be
within reach.
|
|