Good evening world!
Recently I came across a small task that reinforced my belief in the importance of C programming. The task was the following:
- Generate 500,000,000 pseudo-random numbers using a Linear Congruential Generator algorithm [1, 2].
- Use the Box-Muller transform [3] to get normally distributed random variables (RV).
- Compute the next step in the evolution of a deterministic system, and add noise using the generated normally distributed RV.
- Write the result of every 1000th step to a file.
In other words, we need to run a for-loop for 500,000,000 steps, doing some calculations (generating RV + evaluating a deterministic function), and writing to a file once every 1000 steps.
This doesn’t sound particularly challenging, and the whole thing can be done in fewer than 80 lines of C code. The same task can also be done in about 45 lines of Python. However, LOC is not the metric I want to look at here. I want to talk about the performance of the code written, and some general educational caveats.
Is the playing field level?
Let’s talk a tiny bit about optimization, under-optimization and over-optimization here.
The moment I shared this story with a friend, he said that the performance comparison doesn’t make sense. His argument was the following: if the Python code takes that much longer to run, it clearly was not written well. I agree that, when I was coming up with the comparison, I was not using any fancy race-track wheels for Python. The entire script I wrote is as vanilla as Python can possibly be. Does this mean that I am cheating by employing “bad” or “inefficient” Python coding practices? I would say no, or at worst, just a little bit.
In both cases, C and Python, I wrote a vanilla implementation of the given task. Hence, no parallelism, no non-standard libraries, no pre-compiled or hardware-optimized shenanigans. Did I manage to cheat a bit? Yes, of course I did: I compiled my C code with the -O3 optimization flag. This of course is not the full story either. I ran my Python script by naively invoking python ./generate.py rather than trying to compile it into an optimized binary and then running that. However, for all of these “sins” I have quite a simple answer: I don’t do that with Python 99.9% of the time. I do not compile my Python scripts. I do not roam the web for pre-compiled computational libraries for Python. I do not tend to care that much about performance in the first place when I code in Python.
How is this a conversation about optimization then? Well, I think we need to consider several parameters to be optimized, and then check out what we get in terms of relative performance. Hence, I will be thinking about 3 metrics here:
- Human minutes spent writing code (including debugging).
- Human minutes spent waiting for the program to finish running.
- sys runtime of the programs written.
In the context of these 3 parameters I can clearly define what I mean by optimizing, over-optimizing and under-optimizing performance of a task.
Over-optimizing: This is the case when I spend a lot of time writing code that is supposedly great in terms of wall and sys times. Not surprisingly, the majority of my over-optimization does not come from assembly injections leveraging the latest microarchitecture features. Nine times out of ten, when I over-optimize, it is because I found a paper proposing a fast algorithm and am trying to write it from scratch. Clearly this brings a caveat: asymptotically better performance does not always translate into better cycle-count performance on small enough examples.
Optimizing: Once in a blue moon, when working on a one-off personal project, I do hit the right spot. Just enough complexity in the implementation to get a good average for the runtimes. Any properly optimized code should be optimized both in terms of human minutes spent writing it and human minutes spent waiting for the results. However, as with anything in the world of computer programming, or life at large, there is a caveat: optimization is context dependent. Spending more development hours on code that has to be reused on a regular basis is worth it, as long as the eventual benefit in runtime pays for it.
Under-optimizing: This is what happens when the deadline is far away. Hacking together 25 lines of your favorite vanilla high-level language, and letting it run overnight, because you still have a week of time left, and one run only takes 14 hours. Surprisingly, I think that from a practical perspective this is more justified than over-optimizing. If I had to choose between code that takes 14 hours to run, but gets the job done, and code that takes 12 hours to develop and only 2 to run, I might go for the first one, because at least I can sleep or read for those 12 hours of difference. However, the caveat here is simple: if you need to run the code more than once, the unoptimized runtime will cost you a lot.
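To make that last caveat concrete with the numbers above, here is a back-of-the-envelope comparison. It assumes (my simplification) that development time is paid once and that every one of n runs costs the same; the symbol n and the names T_under and T_over are introduced here purely for illustration.

```latex
\begin{align*}
T_{\text{under}}(n) &= 14n, &
T_{\text{over}}(n) &= 12 + 2n \quad \text{(hours)},\\
T_{\text{under}}(1) &= 14 = T_{\text{over}}(1), &
T_{\text{under}}(2) &= 28 > 16 = T_{\text{over}}(2).
\end{align*}
```

With a single run the two approaches tie on total hours, and the only difference is how many of those hours need a human at the keyboard. From the second run onward the unoptimized version falls behind by 12 hours per run.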
I was compiling and running all code on my personal laptop. The specs are listed below.
MacBook Pro (Retina, 13-inch, Mid 2014)
- 2.6 GHz Intel Core i5 (4278U)
- 8 GB 1600 MHz DDR3 RAM
Runtimes measured with
As you can see across all of the measured times, the performance differs drastically. This is by no means a shocking or unexpected result, but it matters for the rest of the discussion.
This post ultimately is about teaching and learning, so let’s finally talk about why any comprehensive course[F1] on computer programming must cover some basics of the C language.
First, C is a compiled language. While the intricacies of compilation as a process lie beyond the introductory level, acknowledging compilation as a step in the lifecycle of a program is critical. Virtually anything that has to do with computer programming in its broadest definition can benefit from a better understanding of the full picture. As an example, I can bring up a recent workplace story in which we discovered that certain business logic scripts were ultimately compiled into SQL statements. When the underlying tables changed, the SQL statements became invalid, while the surface-level logic remained perfectly sound. It took a bit of tinkering to find out that we had to trigger a re-compile for the SQL to become valid again. Hence, if you have a better knowledge of the full picture, your bug-fixing abilities are also better.
Second, C has great performance metrics. As the first part of this story shows, C does in fact yield quite great performance in its vanilla form. Of course, you have to be mindful of your project scope. In terms of over-optimization failures, C is probably at the top of the list, in close competition with C++. Just think of all the linked list implementations ever written in C. Now think of all the doubly linked list implementations, and FFT implementations, and Dijkstra’s algorithm implementations, and so on ad nauseam. Writing code in C oftentimes feels like re-inventing the wheel, in fair part because it is. However, when the task at hand boils down to arguably simple arithmetic operations that need to be performed at medium scale, writing it up in C is probably the best bet.
Third, C is ubiquitous (unless you are on Windows). If you have a *nix system, it comes with either clang or some other form of C compiler. No need to download a myriad of heavy IDEs and weird things. To be fair, the same can be said about vanilla Python, which in part is why I love it so much (but still use Anaconda).
Fourth, C builds discipline and rigor. I am not talking about painstaking debugging of memory leaks and segfaults. I am not talking about arcane magic of pointer arithmetic. Those things are clearly important, but I am talking about very very basic C. You need to declare variable types. You need to allocate stuff ahead of time. You need to keep track of what moving parts you’ve got in the game. These things amount to cleaner code and better style. You have to at least minimally organize your thoughts before writing in C. Hopefully, that generalizes to the same concept for all of the code you will write.
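As a small illustration of the "very basic C" I have in mind, consider the sketch below. The function names make_trajectory and mean are hypothetical and not taken from any real codebase; the point is only that every variable carries a declared type, the buffer size is decided before the loop runs, and ownership of the memory is spelled out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocate room for n_samples values up front and fill it with x0.
   Returns NULL if the allocation fails; the caller must free() the result. */
double *make_trajectory(size_t n_samples, double x0)
{
    double *traj = malloc(n_samples * sizeof *traj);
    if (traj == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n_samples; i++) {
        traj[i] = x0;
    }
    return traj;
}

/* Arithmetic mean of n values; returns 0.0 for an empty input. */
double mean(const double *values, size_t n)
{
    double sum = 0.0;

    if (n == 0) {
        return 0.0;
    }
    for (size_t i = 0; i < n; i++) {
        sum += values[i];
    }
    return sum / (double)n;
}
```

Nothing here is clever, and that is the point: the types, the sizes, and the question of who frees the buffer all have to be written down before the code compiles or runs correctly.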
Finally, C is just another programming language. I firmly believe that breadth of programming knowledge is as important as depth, if not more so, for about 80% of people who will ever write code. In the job market, it is hard to guarantee that your amazing knowledge of C++ will always be in equal demand. In academia, you might realize that the lab you just joined does things in Perl. You can probably still write half of your code in Java, but then you need to interface with the rest of the Perl codebase… You get the general idea. On the other hand, the “jack of all trades, but master of none” kind of programmer will be more likely to pick up a new language from the docs when it is needed. In this regard, C serves as a good training ground for general programming principles.
Hence, in the end we have a performant language that exposes you to some fundamental programming concepts and builds up a better coding discipline.
[1] S. K. Park, K. W. Miller. Random number generators: good ones are hard to find. Comm. ACM (31):10, pp. 1192-1201, 1988. [DOI: 10.1145/63039.63042]
[2] P. A. W. Lewis, A. S. Goodman, J. M. Miller. A pseudo-random number generator for the System/360. IBM Systems Journal (8):2, pp. 136-146, 1969. [DOI: 10.1147/sj.82.0136]
[3] G. E. P. Box, M. E. Muller. A Note on the Generation of Random Normal Deviates. Ann. Math. Statist. (29):2, pp. 610-611, 1958. [DOI: 10.1214/aoms/1177706645]
[F1] By comprehensive course I mean an introductory sequence on computer programming that lasts anywhere from an academic year to a calendar year. Examples would include any “Intro to Computer Science… n = 1, 2, 3, …” sequences, and any bootcamps that aim to teach you computer programming. I do agree that there are shorter courses that clearly cannot cover C. However, I would also argue that such courses are by no means comprehensive.