I guess I will continue my rant that many programmers do not understand hardware. I stress that they should. Dear programmers, before you feel offended, please read my previous posts bashing hardware guys for not learning software:-)
I have made my list of top ten items every programmer should know. I do not explain the concepts but provide keywords you can google or find in your favorite book. I will write tutorials here in the future if you guys want.
Update 6/19/2011: I have written a computer science self-assessment that developers can use to test their knowledge.
Update 5/31/2011: In response to a couple of comments on this post, I have written a more detailed motivation for this article. It tries to answer why programmers today have a higher motivation to understand hardware.
1. Data types
This is a general concept, but it's essential to understand the different data types.
Keywords: 2's complement, unsigned vs. signed numbers, floating point
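To make the data type keywords concrete, here is a minimal sketch in Python. The helper names (`to_twos_complement`, `from_twos_complement`, `float_bits`) are made up for illustration, and the 8-bit default width is just an assumption to keep the numbers small.

```python
import struct


def to_twos_complement(value: int, bits: int = 8) -> int:
    """Return the unsigned bit pattern that encodes `value` in 2's complement."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    # Masking to `bits` bits wraps negative numbers around to 2**bits + value.
    return value & ((1 << bits) - 1)


def from_twos_complement(pattern: int, bits: int = 8) -> int:
    """Interpret an unsigned bit pattern as a signed 2's complement number."""
    if pattern & (1 << (bits - 1)):  # sign bit set, so the value is negative
        return pattern - (1 << bits)
    return pattern


def float_bits(x: float) -> str:
    """Return the 32 bits of `x` encoded as an IEEE 754 single-precision float."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{raw:032b}"


if __name__ == "__main__":
    # The same 8 bits mean 255 when read as unsigned and -1 when read as signed.
    print(hex(to_twos_complement(-1)), from_twos_complement(0xFF))
    # Sign bit, 8 exponent bits, then 23 fraction bits.
    print(float_bits(1.0))
    # Floating point is not exact decimal arithmetic.
    print(0.1 + 0.2 == 0.3)
```

The point is that the bits themselves carry no type; signed, unsigned, and floating point are just different ways of reading the same pattern.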
2. Boolean algebra and converting numbers from decimal to hex/binary and back.
Keywords: binary, decimal, hexadecimal
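If you want to check your hand conversions, the following is a small sketch of the usual repeated-division method. `to_base` and `from_base` are hypothetical helper names; Python's built-in `bin`, `hex`, and `int(text, base)` do the same job, but writing it out shows the mechanics.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n: int, base: int) -> str:
    """Convert a non-negative integer to a digit string in base 2..16."""
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    if n < 0:
        raise ValueError("only non-negative numbers are handled here")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)  # peel off the least significant digit
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def from_base(text: str, base: int) -> int:
    """Convert a digit string in base 2..16 back to an integer."""
    value = 0
    for ch in text.upper():
        digit = DIGITS.index(ch)  # raises ValueError for unknown characters
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit  # shift left by one digit, then add
    return value


if __name__ == "__main__":
    print(to_base(255, 2), to_base(255, 16))
    print(from_base("1010", 2), from_base("FF", 16))
```

One hex digit is exactly four binary digits, which is why hex is the standard shorthand for bit patterns.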
3. Caches: You cannot write good efficient code unless you understand caches. You must understand:
What do caches do?
How are caches organized? (Hint: look up tag store, data store, cache block, index, set, valid bit, dirty bit, associativity, and replacement policy.)
Why do they work? (Spatial and temporal locality, REUSE. I emphasize reuse because very few people really get it.)
Understand multi-level cache hierarchies (look up L1 cache, L2 cache, etc.)
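To tie the cache keywords together, here is a toy model of a direct-mapped cache. It is only a sketch: the class name `DirectMappedCache` and its sizes (64 sets of 64-byte blocks, so 4 KB total) are assumptions for illustration, and real caches add associativity, dirty bits, and a replacement policy.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks only tags, hits, and misses."""

    def __init__(self, num_sets: int = 64, block_size: int = 64):
        self.num_sets = num_sets
        self.block_size = block_size
        self.tags = [None] * num_sets  # tag store; None means "valid bit clear"
        self.hits = 0
        self.misses = 0

    def split(self, address: int):
        """Split an address into (tag, index, offset)."""
        block_number, offset = divmod(address, self.block_size)
        tag, index = divmod(block_number, self.num_sets)
        return tag, index, offset

    def access(self, address: int) -> bool:
        """Look up one byte address; return True on a hit."""
        tag, index, _ = self.split(address)
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag  # a miss fills the block, evicting the old one
        self.misses += 1
        return False


def run(cache: DirectMappedCache, num_bytes: int, passes: int) -> None:
    """Walk `num_bytes` consecutive addresses `passes` times."""
    for _ in range(passes):
        for address in range(num_bytes):
            cache.access(address)


if __name__ == "__main__":
    small = DirectMappedCache()
    run(small, 4096, passes=2)  # working set fits in the 4 KB cache
    print("fits:", small.hits, "hits,", small.misses, "misses")

    large = DirectMappedCache()
    run(large, 8192, passes=2)  # working set is twice the cache size
    print("too big:", large.hits, "hits,", large.misses, "misses")
```

In the first run the only misses are the initial fills, and the second pass reuses every block. In the second run each block is evicted before it is touched again, so the second pass misses just as often as the first. Spatial locality still helps within a block, but the reuse across passes is lost.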
4. Cache coherence (especially if you are in parallel programming).
Keywords: MSI, MESI, false sharing
5. Virtual memory: You do not need to know it cold to write good code, but it's always a plus.
Keywords: Pages, frames, demand paging, page walk, page fault, TLB, permissions
Less important keywords: page table, valid bit, dirty bit, reference bit
6. Pipelining, especially in the context of the branch misprediction penalty:
You can read my article on branch prediction for some basic information.
Keywords: gshare, 2-bit counter, 2-level predictor
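As a starting point for the 2-bit counter keyword, here is a minimal sketch of a single 2-bit saturating counter predicting one branch. The function name `count_mispredictions` and the initial state are assumptions for illustration; gshare and 2-level predictors build on tables of such counters and are not modeled here.

```python
def count_mispredictions(outcomes, initial_state: int = 2) -> int:
    """Simulate one 2-bit saturating counter over a sequence of branch outcomes.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    `outcomes` is an iterable of booleans (True means the branch was taken).
    """
    state = initial_state
    mispredictions = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            mispredictions += 1
        # Move toward 3 on taken and toward 0 on not taken, saturating at both ends.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return mispredictions


if __name__ == "__main__":
    # A loop branch: taken 9 times, then falls through, repeated 10 times.
    loop_branch = ([True] * 9 + [False]) * 10
    print("loop branch:", count_mispredictions(loop_branch), "of", len(loop_branch))

    # A branch that alternates taken / not taken.
    alternating = [True, False] * 50
    print("alternating:", count_mispredictions(alternating), "of", len(alternating))
```

The loop branch mispredicts only on each loop exit, because a single not-taken outcome does not flip the prediction. The alternating branch is far worse for this simple scheme, which is one reason history-based predictors exist.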
7. Memory layout of data structures like arrays, linked lists, trees, and hash tables.
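For arrays, the layout question mostly comes down to address arithmetic. The sketch below assumes a row-major 2D array of 4-byte elements that starts on a 64-byte cache block boundary; the helper names `row_major_offset` and `blocks_touched` are hypothetical.

```python
def row_major_offset(row: int, col: int, num_cols: int, elem_size: int = 4) -> int:
    """Byte offset of element [row][col] from the start of a row-major array."""
    return (row * num_cols + col) * elem_size


def blocks_touched(offsets, block_size: int = 64) -> int:
    """Number of distinct cache blocks covered by a list of byte offsets."""
    return len({offset // block_size for offset in offsets})


if __name__ == "__main__":
    n = 16  # a 16 x 16 matrix of 4-byte elements: each row is exactly 64 bytes

    one_row = [row_major_offset(0, col, n) for col in range(n)]
    one_col = [row_major_offset(row, 0, n) for row in range(n)]

    print("walking a row touches", blocks_touched(one_row), "block(s)")
    print("walking a column touches", blocks_touched(one_col), "block(s)")
```

Both walks read 16 elements, but the column walk lands in a different block on every access. Linked lists and trees have the same problem in a less predictable form, since each node can live anywhere in memory.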
8. Some assembly programming. This will help you understand why some code runs faster than other code.
9. Basic compiler optimizations
Keywords: pointer aliasing, dead-code elimination, register allocation, SSA
10. Memory bandwidth constraints:
Keywords: memory bus, memory banks, page mode
While there is an infinite amount you can learn, this is my top ten list. I don't think any of these concepts takes more than a few hours to learn, but each can significantly improve your understanding of your code's performance and help you clear job interviews. Does anyone else have a different list?
@FutureChips Nice Summary – For similar reasons, we actually make Embedded Systems a core subject for all CS students.
Great to hear that. I liked my embedded systems class a lot (9 years ago). I learned several items, including OS and file systems.
I'm not sure about the experiences of the readers here, but I could never understand why my professors didn't teach the basics of parallel programming when I was in uni. A shame, really.
Instead of (or in addition to) writing an article on hardware issues software developers should know about, could you, perhaps, create or update (as required), and link to appropriate Wikipedia articles?
That way the information would be more widely available.
Of course, Wikipedia tends more towards “what” as opposed to “why”, so an article that covered the “why” would be most useful.
Thanks for reading.
Interesting thought. I will do both. Add a Wikipedia article and also add reasons here.
Hey HMW:
As promised, I have just written a post on "why" programmers should learn these hardware concepts. Please take a look and leave feedback. Thanks!
Why programmers should understand hardware?
Could you write a post on the memory layout of data structures? Any pointers to it would really be helpful.
Thanks,
Just to reinforce your point about cache … I had a compute-bound problem that I was solving using the Stata statistics program running on an Intel Netburst Pentium-4 CPU with a single core, hyperthreading and 1 MB of L2 cache (on-die). Testing my program on a small sample of my data, I estimated that the program would require 48 hours to complete the entire data set. It occurred to me that it might run faster if the data fit into the cache, so I tested a variety of “chunk” sizes to see if it affected the speed. The optimum “chunk” size turned out to be about 800 KB, which fit in the cache with room for OS and program code. The optimum-sized-chunk version ran 26 times faster than my original version. This was the difference between the data ALWAYS being in the cache and the data NEVER being in the cache (due to the way the code was organized). I am now convinced that cache tuning can pay off better than almost any other optimization that you might consider short of algorithm replacement.
So yes, knowing a little bit about hardware can make a big difference to a programmer.