May 19, 2011
 

I guess I will continue my rant that many programmers do not understand hardware, and I stress that they should. Dear programmers, before you feel offended, please read my previous posts bashing hardware guys for not learning software :-)

I have made my list of top ten items every programmer should know. I do not explain the concepts but provide keywords you can google or find in your favorite book. I will write tutorials here in the future if you guys want.

Update 6/19/2011: I have written a computer science self-assessment for developers to test their knowledge about computer science.

Update 5/31/2011: In response to a couple of comments on this post, I have written a more detailed motivation for this article. It tries to answer why programmers today have a higher motivation to understand hardware.

1. Data types

This is a general concept, but it's essential to understand the different data types. Keywords: 2's complement, unsigned vs. signed numbers, floating point.
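To make the keywords above a bit more concrete, here is a minimal sketch in Python. The helper names `to_signed` and `to_unsigned` are my own for illustration, and the 8-bit default width is just an assumption. It shows the same bit pattern read as signed or unsigned, plus two floating-point surprises.

```python
import struct

def to_unsigned(value, bits=8):
    """Keep only the low `bits` bits, i.e. read the pattern as unsigned."""
    return value & ((1 << bits) - 1)

def to_signed(value, bits=8):
    """Read the low `bits` bits as a 2's complement signed number."""
    value = to_unsigned(value, bits)
    if value & (1 << (bits - 1)):      # sign bit set: the value is negative
        return value - (1 << bits)
    return value

raw = 0xFF                             # one byte, all bits set
print(to_unsigned(raw))                # 255
print(to_signed(raw))                  # -1
print(to_unsigned(-128))               # 128 (bit pattern 0x80)

# Floating point: 0.1 and 0.2 have no exact binary representation.
print(0.1 + 0.2 == 0.3)                # False

# IEEE 754 single-precision encoding of 1.0, big-endian, as hex.
print(struct.pack(">f", 1.0).hex())    # 3f800000
```

The point is that the hardware stores only bits; signed versus unsigned is an interpretation your code chooses.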

2. Boolean algebra and converting numbers from decimal to hex/binary and back.

Keywords: binary, decimal, hexadecimal
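If you want to try the conversions by hand first and then check yourself, a small sketch like the following may help. The function `to_base` is a hypothetical helper that uses repeated division; Python's built-in `int(text, base)` handles the reverse direction.

```python
def to_base(n, base):
    """Convert a non-negative integer to a digit string by repeated division."""
    digits = "0123456789abcdef"
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, base)
        out.append(digits[remainder])  # least significant digit comes out first
    return "".join(reversed(out))

value = 202
print(to_base(value, 2))     # 11001010
print(to_base(value, 16))    # ca
print(int("11001010", 2))    # 202
print(int("ca", 16))         # 202

# Boolean algebra on bits: AND masks, OR sets, XOR flips.
a, b, mask = 0xCA, 0x0F, 0xFF
print(hex(a & b))            # 0xa
print(hex(a | b))            # 0xcf
print(hex(a ^ mask))         # 0x35

# De Morgan's law, restricted to 8 bits.
print((~(a & b) & mask) == ((~a | ~b) & mask))   # True
```

Each hex digit corresponds to exactly four binary digits, which is why 0xCA and 11001010 line up so neatly.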

3. Caches: You cannot write good efficient code unless you understand caches. You must understand:

What do caches do?

How are caches organized? (Hint: look up tag store, data store, cache block, index, set, valid bit, dirty bit, associativity, and replacement policy).

Why do they work? (spatial and temporal locality, REUSE — I emphasized it because very few people really get it)

Understand multi-level cache hierarchies (look up L1 cache, L2 cache, etc.)
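To see why locality and reuse matter, here is a toy simulation of a direct-mapped cache. It is a sketch under assumptions I picked for illustration (64 sets, 64-byte blocks, no associativity, no second level), not a model of any real processor, and `hit_rate` is a made-up name.

```python
def hit_rate(addresses, num_sets=64, block_size=64):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    tags = [None] * num_sets           # tag store; None acts like a cleared valid bit
    hits = 0
    for addr in addresses:
        block = addr // block_size     # drop the offset within the cache block
        index = block % num_sets       # which set the block maps to
        tag = block // num_sets        # remaining bits identify the block
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag          # miss: fetch the block, evicting the old one
    return hits / len(addresses)

# Spatial locality: walk 4-byte elements in order, 16 per 64-byte block.
sequential = [i * 4 for i in range(4096)]
# No locality: every access touches a new block.
strided = [i * 64 for i in range(4096)]
# Temporal locality (reuse): loop over a 1 KiB working set that fits in the cache.
reuse = [(i % 256) * 4 for i in range(4096)]

print(hit_rate(sequential))  # 0.9375
print(hit_rate(strided))     # 0.0
print(hit_rate(reuse))       # 0.99609375
```

The sequential walk misses once per block and hits on the other 15 elements. The reuse pattern misses only on its first pass, which is the effect I keep emphasizing.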

4. Cache coherence (especially if you are in parallel programming).

Keywords: MSI, MESI, false sharing
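False sharing is easier to picture with addresses. The sketch below assumes a 64-byte cache line and a made-up base address, and only checks whether per-thread counters land in the same line. It does not measure the slowdown, which depends on the machine.

```python
LINE_SIZE = 64  # bytes per cache line (an assumption; check your CPU)

def cache_line(addr):
    """Return the number of the cache line that holds this byte address."""
    return addr // LINE_SIZE

def share_a_line(base, stride, count):
    """True if any two of `count` counters placed `stride` bytes apart share a line."""
    lines = [cache_line(base + i * stride) for i in range(count)]
    return len(set(lines)) < count

base = 0x1000                          # hypothetical, line-aligned start address
print(share_a_line(base, 8, 4))        # True: four packed 8-byte counters, one line
print(share_a_line(base, 64, 4))       # False: each counter padded to its own line
```

When threads on different cores write to counters in the same line, the coherence protocol bounces that line between cores even though no data is logically shared. Padding trades memory for fewer of those transfers.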

5. Virtual memory: You do not need to understand it down cold to write good code, but it's always a plus.

Keywords: Pages, frames, demand paging, page walk, page fault, TLB, permissions

Less important keywords: page table, valid bit, dirty bit, reference bit
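As a rough picture of what address translation does, here is a sketch with a single-level page table held in a Python dict. The 4 KiB page size, the `translate` function, and the `PageFault` exception are all illustrative assumptions. Real hardware walks multi-level tables and caches the results in the TLB.

```python
PAGE_SIZE = 4096  # 4 KiB pages (assumed)

class PageFault(Exception):
    """Raised when a virtual page has no frame mapped to it."""

def translate(vaddr, page_table):
    """Map a virtual address to a physical address using a one-level page table."""
    page, offset = divmod(vaddr, PAGE_SIZE)   # split into page number and offset
    frame = page_table.get(page)
    if frame is None:
        raise PageFault(f"no mapping for virtual page {page}")
    return frame * PAGE_SIZE + offset         # the offset is unchanged by translation

page_table = {0: 5, 1: 2}                     # virtual page -> physical frame

print(hex(translate(0x0010, page_table)))     # 0x5010
print(hex(translate(0x1234, page_table)))     # 0x2234

try:
    translate(0x3000, page_table)             # page 3 is not mapped
except PageFault as fault:
    print("page fault:", fault)
```

With demand paging, the operating system would handle that fault by bringing the page in and retrying the access.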

6. Pipelining, especially in context of branch prediction penalty:

You can read my article on branch prediction for some basic information.

Keywords: gshare, 2-bit counter, 2-level predictor
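The 2-bit counter is simple enough to simulate in a few lines. This is a sketch of one saturating counter for a single branch, starting in the "weakly taken" state (my choice). Real predictors keep a table of such counters, and `mispredictions` is a hypothetical helper name.

```python
def mispredictions(outcomes, state=2):
    """Count mispredictions of a 2-bit saturating counter.

    States 0 and 1 predict not taken; states 2 and 3 predict taken.
    """
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        # Move one step toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses

# A loop branch: taken 9 times, then not taken on exit, repeated 10 times.
loop_branch = ([True] * 9 + [False]) * 10
# A branch that flips direction on every execution.
alternating = [True, False] * 50

print(mispredictions(loop_branch))   # 10 out of 100
print(mispredictions(alternating))   # 50 out of 100
```

The loop branch mispredicts only on each loop exit, while the alternating branch fools this counter half the time. Every misprediction costs a pipeline flush, so predictable branches are cheaper.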

7. Memory layout of data structures like arrays, linked lists, trees, and hash tables.
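For arrays, the layout question mostly comes down to one formula. The sketch below assumes a row-major 2D array (the convention C uses) with 4-byte elements. `element_offset` is an illustrative helper of mine.

```python
def element_offset(row, col, num_cols, elem_size):
    """Byte offset of element [row][col] in a row-major 2D array."""
    return (row * num_cols + col) * elem_size

NUM_COLS, ELEM_SIZE = 8, 4             # 8 columns of 4-byte integers (assumed)
origin = element_offset(0, 0, NUM_COLS, ELEM_SIZE)

# Stepping along a row moves one element at a time.
print(element_offset(0, 1, NUM_COLS, ELEM_SIZE) - origin)   # 4
# Stepping down a column jumps a whole row.
print(element_offset(1, 0, NUM_COLS, ELEM_SIZE) - origin)   # 32

# Offsets visited when walking the first two rows in row order are contiguous.
print([element_offset(r, c, NUM_COLS, ELEM_SIZE) for r in range(2) for c in range(3)])
# [0, 4, 8, 32, 36, 40]
```

Linked lists and trees have no such formula. Each node sits wherever the allocator put it, so following pointers gives the cache far less spatial locality than walking an array.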

8. Some assembly programming. This will help you understand why some code is better than other code.

9. Basic compiler optimizations

Keywords: pointer aliasing, dead-code elimination, register allocation, SSA

10. Memory bandwidth constraints:

Keywords: memory bus, memory banks, page mode

 

While there is an infinite amount you can learn, this is my top ten list. I don’t think any of these concepts takes more than a few hours to learn, but each can significantly improve your understanding of your code’s performance and help you clear job interviews. Does anyone else have a different list?

  11 Responses to “Ten things every programmer must know about hardware”

  1. @FutureChips Nice Summary – For similar reasons, we actually make Embedded Systems a core subject for all CS students.

  2. I’m not sure about the experiences of the readers here, but I could never understand why my professors didn’t teach the basics of parallel programming when I was in uni. A shame, really.

  3. Instead of (or in addition to) writing an article on hardware issues software developers should know about, could you, perhaps, create or update (as required), and link to appropriate Wikipedia articles?

    That way the information would be more widely available.

    Of course, Wikipedia tends more towards “what” as opposed to “why”, so an article that covered the “why” would be most useful.

  4. Could you write a post on the memory layout of data structures? Any pointers to it would really be helpful.

    Thanks,

  5. Just to reinforce your point about cache … I had a compute-bound problem that I was solving using the Stata statistics program running on an Intel Netburst Pentium-4 CPU with a single core, hyperthreading and 1 MB of L2 cache (on-die). Testing my program on a small sample of my data, I estimated that the program would require 48 hours to complete the entire data set. It occurred to me that it might run faster if the data fit into the cache, so I tested a variety of “chunk” sizes to see if it affected the speed. The optimum “chunk” size turned out to be about 800 KB, which fit in the cache with room for OS and program code. The optimum-sized-chunk version ran 26 times faster than my original version. This was the difference between the data ALWAYS being in the cache and the data NEVER being in the cache (due to the way the code was organized). I am now convinced that cache tuning can pay off better than almost any other optimization that you might consider short of algorithm replacement.

    So yes, knowing a little bit about hardware can make a big difference to a programmer.

