Jun 20, 2011

The list of the Top 500 fastest computers in the world just came out, and the Japanese K-computer is the fastest and the most energy-efficient computer at the same time. It is hard to build computers that are both fast and energy-efficient, so I set out to understand what Fujitsu has done right. This quick post is a summary of my investigation. For the very impatient: my crude, experience-based analysis says that the special-purpose instructions and highly specialized functional units in the core give them their edge.

Continue reading “Why the K-computer is the fastest and energy-efficient?” »

Jun 19, 2011

I wrote a list of ten items all software programmers must know about hardware. Today, I want to provide a small quiz for you to evaluate yourself. Some questions are very simple but they exist to test the fundamentals. Enjoy!

The answers to the self assessment are available here.

Continue reading “Computer Science Self-assessment Quiz” »

Jun 18, 2011

In shared memory systems, multiple threads are not allowed to update shared data concurrently; this rule is known as the mutual exclusion principle. Instead, accesses to shared data are encapsulated in regions of code guarded by synchronization primitives (e.g., locks). Such guarded regions of code are called critical sections. The semantics of a critical section dictate that only one thread can execute it at a given time. Any other thread that requires access to the shared data must wait for the current thread to complete the critical section.
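
To make the idea concrete, here is a minimal sketch of a critical section in Python. The names `Counter`, `increment`, and `run` are hypothetical and chosen only for illustration; the lock comes from the standard `threading` module. Each thread must acquire the lock before updating the shared counter, so any other thread that wants to update it waits until the lock is released.

```python
import threading


class Counter:
    """Shared data whose updates are guarded by a lock (illustrative only)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Critical section: at most one thread executes this body at a time.
        # A thread arriving while the lock is held blocks until it is released.
        with self._lock:
            self.value += 1


def run(num_threads=4, increments=10_000):
    """Have several threads update one shared counter and return the total."""
    counter = Counter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Calling `run(4, 10_000)` should return `40000`, because every increment happens inside the critical section. The time threads spend waiting on the lock is serialized work, which is why critical sections affect parallel performance.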

Continue reading “Parallel Programming: Understanding the impact of Critical Sections” »

Jun 17, 2011

Heterogeneous computing has taken center stage after recent announcements from Intel, AMD, and ARM. In heterogeneous computing, the hardware provides multiple types of processing cores that the software leverages to improve performance and/or save power. There seems to be a lot of confusion about the topic because the term "heterogeneous computing" is overloaded. I will try to provide an overview of the different heterogeneous computing efforts today.

Continue reading “Heterogeneous Computing: Past, Present, and Future” »

Jun 14, 2011

Traditionally, RAM, or Random Access Memory, described a memory that offered the same access latency for all of its memory locations. This is hardly the case with modern DRAM systems. In this post, I give a ten-thousand-foot view of how modern DRAMs work, in the hope that it helps programmers choose their algorithms and data structures wisely.

Continue reading “What every Programmer should know about the memory system” »

Jun 14, 2011

Parallel programming consists of four distinct phases: finding the parallelism, writing the code, debugging the code, and optimizing the code. In my opinion, frameworks like Apple’s Grand Central Dispatch and Intel’s TBB only help with writing the code; they do not help with finding the parallelism, debugging, or optimization (e.g., false sharing, thread waiting, etc.). I think that the difficulty of finding the parallelism, which can be an insurmountable barrier for many inexperienced parallel programmers, is often underestimated. In this post, I try to explain this challenge using a couple of parallel programming puzzles.

Continue reading “Parallel Programming: Why new frameworks only solve a part of the problem?” »

Jun 09, 2011

In my last post on “Understanding the Cortex A8 architecture,” I promised that I would make a web-based front-end to my Cortex A8 test harness so that readers can run experiments themselves. The tool is ready and kicking. This post shows how a developer can use it to learn the A8 architecture. As far as I know, this is the first setup of its kind.

Continue reading “Tips for iPhone Developers: The web-based sandbox for understanding Cortex A8 is ready (Part 3)” »

Jun 06, 2011

There are two types of people in this world: those who use cloud computing knowingly and those who use it unknowingly. In fact, the latter often use the cloud more. Apple’s shift to the cloud today is an important landmark. While most people are discussing it in light of how it will impact consumers’ lives, I can’t stop thinking about how it will change computer science and the work lives of the “geeks.” Some old skills will become irrelevant and some brand new fields will emerge. In this post, I present my take on how computer science and engineering job descriptions will be impacted by this change.

Continue reading “iOS 5 goes to the cloud. How does it impact us computer scientists?” »

Jun 05, 2011

In my first post on “Tips for iPhone App Developers,” I showed how to set up a development board using an iPhone 3GS. In this post, I will use that setup and a micro-benchmark to explore the ARM Cortex A8 architecture.

Update 6/9/2011: The web-based sandbox for understanding Cortex A8 is ready.

Continue reading “Tips for iPhone App Developers: Understanding Cortex A8 core (Part 2)” »