Heterogeneous computing has taken center stage after recent announcements from Intel, AMD, and ARM. In heterogeneous computing, the hardware provides multiple types of processing cores that the software leverages to improve performance and/or save power. There seems to be a lot of confusion about the topic because the term heterogeneous computing is overloaded. I will try to provide an overview of the different heterogeneous computing efforts today.
Academics and companies are investigating two entirely different types of asymmetry:
Functional Asymmetry

The idea is to integrate cores with different functional capabilities, e.g., integrating special-purpose accelerators for video and graphics. Recent examples of such chips include AMD Fusion, Intel's Sandy Bridge, ARM SoCs, and IBM's Cell.
Functional asymmetry saves power because a core tailored for a particular type of code can run it faster and more power-efficiently. However, there are three problems with this approach. First, software is required to choose the core to run each code segment, which increases the programmer's work. Second, code scheduling is inflexible because a code segment can only run on specific cores. Third, the different types of cores are loosely integrated, which makes communication between them expensive, thereby prohibiting fine-grain work splitting.
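To make the first two problems concrete, here is a toy model of functional asymmetry. All the core and segment names are invented for illustration; this is not a real runtime API.

```python
# Toy model of functional asymmetry: each core type can execute only
# certain kinds of code segments, so the software must pick the core.
# Core names and capability sets below are illustrative only.

CORE_CAPABILITIES = {
    "cpu":   {"scalar", "branchy"},
    "gpu":   {"data_parallel"},
    "video": {"decode"},
}

def schedule(segment_kind):
    """Return the cores that can run this segment. The programmer (or a
    runtime) must choose among them -- extra work for the programmer,
    and no scheduling flexibility when the only capable core is busy."""
    capable = [core for core, caps in CORE_CAPABILITIES.items()
               if segment_kind in caps]
    if not capable:
        raise ValueError(f"no core can run {segment_kind!r}")
    return capable

print(schedule("decode"))   # only the video accelerator qualifies
print(schedule("scalar"))   # only the general-purpose CPU qualifies
```

Note how a `decode` segment has exactly one legal home: if the video accelerator is busy, the segment waits, which is the scheduling inflexibility described above.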
Performance Asymmetry

An alternative is to integrate cores with similar functional capabilities but different power/performance characteristics. Such asymmetry can be created by integrating fast and slow cores on the same chip (e.g., integrating Intel Xeon and Intel Atom cores) or by running cores at different frequencies.
This type of asymmetry is a middle ground between a symmetric chip and a chip with functional asymmetry. Its power requirements are in between the two options, and its performance can approach that of a chip built entirely from large cores if the system can run the critical bottlenecks on the fast cores. Unfortunately, successful identification of bottlenecks is a daunting task. Programmers often cannot predict the bottlenecks because they depend on run-time parameters, and hardware lacks the big-picture view the programmers have.
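A small back-of-the-envelope model shows why getting the bottleneck onto the fast core matters. The 2x speedup and the work units below are assumptions chosen for illustration, not measurements of any real chip.

```python
# Toy model of performance asymmetry: one fast core (assumed 2x faster)
# and one slow core share a workload. The largest task is the serial
# "bottleneck"; everything else runs on the other core.

FAST_SPEEDUP = 2.0  # assumption: fast core runs any code 2x faster

def makespan(tasks, bottleneck_on_fast):
    """Total time on a 1-fast + 1-slow chip when the largest task is
    placed on the fast (or slow) core and the rest on the other core."""
    tasks = sorted(tasks, reverse=True)
    bottleneck, rest = tasks[0], tasks[1:]
    if bottleneck_on_fast:
        return max(bottleneck / FAST_SPEEDUP, sum(rest))
    return max(bottleneck, sum(rest) / FAST_SPEEDUP)

work = [8.0, 2.0, 2.0, 2.0]          # one dominant bottleneck task
good = makespan(work, True)          # bottleneck on fast core: max(4, 6) = 6
bad = makespan(work, False)          # bottleneck on slow core: max(8, 3) = 8
print(good, bad)
```

Misplacing a single bottleneck task costs 33% here, and in a real system the scheduler must make that call without knowing task lengths in advance, which is exactly the identification problem described above.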
The Future of Heterogeneous Chips
It is clear that there are fundamental challenges in both approaches. There are issues that hardware can fix in isolation, e.g., tighter integration, and there are issues software can fix in isolation, e.g., choosing the right core, but a real holistic solution can only come from hardware/software symbiosis. My take is that we need to provide pseudo functional symmetry, where programmers are given the illusion that the cores are functionally similar, and the hardware/OS is designed to seamlessly migrate work to the cores that are capable of executing certain instructions better. Any suggestions?
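To make the proposal a little more concrete, here is a toy sketch of pseudo functional symmetry. Everything here is invented for illustration: the program sees one uniform `execute` interface, and a thin layer standing in for the hardware/OS routes each operation to the most capable core underneath.

```python
# Toy sketch of "pseudo functional symmetry": the programmer calls a
# single uniform operation; a runtime layer (standing in for the
# hardware/OS) transparently migrates work to the best-suited core.
# Core names and operation sets are illustrative assumptions.

class Core:
    def __init__(self, name, native_ops):
        self.name = name
        self.native_ops = set(native_ops)

    def run(self, op):
        return f"{op} on {self.name}"

# Small core handles common code; big core also has special units.
CORES = [Core("small", {"scalar"}),
         Core("big", {"scalar", "simd", "crypto"})]

def execute(op):
    """Uniform interface: the caller never names a core. Migration to
    the first core that natively supports the op happens underneath."""
    core = next(c for c in CORES if op in c.native_ops)
    return core.run(op)

print(execute("scalar"))   # stays on the small core
print(execute("crypto"))   # silently migrated to the big core
```

In this sketch the programmer's code is identical for every operation, which is the "illusion of functional symmetry"; the hard open problem is making the underlying migration cheap enough to be worth it.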