Apr 30 2013
 

Sorry for the delay in this post. I could not get to it in time and wanted to be sure it was well researched. The final post in this series compares the hardware support for virtualization in the ARM and x86 worlds. As mentioned in the previous post, the biggest reason for ARM to include virtualization in its architecture is to be viable in the server market against x86, so a comparison of the two is warranted.

Continue reading “ARM Virtualization – ARM vs x86 (Part 5)” »

Apr 08 2013
 

In the last few posts we discussed the hardware support needed to provide virtualization. In this post we look at how virtualization can empower the user. We’ll discuss the use cases we already see in the server and desktop space, as well as mobile-specific applications like big.LITTLE and lowering production costs for handsets.

Continue reading “ARM Virtualization – Applications (Part 4)” »

Apr 01 2013
 

In the second part of the series we introduced memory management and interrupt handling support provided by virtualization hardware extensions. But effective virtualization solutions need to reach beyond the core to communicate with peripheral devices. In this post we discuss the various techniques used for virtualizing I/O, the problems faced, and the hardware solutions to mitigate these problems.

Continue reading “ARM Virtualization – I/O Virtualization (Part 3)” »

Mar 24 2013
 

In the first part of this series, I introduced the topic of virtualization. Today I will venture deeper into the ARM virtualization extensions for memory management and handling of interrupts. Within the core, virtualization mostly provides controls over the system registers. But as we move further from the core, and start to communicate with the outside world, difficulties and nuances in the problem start to emerge and the need for hardware support for virtualization becomes apparent.

Continue reading “ARM Virtualization Extensions – Memory and Interrupts (Part 2)” »

Mar 18 2013
 

Sorry, guys, for another hiatus; my job at Calxeda keeps me busy. I was recently discussing ARM’s virtualization support with my friend Ali Hussain (yup, that’s our idea of a fun dinner conversation) and found some very interesting facts. I asked Ali to share his knowledge in a blog post series on this topic, so here you go. Ali is on ARM’s performance modeling team and has been working on ARM cores since 2008.

The idea for this blog post stemmed from talking to people who had the impression that ARM’s virtualization support, even with the virtualization extensions in Cortex-A15, is limited. I plan to write a few posts exploring virtualization and the support for it in the ARM and x86 ISAs. This post will draw heavily on my understanding of the ARM architecture and operating systems.

Continue reading “ARM Virtualization Extensions — Introduction (Part 1)” »

Jul 16 2012
 

Raspberry Pi, Mele A1000, MK802, and …: the market is filling up with these low-priced geek toys. I personally see a lot of potential here. These “devicelets” can do to hardware what apps did to software. Some readers may remember that I posted a tutorial last year on creating a simple evaluation board out of an iPhone 3GS. Back then, Pandaboard was the only choice for an ARM computer on the market, and it was never available. Now there are so many vendors and sellers that it has become difficult to choose. This post is a concise summary of all the available choices I have come across so far.

Continue reading “Which little PC should I buy? Raspberry Pi? Mele A1000? or …” »

Jul 13 2012
 

After I downloaded iOS6 on my iPhone last week, the first icon I clicked on was Passbook, only to find that Apple had not put any example passes in there. Since Passbook was the primary reason I had downloaded iOS6, I dug into the API and learned how to create a pass myself. It was a great learning experience that I want to share with others. I also provide a shell script to automate the pass generation process and present iPass.pk, a user-friendly GUI-based service to create passes.

Continue reading “Generating passes for iOS6’s Passbook” »

Jun 30 2012
 

Yet another hiatus. Sorry, I was very busy with my job as a performance architect at Calxeda. Will try to be regular again. 

I have recently been interviewing people at Calxeda, my new employer. There are a few fundamental concepts I expect every engineer/CS major to understand, regardless of what position they are applying for. One of them is the difference between a channel’s throughput and its latency. It is surprising how many candidates get it wrong. I will not only try to explain the concepts of latency and throughput using a simple analogy, but also try to hypothesize why IMO most people get them confused.
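As a rough illustration of the distinction (not the analogy used in the full post), the time to deliver a message over a channel can be modeled as a fixed latency plus the message size divided by the throughput. The sketch below uses two hypothetical channels with made-up numbers; the function name and parameters are assumptions chosen only for this example.

```python
def transfer_time(size_bytes, latency_s, throughput_bytes_per_s):
    """Simple model: fixed latency plus size divided by throughput."""
    return latency_s + size_bytes / throughput_bytes_per_s


# Two hypothetical channels; the numbers are illustrative only.
low_latency = dict(latency_s=0.001, throughput_bytes_per_s=1_000_000)
high_throughput = dict(latency_s=0.100, throughput_bytes_per_s=100_000_000)

small_message = 1_000          # 1 KB
large_message = 1_000_000_000  # 1 GB

for name, channel in [("low latency", low_latency),
                      ("high throughput", high_throughput)]:
    t_small = transfer_time(small_message, **channel)
    t_large = transfer_time(large_message, **channel)
    print(f"{name}: small={t_small:.5f}s large={t_large:.3f}s")
```

Under this model the low-latency channel wins for the small message and the high-throughput channel wins for the large one, which is why the two metrics cannot be used interchangeably.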

Continue reading “Clarifying Throughput vs. Latency” »

Aug 24 2011
 

I typically do not share articles on this blog, but I found this white paper today which was very enlightening and doesn’t seem to have gotten the attention it deserves. The author has done an excellent job of explaining the shortcomings of GNU Make. I now question why I use Make :-)

 

Below is an excerpt and a link to the article. Since the original post doesn’t have space for comments, we can use this post for our discussion.

 

GNU make is a widely used tool for automating software builds. It is the de facto standard build tool on Unix. It is less popular among Windows developers, but even there it has spawned imitators such as Microsoft’s nmake.

Despite its popularity, make is a deeply flawed tool. Its reliability is suspect; its performance is poor, especially for large projects; and its makefile language is arcane and lacks basic language features that we take for granted in other programming languages.

Admittedly, make is not the only automated build tool. Many other tools have been built to address make’s limitations. Some of these tools are clearly better than make, but make’s popularity endures. The goal of this document is, very simply, to educate you about some of the issues with make—to increase awareness of these problems.

 

Read more

Aug 09 2011
 

Like other prediction mechanisms, branch predictors are better at predicting strongly biased branch outcomes.

This rule is well understood and commonly used, e.g., the Intel Itanium compiler assumes that prediction accuracy = MAX(percentage_taken, percentage_not_taken) when performing its profile-guided optimizations. Thus, to improve branch prediction, we must increase the number of biased branches while reducing the number of oscillating branches. This post shows a simple trick to do so.
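To make the MAX(percentage_taken, percentage_not_taken) estimate concrete, here is a minimal sketch that applies it to two made-up branch histories. It only illustrates the heuristic quoted above, not the trick from the full post; the function name and the sample data are assumptions.

```python
def estimated_accuracy(outcomes):
    """Apply the MAX(taken, not_taken) heuristic to a list of booleans.

    Each entry is True if the branch was taken on that execution.
    """
    fraction_taken = sum(outcomes) / len(outcomes)
    return max(fraction_taken, 1.0 - fraction_taken)


# Hypothetical branch histories, for illustration only.
biased = [True] * 90 + [False] * 10   # taken 90% of the time
oscillating = [True, False] * 50      # alternates on every execution

print(estimated_accuracy(biased))       # estimate for the biased branch
print(estimated_accuracy(oscillating))  # estimate for the oscillating branch
```

By this estimate the biased branch scores 0.9 while the oscillating branch scores only 0.5, which is the gap the rule above is pointing at.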

Continue reading “Quick Post: Software Trick to Improve Branch Prediction” »