*Yet another hiatus. Sorry, I was very busy with my job as a performance architect at Calxeda. Will try to be regular again.*

I have recently been interviewing people at Calxeda, my new employer. There are a few fundamental concepts I expect every engineer/CS major to understand, regardless of what position they are applying for. One of them is the difference between a channel’s throughput and its latency. It is surprising how many candidates get it wrong. I will not only try to explain the concepts of latency and throughput using a simple analogy, but also try to hypothesize why IMO most people get them confused.

Quick background: I was first asked to differentiate between latency and throughput when I was a college sophomore. It was during an internship interview at nVidia Austin in Spring 2002, more than a decade ago. Sure enough, I was not very clear about it myself and the interviewer had to explain it to me. I wish I remembered his name to give him credit since I am using the same analogy that he used.

When you go to buy a water pipe, there are two completely independent parameters that you look at: the diameter of the pipe and its length. The diameter determines the *throughput* of the pipe and the length determines the *latency*, i.e., the time it will take for a water droplet to travel across the pipe. The key point is that the length and diameter are independent, and so are the latency and throughput of a communication channel.

More formally, **throughput** is defined as the amount of water entering or leaving the pipe every second, and **latency** is the average time required for a droplet to travel from one end of the pipe to the other.

Let’s do some math:

For simplicity, assume that our pipe has a 4 inch x 4 inch square cross section and is 12 inches long. Now assume that each water droplet is a 0.1 in x 0.1 in x 0.1 in cube. Thus, one cross section of the pipe (a slice one droplet thick) fits 40 x 40 = 1,600 water droplets. Now assume that water droplets travel at a rate of 1 inch/second.

**Throughput**: Each set of 1,600 droplets moves into the pipe in 0.1 seconds. Thus, 10 sets move in 1 second, i.e., 16,000 droplets enter the pipe per second. Note that this is independent of the length of the pipe.

**Latency**: At one inch/second, it will take 12 seconds for a droplet to get from one end of the pipe to the other, regardless of the pipe’s diameter. Hence the latency will be 12 seconds.
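The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the pipe example: the function and constant names are made up for this sketch, and exact fractions are used so that 0.1 inch does not pick up floating-point rounding.

```python
from fractions import Fraction

# Numbers from the pipe example; names are hypothetical, chosen for this sketch.
PIPE_SIDE_IN = Fraction(4)         # square cross section, 4 in x 4 in
PIPE_LENGTH_IN = Fraction(12)      # pipe length in inches
DROPLET_SIDE_IN = Fraction(1, 10)  # cubic droplet, 0.1 in per side
SPEED_IN_PER_S = Fraction(1)       # droplets move at 1 inch/second


def droplets_per_layer(pipe_side, droplet_side):
    """Droplets that fit in one droplet-thick slice of the pipe."""
    per_row = pipe_side / droplet_side
    return per_row * per_row


def throughput(pipe_side, droplet_side, speed):
    """Droplets entering the pipe per second (no pipe length involved)."""
    layers_per_second = speed / droplet_side
    return droplets_per_layer(pipe_side, droplet_side) * layers_per_second


def latency(pipe_length, speed):
    """Seconds for one droplet to travel end to end (no diameter involved)."""
    return pipe_length / speed


print(throughput(PIPE_SIDE_IN, DROPLET_SIDE_IN, SPEED_IN_PER_S))      # 16000
print(latency(PIPE_LENGTH_IN, SPEED_IN_PER_S))                        # 12
# Doubling the length changes only latency; doubling the side changes only throughput.
print(latency(2 * PIPE_LENGTH_IN, SPEED_IN_PER_S))                    # 24
print(throughput(2 * PIPE_SIDE_IN, DROPLET_SIDE_IN, SPEED_IN_PER_S))  # 64000
```

Notice that `throughput` never takes the pipe length as an argument and `latency` never takes the pipe side, which is the independence claim in code form.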

**Queuing:** Note that droplets may be arriving at a rate faster than 16,000/second, say 16,100/second. Since the pipe cannot let more than 16,000 droplets enter each second, the extra 100 droplets will have to wait. Said another way, they are put in a queue where they will get to enter the next second. The time a droplet waits in the queue before entering the pipe is called the **queuing delay**.
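A rough sketch of how that backlog behaves, assuming (as a simplification not stated in the example) that arrivals and admissions happen in whole-second batches. The helper names `simulate_queue` and `drain_time` are hypothetical.

```python
def simulate_queue(arrival_rate, capacity, seconds):
    """Return the number of droplets left waiting at the end of each second.

    Simplifying assumption: each second, all arrivals show up at once and the
    pipe then admits at most `capacity` droplets.
    """
    backlog = 0
    history = []
    for _ in range(seconds):
        backlog += arrival_rate
        admitted = min(backlog, capacity)
        backlog -= admitted
        history.append(backlog)
    return history


def drain_time(backlog, capacity):
    """Seconds the pipe needs to admit `backlog` waiting droplets at full rate."""
    return backlog / capacity


history = simulate_queue(arrival_rate=16_100, capacity=16_000, seconds=5)
print(history)  # [100, 200, 300, 400, 500]

# If the overload persists, the queue (and the wait) keeps growing.
for waiting in history:
    print(waiting, "droplets waiting, about", drain_time(waiting, 16_000), "s to admit them")
```

Under these assumptions the queuing delay depends on the arrival rate and the pipe's throughput, while the 12-second transit time never appears in the calculation.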

**Possible source of confusion:** Queuing delay is related to throughput, NOT latency. Many scientists/engineers confuse queuing delay with latency and conclude that latency is related to throughput, which is wrong. Latency itself is independent of throughput; it’s the queuing delay which isn’t.

This and many other cool concepts about queuing are explained by Queuing theory.

one cross section can fit 16,000 droplets, not 1600

(4* 4) / (0.1 * 0.1 * 0.1) = 16,000

I think the cross section is supposed to be as long as one water droplet. So one cross section has a volume of 4 x 4 x 0.1 cubic inches.

Therefore, one cross section can fit (4 x 4 x 0.1) / (0.1 x 0.1 x 0.1) = 1,600 droplets.

When dealing with cross sections, you’re working in 2D.

So the formula is:

Number of water droplets going through cross-section

= Area(cross-section) / Area(water droplet)

= (4 in. x 4 in.) / (0.1 in. x 0.1 in.)

= 16 sq.in / 0.01 sq.in

= 1600
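As a quick cross-check of the two counting methods in this thread, here is a small Python sketch (variable names are made up for illustration). Counting by area and counting by the volume of a droplet-thick slice give the same answer; dividing the 16 sq.in area by a droplet's volume mixes units and is where the 16,000 figure comes from.

```python
from fractions import Fraction

side = Fraction(4)         # pipe side in inches
droplet = Fraction(1, 10)  # droplet side in inches

# 2D view: cross-section area divided by the droplet's face area.
by_area = (side * side) / (droplet * droplet)

# 3D view: volume of a one-droplet-thick slice divided by the droplet's volume.
by_slice_volume = (side * side * droplet) / (droplet ** 3)

print(by_area, by_slice_volume)  # 1600 1600
```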

Hi,

In our day to day lives either latency or message size is important but not both. Basically, either you’re using the mail system or you’re talking on the phone (can you name a non-computer situation where both have to be balanced?). Given that networks are rated on their throughput it’s assumed that there’s no latency. It requires a bit of a kick to get people out of this mode of thinking; that both latency and throughput can be important for the same system.

- Andrew

Hi Andrew, call from your past! Sorry, I missed this comment entirely. Well said. Latency is becoming very important now IMO, with Amazon, Google and many others saying that latency == revenue. Google said that a 500ms delay costs them 25% of searches. Amazon said that a 100ms delay costs them 1% in revenue. I think the mindset will get fixed with time. Money is after all the only thing that matters ;)

Aater

My example is usually not water pipes,

It is car engines, and I compare throughput = horsepower and latency = torque, but that assumes that the person understands car engines...

Jacob, Interesting suggestion. Will add that to my set of analogies as well, with your permission of course.

Hi,

Very easy to understand the throughput and latency.

I guess that the confusion stems from another source. It seems that most people imagine networks, applications and so on as a single production line manufacturing cars, for example. In that case latency is the inverse of throughput. Assume one car comes off the line every 4 hours. That is the latency. You get 1/4 of a car per hour, or 6 cars per 24 hours. That is the throughput. The difference appears when we add parallelism: a second production line.

Hi Rorik. You are correct about that relationship between latency and throughput. It even applies with parallelism. The relationship is defined by Little’s Law. That might be an interesting topic for a blog post. I’ll talk to Aater and create it if I get the time.

I think throughput is not related to latency only if you have infinite number of workers in the channel. In your example, the pipe has 4 x 4 x 10 x 10 x 10 x 12 workers moving the droplets from one end to the other. Let’s say if the pipe has only one worker (i.e. it can only move one droplet at a time), then latency would become a major parameter in determining throughput. Think about it…
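Little's Law, mentioned above, ties these comments together: the average number of items in flight equals throughput times latency. A minimal Python sketch, with hypothetical function names, applied to the numbers already used in this thread:

```python
def items_in_flight(throughput_per_unit_time, latency):
    """Little's Law: average items in the system = throughput x latency."""
    return throughput_per_unit_time * latency


def throughput_from_concurrency(concurrency, latency):
    """Little's Law rearranged: throughput = items in flight / latency."""
    return concurrency / latency


# The pipe: 16,000 droplets/second and 12 seconds of latency.
print(items_in_flight(16_000, 12))  # 192000
print(4 * 4 * 10 * 10 * 10 * 12)    # 192000, the "workers" count from the comment

# A single production line: one car in flight, 4 hours per car.
print(throughput_from_concurrency(1, 4))  # 0.25 cars per hour

# A second line doubles throughput while each car still takes 4 hours.
print(throughput_from_concurrency(2, 4))  # 0.5 cars per hour
```

Read this way, latency fixes throughput only when the amount of work in flight is also fixed; the pipe's diameter is what sets how much can be in flight.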

Although out of the scope of this particular blog entry, will you please point out some good reference or share your thoughts on the following concepts?

1) Effect of memory bank conflicts on throughput in parallel computing on shared memory systems.

2) Are memory channels, bank-level parallelism and throughput directly related?

3) What is the maximum/ideal speed-up for a memory-intensive operation on a given shared memory system?

[...] I’ve been doing a lot of analysis of latency and throughput recently as a part of benchmarking work on databases. I thought I’d share some insights on how the two are related. For an overview of what these terms mean, check out Aater’s post describing the differences between them here. [...]

So the performance monitor counter for the “queue” that you refer to is avg disc queue length?

Viewing this analogy from my fluid mechanics instruction in mechanical engineering, this is a highly simplified model of flow through a pipe. It might be interesting, though I don’t know how useful to the IT world, to further compare some of the concepts of physical fluid flow to data flow concepts such as latency and throughput. On the physical side, there are factors such as the pressure of the fluid supply, the length of the pipe, changes in elevation of the pipe, and the resistance to flow in the pipe due to the roughness of the inside wall of the pipe, the shape of the cross-section of the pipe, obstacles and changes in the pipe geometry between inlet and outlet, the velocity of the fluid, the viscosity of the fluid (which is related to the density and the internal resistance of the fluid), and the resulting characteristics of the flowing fluid (laminar/smooth vs. turbulent/chaotic vs. a combination of the two). Queuing can occur while the fluid is waiting to enter the pipe.

For an engine, horsepower is related to the speed or rate at which torque is produced. For a given torque, the horsepower increases as the engine speed (rpm) increases. One way to express power is that it is torque per unit time. The unit of latency or delay is time (or clock cycles). The units of throughput are basically bits per unit time. I don’t agree that horsepower and torque are good analogies for throughput and latency. Yes, time matters to both throughput and latency, but the units of latency are not bits, and thus throughput cannot be latency per unit time.

[...] Little’s Law: An insight on the relation between latency and throughput [...]

[...] and Throughput means “how many can I issue to the pipeline in one cycle.” One very good analogy uses pipes transporting water to illustrate the [...]

Thanks for the useful explanation. “There are a few fundamental concepts I expect every engineer/CS major to understand, regardless of what position they are applying for. One of them is the difference between a channel’s throughput and its latency.” I disagree. As a CS major you don’t need to know that and I never came across it myself before I researched it out of private interest, when digging into *hardware* and RTOS.

Value judgements and expectations what one should know are annoying at best and neither objective, interesting, nor useful.

Sorry, I had to note that, because what is expected from you is always “obvious and mandatory” if you have a given perspective. Honestly speaking, there is so much to know it is simply illusory to know it all, and these arrogant expectations are an illness. Thanks.

@someone, you realize why what you say is ironic right? you’re doing the same thing he is. you’re passing a value judgement and expectation on how he should know what he’s doing is annoying.

[...] Aater Suleman, Clarifying Throughput vs. Latency. page link [...]

Consider a situation where throughput is not enough and queuing is occurring.

If I add throughput, my queuing delay is reduced and my total latency is reduced.

It was confusing to me because it looked like latency depends on throughput: increasing throughput reduces latency.
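One way to untangle this is to split the time a droplet experiences into two parts: the queuing delay before entering the pipe and the transit latency inside it. The sketch below is illustrative only; the backlog of 32,000 droplets is an assumed number, and the function name is hypothetical.

```python
def time_experienced(backlog, capacity_per_s, transit_latency_s):
    """Return (queuing_delay, transit_latency, total) in seconds.

    Assumes the droplet joins behind `backlog` waiting droplets and the pipe
    admits `capacity_per_s` droplets every second.
    """
    queuing_delay = backlog / capacity_per_s
    return queuing_delay, transit_latency_s, queuing_delay + transit_latency_s


# Original pipe: 16,000 droplets/second, 12-second transit.
print(time_experienced(32_000, 16_000, 12))  # (2.0, 12, 14.0)

# Wider pipe with double the throughput: the wait shrinks, the transit does not.
print(time_experienced(32_000, 32_000, 12))  # (1.0, 12, 13.0)
```

Adding throughput shortens the total only through the queuing term; the 12 seconds spent in the pipe stays the same.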

I don’t agree that latency is independent of throughput.

For example, let’s compare Car and Bus on transporting passengers from A to B, the distance between A and B is 10 miles.

Car: speed = 60 miles/hour, capacity = 5

Bus: speed = 20 miles/hour, capacity = 60

Latency of traveling 10 miles:

Car = 10mins, Bus = 30mins

Throughput of transporting passengers from A to B:

Car = speed * 1 hour / (10 miles * 2) * 5 = 15 people / hour

Bus = speed * 1 hour / (10 miles * 2) * 60 = 60 people / hour

Note: 10 miles * 2 means the distance of round trip. The vehicle has to travel back to pick new passengers.

Obviously speed and throughput are related in this case.
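The numbers in this example can be reproduced with a short Python sketch. The function names and the `vehicles` parameter are additions for illustration, and the model assumes no loading time and identical vehicles.

```python
def one_way_latency_min(distance_miles, speed_mph):
    """Minutes for one trip from A to B."""
    return distance_miles * 60 / speed_mph


def passengers_per_hour(distance_miles, speed_mph, capacity, vehicles=1):
    """People delivered per hour when each vehicle must drive back empty."""
    round_trips_per_hour = speed_mph / (2 * distance_miles)
    return round_trips_per_hour * capacity * vehicles


print(one_way_latency_min(10, 60), passengers_per_hour(10, 60, 5))   # 10.0 15.0
print(one_way_latency_min(10, 20), passengers_per_hour(10, 20, 60))  # 30.0 60.0

# Four cars in parallel match the bus on throughput; the trip is still 10 minutes.
print(one_way_latency_min(10, 60), passengers_per_hour(10, 60, 5, vehicles=4))  # 10.0 60.0
```

With a single shuttling vehicle, trip time does bound throughput, which is the one-worker case raised earlier in the thread. Once more vehicles run in parallel, throughput rises while the latency of each trip stays put.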

Very good info. Lucky me I came across your website by chance (stumbleupon).

I’ve saved it for later!

Not just informative, but VERY lucid also. Thanks A LOT!

Question 1: For sake of completeness, could you also derive the value for the Queuing Delay in your water-drop example? If 100 drops per second must wait (for every 16000 that are able to pass through) per second, what would be the queuing delay: (100 drops) / (16000 drops per second) = 0.00625 seconds ??

Question 2: Could you also recommend a good book/online resource for Queuing Theory, that’s less mathematical and more intuitive or applications-oriented, just like your example above is?