IO, throughput and blocking: a simple model

I’ve recently come across some comments and questions that discuss the idea of non-blocking IO and how it works, which led me to this video,

described as

…here is a short video that tries to explain the advantage of non-blocking javascript. It’s oversimplifying but it should illustrate the basic concept.

A simple model

Continuing with the oversimplification, here I’ll present a model that describes throughput for a blocking and a non-blocking IO process where messages have to be sent across a communication channel. This channel is characterized by two parameters, bandwidth and latency, which together with a third parameter, message size, complete the model. Thus

m = message size — the size of the messages to be sent

b = bandwidth — the channel’s capacity, data that can be sent per unit time

l = latency — round trip time for a ping message

and what we want to identify is the throughput, as a function of these parameters

t = throughput — messages processed per unit time = f(m, b, l)

Throughput is defined as the number of message replies arriving at the source per unit time, which is equivalent to the rate of messages arriving at the target, under the assumption that message replies are not constrained by bandwidth. This throughput is what defines the performance of the IO process. We’ll use arbitrary units for the quantities, but you can think of message size, bandwidth and latency in the usual units (e.g. Kb, Kb/s, ms) if that makes it clearer.

First, the non-blocking throughput is simply the bandwidth divided by the message size

tn = b / m

this results from the assumption that non-blocking message sending will achieve 100% bandwidth utilization; the number of messages per unit time is given directly by the bandwidth.
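
For example, with the values used in the plots below, a message size of 5 and a bandwidth of 100, we get

tn = 100 / 5 = 20 messages per unit time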

Blocking

With blocking, each message will only be sent once a reply has been received for the previous one; previous messages block subsequent ones. As a result, the number of message replies arriving at the source per unit time is constrained not by bandwidth alone, but by bandwidth and latency together. The net effect is to add a time overhead to each message’s total processing time. Since this overhead occurs per message, we first calculate the transmission time for one message using the available bandwidth, which is

transmission time = message size / bandwidth

the overhead is half the latency, since that is the time it takes for a message reply to arrive, triggering the next message send. So the total time per message is

total time = transmission time + return delay = m / b + l / 2

because this is the total time required for each message send, the throughput is the reciprocal

tb = 1 / (m / b + l / 2)
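
With the same example values plus a latency of 1 (m = 5, b = 100, l = 1) this gives

tb = 1 / (5 / 100 + 1 / 2) = 1 / 0.55 ≈ 1.8 messages per unit time

an order of magnitude below the non-blocking figure of 20.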

This is the function we were looking for. It also shows that tb ≤ tn for all values. And as a sanity check, if the latency is zero the above equation reduces to the non-blocking case:
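
tb = 1 / (m / b + 0 / 2) = b / m = tn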

Note that this throughput corresponds both to message replies and to messages arriving at the destination; the important point is that the latency blocks further sends, besides adding time to the roundtrip reply. Finally, if delays corresponding to message processing were variable, the term l / 2 would have to be substituted with the average processing time plus half the latency to yield an average throughput.
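
Writing p for that average processing time (a symbol introduced just for this remark), the blocking throughput would become

tb = 1 / (m / b + p + l / 2)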

Plotting f

Let’s plot f using some example values to see what happens. We will use the following

message size = 5

bandwidth = 1-100

latency = 0.01-1.0

To compare performance with and without blocking, we’ll define a quantity

r = ratio of blocking throughput to non-blocking throughput = tb / tn
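
Dividing the two expressions and simplifying gives

r = (m / b) / (m / b + l / 2) = 1 / (1 + (b · l) / (2 · m))

which already hints at what the plots will show: r approaches 1 whenever the bandwidth-latency product is small relative to the message size.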

We will use Octave to generate 3D plots for this data, using this simple script

% usage: draw(5) reproduces the plots below
function draw(message)
  b = [1:100];                          % bandwidth range
  l = [0.01:0.01:1];                    % latency range
  [x, y] = meshgrid(b, l);
  tn = x / message;                     % non-blocking throughput, b / m
  tb = 1 ./ ((message ./ x) + y / 2);   % blocking throughput, 1 / (m/b + l/2)
  r = tb ./ tn;                         % ratio of blocking to non-blocking
  figure(1);
  mesh(x, y, tn);
  figure(2);
  mesh(x, y, tb);
  figure(3);
  mesh(x, y, r);
endfunction

which generates three plots, one for non-blocking throughput, one for blocking throughput, and one for the ratio. The vertical axis shows the throughput, with the two horizontal axes corresponding to bandwidth (left) and latency (right). Message size was fixed at 5 units.

Non-blocking throughput (tn)

Nothing special going on here: throughput scales linearly with bandwidth. Because there is no blocking, latency has no effect and the surface is a plane.

Blocking throughput (tb)

The interesting thing to note here is how latency (bottom right axis) controls the degree to which throughput scales with bandwidth (bottom left axis). With high latency, throughput barely scales: the curve rises but is almost flat. At the other extreme, with zero latency, the scaling is the same as in the previous graph: a straight line with the same gradient, rising up to the red region. The transition from no scaling to scaling occurs in the middle of the graph, where the shape of the surface changes. Overall, throughput is of course reduced in comparison to the non-blocking case.

Ratio (r = tb / tn)

A ratio of 1.0, equal throughput, occurs with zero latency: the peak at the top right that continues along the ridge at the top of the graph. But what is interesting is the other ridge, the one extending towards the bottom. The trend we can see from the blue area towards the latency axis is that lower bandwidth produces ratios closer to 1.0, that is, equal performance. After a moment’s thought this makes sense: the scarcity of bandwidth negates the effects of blocking. In other words, blocking still manages to utilize 100% of the bandwidth when bandwidth is scarce.

Javascript and non-blocking IO

The last point I’d like to make here is that the association some people make between javascript and non-blocking IO is wrong. Javascript as a language says nothing about blocking or non-blocking IO. What javascript as a language does have is anonymous functions (or lambdas, or whatever you want to call them). Anonymous functions as first class citizens make it easy to define callbacks, and it is through these callbacks that non-blocking IO can occur. But in this day and age, not having anonymous functions is a stone age thing rather than the norm, so most modern languages should be able to do non-blocking IO with ease.
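
To illustrate, here is a minimal sketch of the same callback style in Scala; the file path and the use of scala.concurrent.Future are my own example, and a Future runs the read on a pool thread rather than doing OS-level non-blocking IO, but the callback shape is the point: the caller hands over an anonymous function and is free to continue while the read is in flight.

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.io.Source
import scala.util.{Failure, Success}

object CallbackSketch extends App {
  // start the read off the calling thread
  val contents: Future[String] = Future {
    Source.fromFile("/etc/hosts").mkString
  }

  // an anonymous function registered as the completion callback
  contents.onComplete {
    case Success(text) => println(s"read ${text.length} chars")
    case Failure(err)  => println(s"read failed: $err")
  }

  println("caller is free to do other work")

  // wait only so the program does not exit before the callback fires
  Await.ready(contents, 5.seconds)
}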

Type classes as a decoupling mechanism

Say you have a class T and a method m, where m represents an operation that is applicable to many types, including T. In good OO practice, the only information m needs about the types it operates on is precisely that part of the types’ nature that makes m applicable to them. So you create an abstraction representing that nature, and define m in terms of it. In turn, classes like T conform to that abstraction.

If the abstraction is an interface, we implement that interface. If the abstraction is a class, we inherit from that class. Both of these mechanisms work; they are the standard solution to the problem. But even though the use of an abstraction reduces the coupling between T and m, it still exists: if you want an existing class to be usable with m, you have to modify its type and its implementation. That class is now aware of m.

The abstraction that m requires modifies T

Type classes are an alternative abstraction, used to define m without altering the types that m operates on. Instead of specifying that T must have some type, the type class mechanism merely states that there must exist some code somewhere that m can rely on in order to operate on T. This code, which does not necessarily live at T, is the abstraction that m depends on, but that T need not be aware of.

The type class decouples T from m

The classic example is sorting objects, where a sorting method requires some way to compare the objects it must sort. Using the standard solution, you would create an abstraction like Comparable, and make m require that the objects it sorts have that type. In Java we have

public static <T extends Comparable<? super T>> void sort(List<T> list)

In the type class approach, however, the abstraction is separate from the type of the objects to be sorted. Instead, it requires that some code, a Comparator, be accessible to the sorting method. In Java,

public static <T> void sort(List<T> list, Comparator<? super T> c)

or in a Scala example, where the type class is Ordering

def sorted[B >: A](implicit ord: Ordering[B]): List[A]

I’ve shown examples of the type class mechanism for Java and Scala[1]. What makes type classes even more effective in Scala is that the Ordering[B] parameter need not be passed explicitly. This remedies the two problems the Java type class solution has. First, users of the sort operation should not have to worry about passing the Comparator; it is an implementation detail. Second, in Scala it is possible to vary the Ordering implementation to be used just by bringing a different Ordering object into scope, without having to modify any of the call sites.
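
As a small sketch of that last point (the Employee class and the bySalary instance are made up for illustration):

// a legacy class we cannot, or do not want to, modify
case class Employee(name: String, salary: Int)

object Instances {
  // the abstraction the sort depends on lives outside Employee
  implicit val bySalary: Ordering[Employee] = Ordering.by(_.salary)
}

object Demo extends App {
  import Instances.bySalary // bring the type class instance into scope

  val staff = List(Employee("ann", 50), Employee("bob", 40))
  println(staff.sorted) // sorted by salary; Employee itself never changed
}

Swapping in a different order, say Ordering.by(_.name), requires only importing a different instance at the call site.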


[1] It could be said that the type class approach by definition requires transparent passing of type class instances, so the Java code would not really qualify as an example.

[2] http://ropas.snu.ac.kr/~bruno/papers/TypeClasses.pdf

I’ve inherited 200K lines of spaghetti code—what now?

Here is an article on arstechnica about what to do when you inherit a gigantic code mess. Nothing new here, but still nice to get all these ideas into one place, and many of them are applicable in general, not just when something nasty’s been dumped on you.

Other related topics to think about are how and why code rots into spaghetti, and, more ambitiously, how to characterize programming in terms of engineering, science and craftsmanship.

Continuous improvement and TDD

It’s an old message, from a 2009 talk. But it’s good to recite the arguments every now and then, especially with such a great talk. And of course there’s the bonus of the other material, which is actually the focus of the talk: the history of Smalltalk and how it “died”, according to Martin, due to messy code, parochialism/arrogance, and an inability to deal with real world or enterprise requirements.

But back to the reason I’m linking this video: continuous improvement. Code tends to be messy, and needs constant correction, continuous improvement, so as not to collapse into an unmanageable, unmaintainable, unintelligible ball of yarn. So, code tends to be messy. What does this mean, and why is this the case?

Well, software is complex relative to our ability to write it, so we are unlikely to get code right the first time. Secondly, code is written before all the information is available. This compounds the first point, and makes it even less likely that we get things right initially.

But not only that: the fact that information is unavailable, or in fact changes as the code is written, means that the code must evolve to hit a moving target. Changes to code again lead to messy code, but for a different reason than the one above: it is a matter of coherence, not of correctness.

So in this simple model we have two main driving forces that point to messy code: the intrinsic difficulty of getting things right, and incoherence due to changing requirements. In theory, these are not insurmountable. “All” it requires is constant improvement and correction. However, constant improvement demands a lot of discipline and mental effort. So constant improvement is not carried out, and code gets messy.

The main point of test driven development is to lower the mental effort required for continuous improvement by addressing its biggest obstacle, the fear of breaking things. So the logic goes: reduce the fear of breaking things -> reduce the mental effort of continuous improvement -> reduce the level of mess.

As I said, it’s an old message, nothing new here, but it’s worth remembering every now and then.