The pairwise-beta model for pairwise voting

In a pairwise vote, voters are asked to repeatedly pick between pairs of options, selecting the one they favor of the two. The procedure then combines all these pairwise choices in order to establish a global ranking for all options. A pairwise vote is statistical in nature, it must infer preference data that voters have not explicitly stated in order to obtain a result.

This statistical property allows obtaining an approximate preference ordering over a large list of options without overwhelming the voter with too much work. For example, if each voter was asked to establish a preference over 50 items, they would be exhausted and participation would suffer.

The pairwise-beta is a simple bayesian method used to rank items in pairwise votes. It is based on the beta-binomial model [1], which is composed of a beta prior and a binomial likelihood. This model is very tractable: because the beta distribution is conjugate to the binomial, the posterior also has beta form and is easily obtained:

bb1

I will not present a formal justification of the pairwise-beta model for pairwise comparisons, rather I will present some intuitions that should convey how and why the model works.

The key bridge to interpret pairwise comparisons in terms of the beta-binomial model is to realise that the better-worse relation between options maps directly into the success/failure outcomes of a bernoulli trial. We can thus establish a correspondence[4] between pairwise comparisons and bernoulli trials:

  • each item i corresponds to a sequence of bernoulli trials, Bi
  • a comparison in which i wins corresponds to a success in Bi
  • a comparison in which i loses corresponds to a failure in Bi

The question we are trying to answer is

Given the proportion of comparisons in which i wins, what is the proportion of items that are globally better than i?

that reformulated in terms of our correspondences becomes

Given a sequence of bernoulli trials Bi, what is the proportion of successes/losses for i?

Which is a case of standard binomial proportion estimation[2]. As we noted before, the posterior of the beta binomial is also a beta distribution, given by

bb3

If we want a point estimate we just use the mean for this distribution which is

bb2

This gives us, for each item i, an estimation for the number of items that are better/worse than itself. This leads directly to a global ranking: the best ranked items will be those which are estimated to be better than most other items.

 In summary, the procedure is

  1.  For each item i, obtain the corresponding sequence of bernoulli trials Bi
  2. For each item i, calculate the posterior beta distribution mean given the data from 1)
  3. Create a global ranking based on the proportions for each item, as calculated in 2)

The pairwise-beta model is simple but not perfect. In particular it does not exploit information about the strength of the opposing item in a pairwise comparison. However, despite this drawback it performs well in practice. Please refer to [3] for details.

 


[1] http://www.cs.cmu.edu/~10701/lecture/technote2_betabinomial.pdf

[2] http://homepage.ntu.edu.tw/~ntucbsc/%A5%CD%AA%AB%C2%E5%BE%C7%B2%CE%ADp%BF%D4%B8%DF%A4@971008/[3]%20Chapter%208.pdf

[3] http://arxiv.org/pdf/1202.0500v2.pdf

[4] The correspondence is two to one, as each comparison yields two bernoulli trials

Error handling: return values and exceptions

My colleague edulix started a discussion on the golang list about the merits of go’s error handling. This got me thinking about the problem of error handling in general, and that no language seems to have has gotten it quite right. What follows is a quick braindump.

Two approaches

There are two prominent approaches to error handling in software engineering today, using return values and exceptions.

With return value error handling, errors are indicated through the values returned from functions, the caller has to write checks on these values to detect error conditions. Originally, error returns were encoded as falling outside the range of data corresponding to normal operation.

For example, when opening a file with fopen in C++, the normal return is a pointer to an object representing the file. But if an error occurs the function returns null (additional information about the error is available from a global variable, errno). Just as the caller of the function uses normal return values, said caller is responsible for dealing with errors present in the return.

Exceptions establish a separate channel to relay error information back from functions. When using exceptions, a function’s return only holds values that correspond to normal behavior. In this sense the values returned by functions are more immediately identified with the function’s purpose, details regarding what can and does go wrong are separated into the exceptional channel.

For example, the purpose of the Java’s parseInt function is to parse an integer. So the return value’s content is exclusively a consequence of this purpose: the integer parsed from the string. If something goes wrong when attempting to achieve this purpose it will be represented in the throwing of an exception which holds error details. Whereas with return values there is no need for additional machinery beyond that provided by normal function calls, exception requires an extra mechanism: try/catch blocks.

Exceptions’ claimed virtues

exc

Exceptions try to declutter code (http://mortoray.com/2012/03/08/the-necessity-of-exceptions/)

Exceptions aim to be an improvement over return values. Here are some possible shortcomings of the return value approach:

  • The code is cluttered with error checking that reduces its readability, intent is obscured.
  • Sometimes the scope at which an error occurs is not the best one to handle it. In these cases one has to manually propagate error’s backwards, using several returns until an adequate error handling scope is reached. This is cumbersome.
  • If one forgets to check for errors the program may continue to run in an inconsistent state, causing fatal errors down the line that may not be easy to understand.

Exceptions try to address these possible shortcomings:

  • With exceptions normal program flow is separated from error handling code, which lives in special catch blocks for that purpose. This declutters normal behaviour code, making intent clear and readable.
  • With exceptions errors programmers have the option of handling errors at the right scope, without having to manually return across several levels. When exceptions are not handled at the scope the error occurs, they are propagated automatically up the stack until a suitable handler is found.
  • If exceptions are not handled, the program crashes immediately instead of continuing in an inconsistent state. A stack trace shows the error, and the call stack leading to the error.

But it’s not all good

The story doesn’t end here of course, exceptions have their problems. Some say that they are worse than the solution they were trying to improve upon. This is the position that many proponents of Go’s error handling defend, stating that returns with multiple values (feature not present orignally in C, hence errno and the like) are not only sufficient but simply better than exceptions:

  • Because exceptions propagate up and can be handled elsewhere, programmers are tempted to ignore errors and “let someone else deal with it”. This lazy behaviour can lead to errors being handled too late or not at all.
  • It is in principle impossible to know whether a given function call can throw an exception, short of analyzing its entire forward call tree. As a consequence, the use of exceptions in a language is equivalent to hidden goto’s that can jump control back up the call stack at any function invocation. In contrast, when using return values error information is formally present in the function signature.

Whereas before we listed exception propagation as an improvement (allowing error handling at the right scope), here it is listed as a weakness: it can encourage bad programming practice. But it is the second criticism that I find more significant: exceptions are opaque and may lead to unpredictable behaviour. With such an unpredictable ingredient, the programmer is unable to ancipate and control the potential paths his/her code can take, as lurking under every function call is the potential for a jump in execution back up the call stack.

Of course, in real world practice, exceptions are not the randomness disaster that this description may suggest[1]. Properly documented functions do communicate to the programmer, to a reasonable approximation, what can and cannot happen exception-wise. And properly implemented functions do not whimsically throw exceptions at every opportunity, “magically” making your code jump to a random location.

Good intentions: checked exceptions

What about checked exceptions, you say? The rationale for checked exceptions in Java is precisely one that formally addreses the very problem we just described, because

With checked exceptions, function calls must, in their type signature, specify what exceptions can be thrown.

Checked exceptions are part of a function’s type and the compiler ensures that exceptions are either caught or declared to be thrown (a use of the type system that fits well with what I’ve advocated here). Isn’t this exactly what’s needed? The subtle problem is that the phrase “caught or declared to be thrown” does not mean the same as “correctly handled”. The use of checked exceptions encourages lazy behaviour from programmers that makes things even worse. When forced by the compiler to deal with checked exceptions, programmers bypass it by

  • Indiscriminately adding throws clauses, which just add noise to the code
  • Writing empty try/catch blocks that never get populated with error handling code, potentially resulting in silent failures

The second problem is especially dangerous, it can result in exceptions being swallowed and silenced until some larger error occurs down the line, an error which will be very hard to diagnose. If you recall, this is precisely the third drawback we listed for error handling via return values. Both problems arise, not due to the language feature itself, but out of imperfect programming.

The bottom line for a language feature is not what some ideal programmer can do with it, but the use it encourages in the real world. In this case the argument goes that checked exceptions (and as we saw above, error return values) make it harder to write correct error handling code; it requires additional discipline not to shoot oneself in the foot. Which is a shame, since the rationale behind checked exceptions is very appealing: documentation at the type level and compiler enforcement of proper error handling.

Quick recap. Exceptions aim to be an improvement over return values offering benefits mentioned above: code decluttering, error handling at the right scope via propagation and failing fast to avoid hidden errors. Of these three features, the second is questioned and undermined by the two points that proponents of return values make. Opacity in particular is a strong drawback. Checked exceptions aim to resolve this by formally including exception information in the function type and enforcing error handling through the compiler, but the industry consensus seems to be that they do more harm than good.

The way forward

It’s difficult to reach a general conclusion in favor of exceptions or return values, the matter is unclear. But let’s assume that exceptions are no good. Granted this assumption, it would not mean that the problems with return values magically went away. Is there some way, besides exceptions, to address these problems?

One approach I find promising achieves decluttering but within the confines of errors as return values, using a functional style. This technique exploits several programming language features: sum types, pattern matching and monads. But those are just details, what’s important is the end result, see this example in Rust:

The first two lines of the function are operations that can fail, they return a sum type that can represent either a success or failure. The desired behaviour is such that if any error occurs at these lines, then the function from_file should stop executing and must return that error.

Notice how the logic that achieves this is not cluttering the code, it is embedded in the try! macro, which uses pattern matching on the sum type to either return a value to be used in the next line (the file variable), or short circuit program flow back to the caller. If everything works the code eventually returns Ok. Let’s restate the main points

  • from_file has a sum type return that indicates that the function may fail.
  • the individual invocations within the function itself also return that type.
  • the try! macro extracts successful values from these invocations, or shorts circuit execution passing the error as the return of from_file.
  • the caller of the from_file function can proceed the same way or or deal with errors directly (via pattern matching)

Besides the internal machinery and language features in use, I hope it’s clear that this style of handling errors mimics some of the positive aspects of exceptions without using anything but return values. Here is another example which has a similar[2] intent, this time in Scala

This code shows a series of operations that can each fail: getting a row from a database, getting one of its columns, and then doing a lookup on a map. Finally, if any of the steps fail a default value is assigned. The equivalent code with standard null checking would be an ugly and repetitive series of nested if-blocks checking for null. As in the previous example, this error checking logic is not cluttering the code, but occurs behind the scenes thanks to the Option monad. Again, no exceptions, just return values.

Error handling is hard. After years of using exceptions it is still controversial whether they are a net gain or a step back. We cannot expect that a language feature will turn up and suddenly solve everything. Having said that, it seems to me that the functional style we have seen has something to offer in at least one of the areas where traditional return values fall short. Time will tell whether this approach is a net gain.


Notes

[1] If real world application of exceptions was in fact disastrous, software written with exceptions would just never work, and apparently it does; exceptions are widely in use in countless systems today.

So a reasonable position to take is that criticisms of exceptions are on the mark when pointing out that the exception mechanism is opaque and potentially unpredictable. This criticism does not mean that software written with exceptions is inherently flawed, but that exceptions make correct error handling hard. But then again, error handling is a hard problem to begin with. Does the opacity of exceptions null its advantages?

[2] In this case the problem is how to deal with failure in the form of null values.

[3] In Rust for example, the problem of swallowing errors is also addressed with #[must_use] , see http://doc.rust-lang.org/std/result/

Further reading

Exceptions

http://www.joelonsoftware.com/items/2003/10/13.html

http://www.lighterra.com/papers/exceptionsharmful/

http://mortoray.com/2012/03/08/the-necessity-of-exceptions/

Checked exceptions

http://www.artima.com/intv/handcuffs3.html

http://googletesting.blogspot.ru/2009/09/checked-exceptions-i-love-you-but-you.html

Rust

http://doc.rust-lang.org/std/io/index.html

http://rustbyexample.com/result/try.html

http://www.hoverbear.org/2014/08/12/Option-Monads-in-Rust/

Scala

http://danielwestheide.com/blog/2012/12/26/the-neophytes-guide-to-scala-part-6-error-handling-with-try.html

http://twitter.github.io/effectivescala/#Error handling-Handling exceptions

http://blog.protegra.com/2014/01/28/exploring-scala-options/

https://groups.google.com/forum/#!topic/scala-debate/K6wv6KphJK0%5B101-125-false%5D

Integer encoding of multiple-choice ballots (2)

In the last post we saw how simple arithmetic with the right choice of base can encode integer lists for multiple ballot choices in an elegant and compact way. A couple of points were mentioned in the notes section, one of them was

our scheme allows encoding repeated choices, which are usually not legal ballots. For example, one can encode the choice 1,1,1 but that would be an illegal vote under all the voting systems we mentioned as examples. The fact that legal ballots are a subset of all possible encodings means that the encoding is necessarily suboptimal with respect to those requirements. We will look at an alternative scheme in the next post.

The alternative scheme we show below attempts to optimize the encoding by using a variable base as digits are encoded. The main idea is that as chosen integers in the list are encoded, the remaining ones are constrained since they cannot be repeated. Thus the base can be reduced as less choices are possible. In practice the process is complicated by the need to keep track of what digits correspond to what, as gaps form in the remaining integers and the corresponding meaning of digits changes. Here is a python implementation

and a sample session using both encoders

Note the shorter value produced by the second encoder

encode: 14606467545964956303452810

encode2: 36697695360790800022

Despite the reduction, the second encoder is not optimal (the first encoder is optimal given repeatable choices); the range of output numbers is larger than that of legal ballots. It would be interesting to see how to obtain the most compact solution, a detailed analysis could compare these schemes systematically to get quantitive measures of space efficiency.


Integer encoding of multiple-choice ballots

Secure voting systems supporting privacy through encryption must encode ballot contents into integers before they can be encrypted[1]. This encoding step is mostly trivial. For example, imagine a yes-no-abstain ballot. One can simply apply the following mapping to yield integer plaintexts for corresponding ballots:

Yes => 1
No => 2
Abstain => 3

But things can get a bit more involved when dealing with multiple-selection ballots. These are ballots where the voter makes more than one choice. They can be either ranked ballots, where the voter specifies a preference relation over the selections, or unranked ballots where no such a preference exists. Examples of voting systems using the former are single transferable vote or instant runoff voting.  Systems like approval voting or plurality at large are examples of the second type, using unranked ballots.

Imagine we are using one of these systems to elect a candidate out of a field four: Alice, Bob, Charlie, and Donna. We first apply the trivial mapping:

Alice => 1
Bob => 2
Charlie => 3
Donna => 4

But how do we encode a complete ballot, for example, a ballot with (X corresponds to marked choices)

Alice X
Bob X
Charlie O
Donna O

Unlike the yes-no-abstain ballot above, the content of the ballot corresponds to a list of integers: 1 and 2. We could use the following mapping

encode: Alice, Bob => 1, 2 => 12

The ballot is encoded as the number 12, resulting from the concatenation of the string representations of each of the integers. But what if there are more candidates, and we need to encode something like:

encode: 11, 3 => 113

That won’t work, because 113 could represent either 11 and 3, or 1 and 13.

decode: 113 => ?

We can remedy this by adding some padding such that each choice is well separated:

encode: 11, 3 => “1103” => 1103

Then when decoding, we convert the integer to a string, split it every 2 characters, and finally obtain integers for each of the candidates:

decode: 1103 => “1103” => “11”, “03” => 11, 03

But there’s still a problem, what about this choice:

encode: 1, 13 => “0113” => 113

We run into trouble here, because the string “0113” corresponds to the integer 113; there is no mathematical difference between “0113” and “113”. To fix this, when decoding we can first check that the string length is a multiple of 2 (since we are using 2 chars per candidate integer), if it is not we prepend the required zeros. The encode-decode process would be

encode: 1, 13 => “0113” => 113
decode: 113 => “113” (prepend zero) => “0113”  => “01”, “13” => 1, 13

I hear you complain that all this concatenation, padding, and prepending looks a bit hackish, can we do better?

Let’s go back to our first example, when we simply wanted to do

encode: Alice, Bob => 1, 2 => 12

This looked very nice and simple. Can’t we do something like this in general, without any string hackery? The first step is to go back to the definition of decimal numbers.

Decimal numbers (cplusplus.com)

In these terms, the encoding 1, 2 => 12 corresponds to

(10^1) * 1 + (10^0) * 2 = 12

Here we have expressed the encoding of 1, 2 using arithmetic, no string operations involved. The ballot choices are interpreted as digits according to the mathematical definition of decimal numbers. (In fact, this is what goes on under the covers when you convert a string like “12” into the number 12.) This gives us a a purely arithmetical description of the simple mapping we started with. Things then got complicated when we considered the possiblity of more choices (candidates) in the ballot. Let’s apply our mapping to that problematic ballot:

encode: 11, 3 => (10^1) * 11 + (10^0) * 3 = 113

Our new procedure fails the same way: the simple scheme where each digit represents one choice cannot be accommodated by the decimal digit representation of choices, and the result 113 is ambiguous. But wait, who says we have to encode according to a decimal representation? What if we were to map choices to hexadecimal digits,:

encode: 11, 3 => (10^1) * B + (10^0) * 3 = B3

And we’ve restored simplicity and correctness to our scheme. B3 encodes the choice 11, 3 with one choice per digit and no ambiguity! If the B3 looks like cheating, just remember, B3 is a representation of a number that in decimal format turns out to be 179. The encode-decode process could just as well be written as

encode: 11, 3 => 179
decode: 179 => 11, 3

The bottom line is we can encode lists of integers into an integer provided we use the optimal base, which is equal to the number of possible choices in the ballot plus one.

Let’s revisit our original example, with Alice, Bob, Charlie and Donna. Since we have four candidates, our base is 4 + 1 = 5. The encoding is thus:

encode: Alice, Bob => 1, 2 => (10^1) * 1 + (10^0) * 2 = 12 (base 5) = 7 (decimal)

in short:

encode: 1, 2 => 7
decode: 7 => 1, 2

Note that not only is this method simpler with no string operations or padding, but the encoded values are smaller. Compare:

encode: 1, 2 => 12
encode: 11, 3 => 1103

with

encode: 1, 2 => 7
encode: 11, 3 => 179

Which should not come as a surprise, encoding with the specified base is the most compact[2] encoding possible (proof left as excercise for the reader). A larger base wastes space encoding ballot contents that are not possible, whereas a smaller base is insufficient to encode all possible ballots.

Finally, here is a python implementation of the encoder we have proposed

In the next post we will further discuss details as to the compactness of the encoding mentioned in [2].


[1] In the case of ElGamal encryption used in Agora Voting, the plaintext must be encoded into an element of the multiplicative subgroup G of order q of the ring Zp, where p and q are suitably chosen prime numbers. In order to do this, the plaintext must be first encoded into an integer, after which it is mapped to a member of G, and subsequently encrypted.

[2] A few caveats must be mentioned. First, we are using 1-based indices to represent choices, which means some values are unused. Second, our scheme allows encoding repeated choices, which are usually not legal ballots. For example, one can encode the choice 1,1,1 but that would be an illegal vote under all the voting systems we mentioned as examples. The fact that legal ballots are a subset of all possible encodings means that the encoding is necessarily suboptimal with respect to those requirements. We will look at an alternative scheme in the next post.

Voter fraud and bayesian inference – part 3

Here’s part1 and part2.

Welcome back. In the previous posts we saw how to do inference using the beta-binomial to get probabilities for the proportion of fake ballots in an election, as well as an upper bound on the probability that the election result is incorrect. We briefly mentioned the hypergeometric distribution but did discuss it further nor use it.

Like the binomial (and beta-binomial), the hypergeometric distrbution can be used to model the number of successes in a series of sampling events with a binary outcome. The distinction is that the binomial models sampling with replacement, whereas the hypergeometric models sampling without replacement. In other words, if we are sampling from a box, the binomial applies when the sample is returned to the box before drawing more samples. The hypergeometric applies when the sample is not returned. But wait, doesn’t that mean that we’ve been doing it wrong?

When auditing ballots we keep track of those already checked, a ballot is never audited twice. Shouldn’t we then be using the hypergeometric distribution? It turns out that the binomial distribution approaches the hypergeometric distribution in the limit of a large total number of items compared to the number sampled. This fits our case, as we can only audit a limited number of ballots compared to all those cast.

beta5

Hypergeometric for increasing values of N. The bottom right is the corresponding beta-binomial.

As we saw in the previous post, the beta distribution is a conjugate prior for the binomial, which makes inference very easy. Unfortunately this is no the case for the hypergeometric. But because of the converging behaviour seen above, we can stick to the beta-binomial’s easy calculations without sacrificing accuracy. For the sake of completeness we will quickly show the posterior for the hypergeometric, following [1]. Incidentally this “manual calculation” is what allowed us to obtain the images above, through the javascript implementation in the jsfiddle.

hyp

Again, this is just Bayes theorem with the hypergeometric likelihood function and a uniform prior. In [1] it is also pointed out that the normalization factor can be computed directly with this expression

hyp2

We use this in the jsfiddle implementation to normalize. Another thing to note is that the hypergeometric posterior is 0 at positions that are inconsistent with evidence. One cannot have less successes than have been observed, nor more than are possible given the evidence. These conditions are checked explicitly in the implementation. Finally, the jsfiddle does not contain an implementation for obtaining the upper bound on the probablity of election error, only code for the posterior is present. How about forking it and adding that yourself?

In these three posts we have used bayesian inference to calculate probabilities over proportion of fake ballots, and from there to calculate probabilities that an election result was incorrect. These probabilities could be used to achieve trust from stakeholders in that everything went well, or conversely to detect a possible fraud and invalidate an election.

I’ll finish the post by mentioning that the techniques we have seen here can be generalized beyond the special case of detecting fake ballots for plurality votes. For example, one could use bayesian inference to conduct ballot audits for the sake of checking tally correctness, not just from failures in authentication, but from errors in counting. See [2] for this kind of more general treatment


[1] http://www.wmbriggs.com/public/HGDAmstat4.pdf

[2] https://www.usenix.org/system/files/conference/evtwote12/rivest_bayes_rev_073112.pdf

In this work, because results of audits are not just binary, and because tallies are not only plurality, the authors use dirichlet distrbutions and sampling using posteriors to project possible alternative tallies.