Liquid democracy and spectral theory

In this post we will show some practical examples of how liquid democracy can be understood in terms of, and make use of results from spectral graph theory. For more background please see [Vigna2009]. Wikipedia says:

In mathematics, spectral graph theory is the study of properties of a graph in relationship to the characteristic polynomial, eigenvalues, and eigenvectors of matrices associated to the graph, such as its adjacency matrix or Laplacian matrix.

What does this have to do with liquid democracy? To answer this, let’s remember what defines liquid democracy is: system of transitive proxy voting. Proxy in that I can delegate my vote to some entity. Transitive because that entity can itself delegate the vote, and so on. Imagine a simple case with three voters, Alice, Bob, and Charlie. Alice delegates to Bob, and Bob delegates to Charlie. It is very natural to represent this as a graph, this is what we call the delegation graph

graph_decycled

Alice chooses Bob, Bob chooses Charlie

Assuming each voter starts with one vote, this would give us the following weights for each voter:

Alice = 1, Bob = 2, Charlie = 3

Because Alice and Bob have delegated their votes, Charlie will end up casting three votes, one for himself, one for Bob and one for Alice. What determines these weights is the structure of the graph; who is voting for who is encoded in the connections between vertices. In graph theory, the object that encodes this information is the adjacency matrix. Here’s the adjacency matrix for our example:

M

Where each row shows who each voter delegated to. Alice (A) delegated to Bob (B), hence the 1 in the second position of the first row. Similarly, Bob (B) delegated to Charlie, (C) as can be seen in the second row. Because Charlie did not delegate, the third row is all zeroes.

We can express the liquid tally above with these equations (1’s represent voter’s initial weights)

0*A + 0*B + 0*C + 1 = A

1*A + 0*B + 0*C + 1 = B

0*A + 1*B + 0*C  + 1 = C

Note how the 0’s and 1’s above correspond to the columns of the adjacency matrix. The above can be represented[1] in matrix form:

(A B C) * AdjacencyMatrix = (A B C)

This is an eigenvalue equation, whose eigenvector (the (A B C) row vector) corresponds to the result of the liquid tally. Note how the equation is recursive, which fits the recursive nature of transitive delegation. Vote weights are calculated in terms of vote weights themselves, the same way each delegate transmits votes that result from previous delegations.

When the adjacency matrix is used to evaluate the importance or influence of nodes in a graph in this way we are speaking of eigenvector centrality. Our example shows that calculating centrality is basically equivalent to tallying in liquid democracy. This is what makes the connection between spectral theory and liquid democracy.

Eigenvector centrality, Katz and Pagerank

Yes, that’s PageRank as in google, in case you thought this whole talk of centrality was a bit abstract, it’s what made google what it is today. Eigenvector centrality, Katz centrality, and PageRank are related methods to measure the importance of a node in a network. We won’t go into the differences between each measure, besides noting that both Katz and PageRank include an attenuation factor that decreases contributions from distant nodes, whereas eigenvector centrality does not.

In order to run some examples we will use the networkX library, which includes several functions for calculating centrality as well as many other useful features. If you want to play along with this code you will need to install the library as well as its dependencies, including matplotlib and numpy. If you’re on windows I suggest downloading the comprehensive WinPython package which includes everything you need here.

Let’s first create the graph corresponding to our example with Alice, Bob and Charlie. Here’s the code that does that

This generated the image shown earlier. Now to calculate the tally, we will run both Katz and PageRank.

which gives us

Both results match the tally we showed before. A couple of minor points above. First, the PageRank result was rescaled to make it match Katz. Second, the adjacency matrix for Katz was reversed as the networkx 1.8.1 Katz implementation is using a right eigenvector (this has been changed to left eigenvector in master).

More importantly, the alpha parameter is a damping factor. In the language of liquid democracy it modulates just how transitive delegation is by reducing contributions the further away the originate. For example, let’s change the above to alpha = 0.5:

Now Charlie receives 25% of Alice’s vote and 50% of Bob’s vote. So alpha quantifies the fraction of the vote that is effectively delegated. We can interpret then that a liquid democracy tally is a special case of Katz centrality and PageRank. In fact, liquid democracy is the limiting case of Katz and PageRank when alpha = 1.0, ie no damping (which is why you get viscous democracy in [Boldi2011]).

What about cycles?

One of the first things you have to deal with if you’ve implemented a liquid tallying algorithm is the possibility of cycles in the delegation graph, otherwise the procedure will blow up. Having detected a cycle at tally time the standard treatment is to consider votes that enter it as lost. In order to prevent that undesirable situation you can do cycle detection at vote time to warn the user that his/her vote may produce such a cycle.

What happens if we add a cycle to our example? Let’s try it

graph

and we get

The reason this happens has to do with the details of the algorithm that calculates eigenvectors; in particular the relationship between its convergence and the attenuation factor alpha[2]. The short story is this: using an attenuation factor of 1.0 on a graph with cycles may cause problems.

Just as liquid tally algorithms have to deal with cycles, so do we in order to make centrality work correctly. Fortunately there are fast algorithms to detect cycles in graphs. NetworkX offers an implementaton of an improved version of Tarjan’s strongly connected components algorithm, we will use it to define a function that removes cycles in a graph

Using this function we can obtain liquid tallies for any delegation graph correctly, using either Katz or PageRank. See the bottom of this post for the full python script demonstrating this.

Liquid democracy and degree (or branching factor)

Before we said that liquid democracy is the limiting case of Katz centrality and PageRank when alpha = 1.0. In the last section we saw another requirement besides that of alpha = 1.0: that the delegation graph must be acyclic, in other words a DAG. There is one more property that we can consider, degree.

A node’s degree is the number of (in our case, outward) connections with other nodes. In terms of delegation, it is the number of delegates that a voter chooses. Standard liquid democracy uses degree = 1, but such a requirement could in theory be relaxed. How does this fit in with Katz and PageRank? Lets construct a graph where voters may choose one or two delegates.

which gives

graph

resulting values

We see how Katz centrality does not yield a correct tally as it is not dividing outgoing weights for voters who split their delegation among two delegates, instead we get inflated weights. But the PageRank result does work, Bob’s two votes are split correctly, and the delegation proceeds normally from then on.

In summary

  • Liquid democracy is a special case of Katz centrality given
    • a damping factor alpha = 1.0
    • a directed acyclic graph of degree d = 1
  • Liquid democracy is a special case of PageRank given
    • a damping factor alpha = 1.0
    • a directed acyclic graph of degree d >= 1

That’s it for our quick tour of the relationship between liquid democracy and spectral theory. We have also seen how liquid democracy could be extended to include damping (as in [Boldi2011]), or to allow “multi delegation”.


Notes/References

[Vigna2009] Sebastiano Vigna – Spectral Ranking  http://arxiv.org/abs/0912.0238

[Page1998] The PageRank Citation Ranking: Bringing Order to the Web http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf

[Boldi2011] Viscous Democracy for Social Networks http://chato.cl/papers/boldi_bonchi_castillo_vigna_2011_viscous_democracy_social_networks.pdf

[1] For simplicity, I have ignored the initial weights associated with each voter in the matrix equation. These initial weights are what makes liquid tallying equivalent to undamped Katz centrality rather than eigenvector centrality.

[2] For details as to alpha and PageRank See http://vigna.di.unimi.it/ftp/papers/PageRankFunctional.pdf section 5 Limit behaviour.

In the case of the networkX implementation of Katz centrality an alpha of 1.0 is guaranteed to converge as all eigenvalues of an acyclic graph are 0 (see http://www.emis.de/journals/JIS/VOL7/Sloane/sloane15.pdf and http://networkx.github.io/documentation/latest/reference/generated/networkx.algorithms.centrality.katz_centrality.html#networkx.algorithms.centrality.katz_centrality)

Python Script: Random liquid Tallying with Katz/PageRank

Open proposal elaboration

(image from cartoonmovement.com)

Democracy is not just about voting, but about the wider problem of making decisions in a community. Good systems for making decisions are more likely to yield decisions that reflect the community’s preferences and exploit its collective knowledge. A decision making system, therefore, takes as input preferences and knowledge and produces as output a decision. Such a decision making system is effectively an information aggregator; high quality decisions are the result of aggregating as much information as possible: no preferences are ignored, and no valuable knowledge is left unused.

This brief description brings us to the distinction between voting and the general case of  decision making. In traditional voting, voters are asked to cast a vote which represents a choice between a predefined set of already existing alternatives. The alternatives presented to the voter are fixed prior to the vote, they have been previously elaborated by some other agent independent of the voters. The voters simply choose among the set presented to them.

In the terms presented above, the knowledge and preferences that voters input into the system only serves to discriminate between a predefined set of options. If the process by which the options were elaborated has not previously already aggregated this information and materialized it into the options themselves, that information could be lost.

In other words, voting is too coarse grained to aggregate the information in all its richness. The grain is determined by the options, if these options do not previously already represent the best quality decisions, the resulting decision will be suboptimal. The optimal decision implicit in the lost information is unavailable as a choice.

The rise of information technology is very relevant to this problem. It is clear, nominally speaking, that a problem of aggregating information is directly related to information technology. More specifically, the communication possibilities that the modern day internet afford are precisely those that are necessary to open up the possibility of aggregating information from hundreds of thousands of sources in a much more refined way than the coarse grained procedure of voting allows.

The key idea is this: information technology and the internet allow extending voters’ participation from just the voting phase, back in time to include the proposal drafting/elaboration (what before were merely fixed options) phase. In contrast to the coarse grain input that voting allows, contributions to a proposal as it is being written are fine grained; modifications to proposals can be arbitrarily small and precise. From adding (or removing) a comma, to a sentence, to a paragraph, to an entire proposal.

We call this opening up of proposal elaboration to the general voter population Open / Distributed / Federated proposal elaboration.

It is open because anyone can contribute. It is distributed/federated because there is no central authority that drafts proposals and imposes them on the voters. Proposals are drafted spontaneously in a distributed fashion. In conclusion, Open/Distributed/Federated proposal elaboration is a decision making medium that need not suffer from information loss that can affect democracy-as-voting. This strong focus on participation makes this approach strongly related to ideas such as collaborative governance[1] and participative democracy[2].

There are two sides to the coin, however. In opening up proposal elaboration to the general public, the gates to more information are opened, but they are also opened to too much information. The key then, is to balance the benefits of the extra possibilities and ideas that this system allows with the extra noise and less useful contributions that can potentially flood the medium. If the signal to noise ratio is too low, optimal decisions can be discarded, not because the required information was not present as we discussed,  but because those decisions were buried under noise that makes them impossible to find.

In summary, a three stage model of information aggregation as decision making is proposed. Although we use the term stage, these three opportunities need not occur chronologically, they may occur in any order and repeatedly.

  1. The first stage of decision making occurs when a set of individuals sets about collaboratively writing[3] or modifying an existing proposal. Ideally this occurs in realtime, these individuals discuss and debate the changes as they are being made. Or they can exchange views or ideas with more traditional means of interaction, offline. In any case, the proposal is the result of their collaboration and contribution, information in terms of preferences and knowledge is integrated to yield the proposal.
  2. The second stage is proposal cross-fertilization[6]. Whereas the focus on aggregation in the first stage was accross individuals collaboratively writing a proposal, in this stage the aggregation occurs across proposals. Proposals may be forked, merged, recombined and mutated. Sections from one proposal may be imported/transcluded into another one. Although the process is still carried out by individuals, proposal cross-fertilization does not  only gather contributions from a subset of users collaboratively making modifications, but rather contributions are drawn from the entire proposal pool.
  3. The third stage is proposal selection, which corresponds to our traditional idea of voting. Having a given pool of proposals that have evolved during the other two stages of aggregation, the voting population is asked to vote for one (or many).

Due to the problems of noised mentioned above, it may be necessary to add extra stages that act as filters. For example, a filter could discard proposals that are not supported by a threshold number of participants.


References

[1] http://en.wikipedia.org/wiki/Collaborative_governance

[2] http://en.wikipedia.org/wiki/Participatory_democracy

[3] By collaborative proposal editing we mean realtime online editing of document by several individuals. The most common examples of this today are found in Google Drive and Etherpad.

[4] http://en.wikipedia.org/wiki/Google_Drive

[5] http://en.wikipedia.org/wiki/Etherpad

[6] Tools that support this model are for example Wikis[7] and Version control systems[8]. The idea of cross fertilization has been suggested in [9]

[7] http://en.wikipedia.org/wiki/Wiki

[8] http://en.wikipedia.org/wiki/Version_control

[9] http://zelea.com/w/Stuff:Votorola/p/position_space_rationalization

Plurality voting as preferential voting with incomplete information

In this post I will compare two voting systems, plurality voting (aka simple majority) and preferential voting (Condorcet method), under some general assumptions including equal voter intent. First I will consider an abstract example where voter preference information is incomplete. Then I will present a concrete example with complete preference information.

Consider a simple example of preferential voting, defined by the following

1) Single winner

2) Three candidates, A, B and C

3) The number of first choice votes for A is a, for B is b, for C is c

4) a > b and a > c

We have no further information as to voter preferences beyond the first choice. We model this as equal preference as to the to other candidates. If we describe a ballot as

(first choice, second choice, third choice)

then we have that

a ballots contain (A, B/C)

b ballots contain (B, A/C)

c ballots contain (C, A/B)

where for example B/C represents a lack of preference between B and C.

We will now determine the Condorcet winner for this vote. According to wikipedia, the Condorcet winner is

the candidate whom voters prefer to each other candidate, when compared to them one at a time

Calculating the winner is a matter of iterating over all the pairwise combinations and determining each net pairwise preference. There are six possible combinations, which reduce to three due to symmetry. The net pairwise preference is therefore

A vs B: a – b + 0 + 0 = a – b

A vs C: a – c + 0 + 0 = a – c

B vs C: b – c + 0 + 0 = b – c

Where the zeroes are due to the fact that there are no preferences beyond the first choice. These net results represent the net preference of a candidate over another one, given all votes. A positive net value in A vs B means that A is preferred to B by a majority of voters. A negative net value means the opposite. These values can be visualized in a preference matrix

 A

 B

 C

 A

 a-b

 a-c

 B

 b-a

 –

 b-c

 C

 c-a

 c-b

 –

 

We have that a > b and a > c, and the relationship between b and c is unspecified. This yields

A

B

C

A

 WIN

 WIN

B

 LOSS

 ?

C

 LOSS

 ?

 

So the  Condorcet winner is A, irrespective of the contents marked with ‘?’, and the exact values of a, b and c.

But recall that a > b and a > c. This implies that under plurality voting (where the ballots only record one choice) the same voter intent would also yield A as the winner.

In conclusion, preferential voting reduces to plurality voting when there is insufficient information as to voter’s preferences[1]. Or in other words, plurality voting is an approximation to preferential voting, because preference information necessary to produce the correct[2] result is unavailable.

Let’s see an example where this information is available and how it produces different results for the two voting systems. Say we have voters with the following preferences

10 voters with preference (A, B, C)

8 voters with preference (B, C, A)

6 voters with preference (C, B, A)

Under plurality voting, this would result in A winning the election, as A would have more votes than B or C. Now let’s calculate the Condorcet winner:

A vs B: 10 – 8 – 6 = -4

A vs C: 10 – 8 – 6 = -4

B vs C: 10 + 8 – 6 = 12

The matrix is

 A

 B

 C

 A

 -4

 -4

 B

 4

 –

 12

 C

 4

 -12

 –

 

which yields

 A

 B

 C

 A

 LOSS

 LOSS

 B

 WIN

 –

 WIN

 C

 WIN

 LOSS

 –

So the Condorcet winner is B, while in plurality voting it was A. The interpretation is that the extra information available allowed a better choice to be made, improving upon the approximation of plurality voting.


[1] Under the given assumptions, of course.

[2] Note that when using better or more correct I am assuming that the Condorcet winner, that is the result of all the pairwise comparisons, is fundamentally better than a simple majority. This assumption seems intuitively correct, a candidate that is pairwise preferred to all other candidates should be the winner.