Vote-time loop detection and transitive closure

tc

Previously we have spoken about vote-time loop detection as a mechanism to prevent votes from being lost, as the tally in Agora will discard votes that enter a cyle

vote-time loop detection is a mechanism that detects cycles just as they are formed, and prevents them from being established. A feedback message instructs the voter that his/her choice is invalid as it would create a loop, the user must make another selection. How does this detection algorithm work? Well essentially in the same way as it does in the tally algorithm, except that it only checks  loops starting from delegation choices just as they are made by a user, on demand.

But this is not the entire picture, for the following reasons

  • Delegation loops do not invariably result in lost votes. In the event that any of the members in the cycle votes directly, that member will channel all the delegation weight  present in the loop. Still, loops are in principle undesirable.
  • Instead of allowing the user to select an invalid (see the previous point) option and then requiring them to change it, the system could identify problem delegates before the user selects one, pre-emptively.
  • Following the point above, the vote-time loop detection algorithm would have to operate differently, as the starting point would not be a specific choice of delegate, but rather all possible choices. Using the tally-time algorithm would be computationally expensive.

This leads us to two different possibilities, pre-emptive vote-time loop detection, and reactive vote-time loop detection. Reactive vote-time loop detection corresponds to our earlier notion of vote-time loop detection. When a user selects a delegate, the system runs an algorithm that, based on that selected delegate, detects where a loop exists. That is, loop detection begins from the path User => Delegate-choice. This algorithm can proceed by walking the graph, and keeping a record of already visited nodes.

In pre-emptive vote-time loop detection, the system must calculate, for every user of the application, what choices of delegate would result in a delegation loop.  Thus the path begins at each possible user. Because such a calculation is expensive, it should be done “offline”, prior to the display of the delegate choices. The result of such a calculation is then stored so that it can be looked up to determine problem delegates whenever presenting the delegation vote interface.

So how does this data look like? It must contain all the delegation graph information necessary to carry out loop detections cheaply, for all possible users. In effect, this is a matter of graph reachability, and the structure we are looking for is the transitive closure of the relation defined by all existing delegate votes (that is, the delegation graph). The transitive closure of a relation can take the form of a binary adjacency matrix, where a ‘1’ represents that two vertices are reachable, and ‘0’ otherwise.

tc2

The transitive closure (TC) of a directed graph can be calculated using Warshall’s (pdf) algorithm[3] that runs in O(n^3) time. Let’s consider how this could be implemented. First consider that the transitive closure can change whenever the delegation graph changes, that is, whenever a delegate vote is cast, changed or removed. So a naive approach would be to calculate the TC each time this happens, and store it for use during lookup.  Since the TC is an adjacency matrix, it can be easily represented as a database table with two columns.

Lookup is also very easy, as transitive reachability is encoded directly in the adjacency matrix. To determine if a delegate D will cause a loop if selected by user U, we simply have to ask if U is reachable from D, because if it were, the extra path would cause a loop. If stored in a rdbms table, the sql to determine if D is problematic is simply

which would return 1 row if U is rechable and 0 rows otherwise.

But perhaps a naive approach would be inefficient. For every delegated vote cast this would entail

  • Loading all delegated votes (the entire delegation graph)
  • Running Warshall’s algorithm

A better solution would be to calculate the transitive closure incrementally, such that only the varying parts of the TC are recalculated. Even better, if the delegation graph is stored in a database, it would be more efficient if the calculation could be done locally on the database without having to load votes and then run the algorithm.

These two desirable properties bring us to the paper Maintaining Transitive Closure of Graphs in SQL, where such a method is described. As an added benefit, it is formulated in terms of simple sql, requiring no specialized extensions[2] that some databases offer that support graph related operations. We can implement this method as two stored procedures[1] that serve to incrementally maintain the TC table.

As an example, we will construct the TC for the following graph

tc3

which is reflected in the following sql session

tc4

This session shows the edge insertions and the resulting transitive closure table.  Now imagine that delegate 4 has decided to delegate his vote. What delegates would cause a loop? The answer is a straightforward lookup on the TC table:

which returns the rows with values 1, 2, 3, 5, since any of these will result in a loop.

What delegates would cause a loop for delegate 2? We have

 which returns rows with values 1, 5.

As can be seen from these examples, detecting possible loops using the TC table is very easy, provided the TC is maintained properly as delegate votes are casted, changed or removed.

 


[1] Following are two stored procedures based on this method implemented for mysql.

Adding an edge

 

Removing an edge

[2] See also https://beagle.whoi.edu/redmine/projects/ibt/wiki/Transitive_closure_in_PostgreSQL

[3] See also http://books.google.es/books?id=iOy6LHeKhWgC&pg=PA509&lpg=PA509&dq=incremental+warshall&source=bl&ots=O3EUb0_tXr&sig=sq_2pfiFSmL4z_flk9mrxRc7g5E&hl=en&sa=X&ei=Z7-mUff1Aaap7Abf-oEI&ved=0CDIQ6AEwAQ#v=snippet&q=incremental%20graph%20closure&f=false

Open proposal elaboration

(image from cartoonmovement.com)

Democracy is not just about voting, but about the wider problem of making decisions in a community. Good systems for making decisions are more likely to yield decisions that reflect the community’s preferences and exploit its collective knowledge. A decision making system, therefore, takes as input preferences and knowledge and produces as output a decision. Such a decision making system is effectively an information aggregator; high quality decisions are the result of aggregating as much information as possible: no preferences are ignored, and no valuable knowledge is left unused.

This brief description brings us to the distinction between voting and the general case of  decision making. In traditional voting, voters are asked to cast a vote which represents a choice between a predefined set of already existing alternatives. The alternatives presented to the voter are fixed prior to the vote, they have been previously elaborated by some other agent independent of the voters. The voters simply choose among the set presented to them.

In the terms presented above, the knowledge and preferences that voters input into the system only serves to discriminate between a predefined set of options. If the process by which the options were elaborated has not previously already aggregated this information and materialized it into the options themselves, that information could be lost.

In other words, voting is too coarse grained to aggregate the information in all its richness. The grain is determined by the options, if these options do not previously already represent the best quality decisions, the resulting decision will be suboptimal. The optimal decision implicit in the lost information is unavailable as a choice.

The rise of information technology is very relevant to this problem. It is clear, nominally speaking, that a problem of aggregating information is directly related to information technology. More specifically, the communication possibilities that the modern day internet afford are precisely those that are necessary to open up the possibility of aggregating information from hundreds of thousands of sources in a much more refined way than the coarse grained procedure of voting allows.

The key idea is this: information technology and the internet allow extending voters’ participation from just the voting phase, back in time to include the proposal drafting/elaboration (what before were merely fixed options) phase. In contrast to the coarse grain input that voting allows, contributions to a proposal as it is being written are fine grained; modifications to proposals can be arbitrarily small and precise. From adding (or removing) a comma, to a sentence, to a paragraph, to an entire proposal.

We call this opening up of proposal elaboration to the general voter population Open / Distributed / Federated proposal elaboration.

It is open because anyone can contribute. It is distributed/federated because there is no central authority that drafts proposals and imposes them on the voters. Proposals are drafted spontaneously in a distributed fashion. In conclusion, Open/Distributed/Federated proposal elaboration is a decision making medium that need not suffer from information loss that can affect democracy-as-voting. This strong focus on participation makes this approach strongly related to ideas such as collaborative governance[1] and participative democracy[2].

There are two sides to the coin, however. In opening up proposal elaboration to the general public, the gates to more information are opened, but they are also opened to too much information. The key then, is to balance the benefits of the extra possibilities and ideas that this system allows with the extra noise and less useful contributions that can potentially flood the medium. If the signal to noise ratio is too low, optimal decisions can be discarded, not because the required information was not present as we discussed,  but because those decisions were buried under noise that makes them impossible to find.

In summary, a three stage model of information aggregation as decision making is proposed. Although we use the term stage, these three opportunities need not occur chronologically, they may occur in any order and repeatedly.

  1. The first stage of decision making occurs when a set of individuals sets about collaboratively writing[3] or modifying an existing proposal. Ideally this occurs in realtime, these individuals discuss and debate the changes as they are being made. Or they can exchange views or ideas with more traditional means of interaction, offline. In any case, the proposal is the result of their collaboration and contribution, information in terms of preferences and knowledge is integrated to yield the proposal.
  2. The second stage is proposal cross-fertilization[6]. Whereas the focus on aggregation in the first stage was accross individuals collaboratively writing a proposal, in this stage the aggregation occurs across proposals. Proposals may be forked, merged, recombined and mutated. Sections from one proposal may be imported/transcluded into another one. Although the process is still carried out by individuals, proposal cross-fertilization does not  only gather contributions from a subset of users collaboratively making modifications, but rather contributions are drawn from the entire proposal pool.
  3. The third stage is proposal selection, which corresponds to our traditional idea of voting. Having a given pool of proposals that have evolved during the other two stages of aggregation, the voting population is asked to vote for one (or many).

Due to the problems of noised mentioned above, it may be necessary to add extra stages that act as filters. For example, a filter could discard proposals that are not supported by a threshold number of participants.


References

[1] http://en.wikipedia.org/wiki/Collaborative_governance

[2] http://en.wikipedia.org/wiki/Participatory_democracy

[3] By collaborative proposal editing we mean realtime online editing of document by several individuals. The most common examples of this today are found in Google Drive and Etherpad.

[4] http://en.wikipedia.org/wiki/Google_Drive

[5] http://en.wikipedia.org/wiki/Etherpad

[6] Tools that support this model are for example Wikis[7] and Version control systems[8]. The idea of cross fertilization has been suggested in [9]

[7] http://en.wikipedia.org/wiki/Wiki

[8] http://en.wikipedia.org/wiki/Version_control

[9] http://zelea.com/w/Stuff:Votorola/p/position_space_rationalization