Vote-time loop detection and transitive closure

tc

Previously we have spoken about vote-time loop detection as a mechanism to prevent votes from being lost, as the tally in Agora will discard votes that enter a cyle

vote-time loop detection is a mechanism that detects cycles just as they are formed, and prevents them from being established. A feedback message instructs the voter that his/her choice is invalid as it would create a loop, the user must make another selection. How does this detection algorithm work? Well essentially in the same way as it does in the tally algorithm, except that it only checks  loops starting from delegation choices just as they are made by a user, on demand.

But this is not the entire picture, for the following reasons

  • Delegation loops do not invariably result in lost votes. In the event that any of the members in the cycle votes directly, that member will channel all the delegation weight  present in the loop. Still, loops are in principle undesirable.
  • Instead of allowing the user to select an invalid (see the previous point) option and then requiring them to change it, the system could identify problem delegates before the user selects one, pre-emptively.
  • Following the point above, the vote-time loop detection algorithm would have to operate differently, as the starting point would not be a specific choice of delegate, but rather all possible choices. Using the tally-time algorithm would be computationally expensive.

This leads us to two different possibilities, pre-emptive vote-time loop detection, and reactive vote-time loop detection. Reactive vote-time loop detection corresponds to our earlier notion of vote-time loop detection. When a user selects a delegate, the system runs an algorithm that, based on that selected delegate, detects where a loop exists. That is, loop detection begins from the path User => Delegate-choice. This algorithm can proceed by walking the graph, and keeping a record of already visited nodes.

In pre-emptive vote-time loop detection, the system must calculate, for every user of the application, what choices of delegate would result in a delegation loop.  Thus the path begins at each possible user. Because such a calculation is expensive, it should be done “offline”, prior to the display of the delegate choices. The result of such a calculation is then stored so that it can be looked up to determine problem delegates whenever presenting the delegation vote interface.

So how does this data look like? It must contain all the delegation graph information necessary to carry out loop detections cheaply, for all possible users. In effect, this is a matter of graph reachability, and the structure we are looking for is the transitive closure of the relation defined by all existing delegate votes (that is, the delegation graph). The transitive closure of a relation can take the form of a binary adjacency matrix, where a ’1′ represents that two vertices are reachable, and ’0′ otherwise.

tc2

The transitive closure (TC) of a directed graph can be calculated using Warshall’s (pdf) algorithm[3] that runs in O(n^3) time. Let’s consider how this could be implemented. First consider that the transitive closure can change whenever the delegation graph changes, that is, whenever a delegate vote is cast, changed or removed. So a naive approach would be to calculate the TC each time this happens, and store it for use during lookup.  Since the TC is an adjacency matrix, it can be easily represented as a database table with two columns.

Lookup is also very easy, as transitive reachability is encoded directly in the adjacency matrix. To determine if a delegate D will cause a loop if selected by user U, we simply have to ask if U is reachable from D, because if it were, the extra path would cause a loop. If stored in a rdbms table, the sql to determine if D is problematic is simply

SELECT * FROM TC WHERE SOURCE = D AND TARGET = U

which would return 1 row if U is rechable and 0 rows otherwise.

But perhaps a naive approach would be inefficient. For every delegated vote cast this would entail

  • Loading all delegated votes (the entire delegation graph)
  • Running Warshall’s algorithm

A better solution would be to calculate the transitive closure incrementally, such that only the varying parts of the TC are recalculated. Even better, if the delegation graph is stored in a database, it would be more efficient if the calculation could be done locally on the database without having to load votes and then run the algorithm.

These two desirable properties bring us to the paper Maintaining Transitive Closure of Graphs in SQL, where such a method is described. As an added benefit, it is formulated in terms of simple sql, requiring no specialized extensions[2] that some databases offer that support graph related operations. We can implement this method as two stored procedures[1] that serve to incrementally maintain the TC table.

As an example, we will construct the TC for the following graph

tc3

which is reflected in the following sql session

tc4

This session shows the edge insertions and the resulting transitive closure table.  Now imagine that delegate 4 has decided to delegate his vote. What delegates would cause a loop? The answer is a straightforward lookup on the TC table:

SELECT START FROM TC WHERE END = 4

which returns the rows with values 1, 2, 3, 5, since any of these will result in a loop.

What delegates would cause a loop for delegate 2? We have

SELECT START FROM TC WHERE END = 2

 which returns rows with values 1, 5.

As can be seen from these examples, detecting possible loops using the TC table is very easy, provided the TC is maintained properly as delegate votes are casted, changed or removed.

 


[1] Following are two stored procedures based on this method implemented for mysql.

Adding an edge

 

-- --------------------------------------------------------------------------------
-- Routine DDL
-- Note: comments before and after the routine body will not be stored by the server
-- --------------------------------------------------------------------------------
DELIMITER $$

CREATE PROCEDURE `test`.`ADD_EDGE` (IN source INT, IN dest INT)
BEGIN

drop table IF EXISTS TCNEW;

CREATE TABLE TCNEW AS
SELECT *
FROM (SELECT TC.start as start, dest as end
FROM TC
WHERE TC.end = source
UNION
SELECT source as start, TC.end as end
FROM TC
WHERE dest = TC.start
UNION
SELECT TC1.start, TC2.end
FROM TC AS TC1, TC AS TC2
WHERE TC1.end = dest AND TC2.start = source
) AS T;

INSERT INTO TCNEW (Start, End) VALUES (source, dest);

DROP TABLE IF EXISTS DELTA;

CREATE TABLE DELTA AS
SELECT * FROM TCNEW
WHERE NOT EXISTS (SELECT * FROM TC, TCNEW WHERE TC.start=TCNEW.start AND TC.end=TCNEW.end);

INSERT INTO TC
SELECT *
FROM DELTA;

SELECT * FROM TC;

END

Removing an edge

-- --------------------------------------------------------------------------------
-- Routine DDL
-- Note: comments before and after the routine body will not be stored by the server
-- --------------------------------------------------------------------------------
DELIMITER $$

CREATE PROCEDURE `test`.`REMOVE_EDGE` (IN source INT, IN dest INT)
BEGIN
DROP TABLE IF EXISTS SUSPECT;

CREATE TEMPORARY TABLE SUSPECT AS
SELECT * FROM (
SELECT X.Start as Start, Y.End as End FROM TC AS X, TC AS Y WHERE X.End = source AND Y.Start = dest
UNION
SELECT X.Start as Start, dest as End From TC AS X WHERE X.End = source
UNION
SELECT source as Start, X.End as End FROM TC AS X WHERE X.Start = dest
UNION
SELECT source as Start, dest as End FROM TC AS X WHERE X.Start = source AND X.End = dest
) AS T2;

DROP TABLE IF EXISTS TRUSTY;

CREATE TABLE TRUSTY AS
SELECT * FROM
(SELECT * FROM TC WHERE NOT EXISTS
(SELECT * FROM SUSPECT WHERE SUSPECT.Start = TC.Start AND SUSPECT.End = TC.End)
UNION
SELECT * FROM G WHERE G.Start <> source AND G.End <> dest) as T3;

DROP TABLE IF EXISTS TCNEWD;

CREATE TABLE TCNEWD AS
SELECT * FROM
(SELECT * FROM TRUSTY
UNION
SELECT T1.Start, T2.End FROM TRUSTY T1, TRUSTY T2 WHERE T1.End = T2.Start
UNION
SELECT T1.Start, T3.End FROM TRUSTY T1, TRUSTY T2, TRUSTY T3 WHERE T1.End = T2.Start AND T2.End = T3.Start)
AS T4;

DELETE FROM TC WHERE NOT EXISTS (SELECT * FROM TCNEWD T WHERE T.Start = TC.Start AND T.End = TC.End);

SELECT * FROM TC;
END

[2] See also https://beagle.whoi.edu/redmine/projects/ibt/wiki/Transitive_closure_in_PostgreSQL

[3] See also http://books.google.es/books?id=iOy6LHeKhWgC&pg=PA509&lpg=PA509&dq=incremental+warshall&source=bl&ots=O3EUb0_tXr&sig=sq_2pfiFSmL4z_flk9mrxRc7g5E&hl=en&sa=X&ei=Z7-mUff1Aaap7Abf-oEI&ved=0CDIQ6AEwAQ#v=snippet&q=incremental%20graph%20closure&f=false

2 thoughts on “Vote-time loop detection and transitive closure

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>