Scala, Functional Programming and Play 2.0

Here’s a short but interesting interview with Sadek Drobi, who’s behind the Play framework. Drobi discusses

  • What functional programming is, why it is becoming more popular, and how it relates to Object Orientation (OO)
  • How Scala approaches functional programming from an OO base
  • Scala’s complexity as a language and how to manage it through guidelines
  • Play as a Rails-type framework for the JVM, with special emphasis on asynchronous processing for scalability
  • How abstracting over JavaScript is a mistake for web frameworks, as complexity mounts in the long run

Liquid democracy and conditional delegation

In liquid democracy voters can delegate their votes, hence its hybrid direct/representative nature. But what if vote delegation as described is not flexible enough? What if voter A wants to delegate to different voters, say B or C, depending on the subject that is being voted on? Vote delegation that is dependent on the vote’s subject matter category is what I call conditional delegation. An example would be

Voter A delegates to voter B if the vote is about foreign policy

Voter A delegates to voter C if the vote is about economic policy

Here foreign policy and economic policy are vote categories. Thus, voters can express their preferences with greater flexibility and precision than with a single, unconditional delegation. It remains to be seen how useful this mechanism is in practice. Perhaps the added flexibility is overkill, complicating things more than its benefits justify. Nonetheless, it can always be offered as an optional feature that more demanding users can exploit.
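
To make this concrete, here is a minimal sketch (in Python, with hypothetical names) of how a conditional delegation could be represented and resolved once a vote’s category is known:

    # Hypothetical sketch: a conditional delegation maps vote categories
    # to delegates. Resolution looks up the category of the current vote.
    conditional_delegations = {
        "A": {"foreign policy": "B", "economic policy": "C"},
    }

    def resolve_delegate(voter, category):
        """Return the delegate for this vote's category, or None if no
        condition matches and the voter must vote directly."""
        return conditional_delegations.get(voter, {}).get(category)

    print(resolve_delegate("A", "foreign policy"))   # B
    print(resolve_delegate("A", "economic policy"))  # C
    print(resolve_delegate("A", "defense policy"))   # None

The tally algorithm would call something like resolve_delegate with the category supplied by whatever categorization mechanism is in use, which is the subject of the first question below.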

Throwing conditional delegation into the mix brings its own challenges. Here are a few

How is the vote category determined? What vote categorization mechanism can be used such that it does not present an opportunity to manipulate results?

The most obvious method is to include a global service that categorizes votes as they come up, and posts the category for each. This service can be operated by the administrators of the liquid democracy system, analyzing the subject of each vote and coming up with a suitable category. The tally algorithm takes that category as input when resolving conditionally delegated votes. Pretty straightforward.

The problem with this approach is what the second question above addresses. Such a global service puts too much power in the hands of administrators, as vote categorization has a direct bearing on tally results via conditional delegation resolution. Even if said administrators were honest, the potential for manipulation would exist, making the whole setup suspect.

A more robust solution is to allow anyone to create a categorization service, and to require voters to choose which categorization service they wish to subscribe to. For convenience, there could be a default categorization service operated by the administrators or by a trusted institution. Opening this service up to anyone who wishes to offer it decentralizes categorization: anybody can choose which service to use and change it on demand. Thus the potential for manipulation is almost zero, especially if no default service exists.

Taking the above idea even further, one could place the responsibility of categorization on the delegates themselves. In this way a delegate also plays the role of categorization service, and the choice of delegate serves the double role of vote delegation and category service selection. This offers the most distributed and manipulation-free option, but it places a larger burden on delegates, as they have to not only vote, but also choose a category for each vote (although transitivity could also apply to vote categorization).

How does conditional delegation affect loop detection?

As we discussed previously, vote-time loop detection is necessary to avoid lost votes. It works by detecting loops on demand when a voter chooses a delegate, and asking the user to change their selection. But is vote-time loop detection possible with conditional delegation? The answer is probably not.

The vote’s category is not known at the time a voter makes a conditional delegation, hence checking for loops is uncertain. In theory it would be possible to check for loops under all possible categorizations, but that would not make much sense, as the possibility of a loop is not the same as a certainty. It seems undesirable to ask users to change their delegation because of a possibility that might never materialize. So the short answer is that conditional delegation precludes vote-time loop detection, and therefore opens up the possibility of lost votes.
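
To see why, here is a minimal sketch (again with hypothetical names) continuing the mapping above: a pointer-chase can detect a loop only under an assumed category, so at vote time we can at best flag categorizations that would produce a loop, not rule loops out.

    # Hypothetical sketch: loop detection in a conditional delegation
    # graph. The chain can only be followed for one assumed category
    # at a time, since the actual category is unknown at vote time.

    def has_loop(voter, delegations, category):
        """Follow the delegation chain for an assumed category and report
        whether it ever returns to a voter already visited."""
        seen = set()
        current = voter
        while current is not None:
            if current in seen:
                return True
            seen.add(current)
            current = delegations.get(current, {}).get(category)
        return False

    delegations = {
        "A": {"foreign policy": "B"},
        "B": {"foreign policy": "A"},  # loops back to A, but only for this category
    }
    print(has_loop("A", delegations, "foreign policy"))   # True
    print(has_loop("A", delegations, "economic policy"))  # False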

Tally-time loop detection is unaffected, of course; the tally will complete correctly as usual.

Should vote categories be fixed and unified? Should they overlap or be mutually exclusive?

The simplest scenario uses a fixed set of categories that are mutually exclusive. Using a fixed set of categories makes it easier for category services and conditional delegations to match. I do not see any reason why a sufficiently rich set of categories could not be arrived at ahead of time in an uncontroversial way.

The question of overlap is a bit trickier. Votes usually fit more than one category, and choosing categories to avoid this would be unnatural. Overlapping categories, although more natural, pose technical problems. Conditional delegation resolution with overlapping categories could yield ambiguous results, if more than one target delegate satisfies its corresponding condition. Reusing the example above

Voter A delegates to voter B if the vote is about foreign policy

Voter A delegates to voter C if the vote is about economic policy

If a certain vote were deemed to be both about foreign policy and economic policy, delegation resolution would be ambiguous.

Recap

We can list the set of technical problems associated with conditional delegation as follows

– ambiguous delegation due to overlapping categories

– lost votes due to

  – difficult/impossible vote-time loop detection

  – vote subject matter not categorized by delegate/service

  – vote not cast by delegate (this is shared with unconditional delegation)

In the next post I will discuss a conditional delegation scheme that can partially address these problems. However, as I’ve stated above, it remains to be seen how useful this mechanism is in practice. Perhaps the added flexibility is overkill, complicating things more than its benefits justify.


A question I’ve left out is

What logic should be used to express conditions?

I’m pretty sure that anything beyond a specific category (i.e. category => delegate) is overkill. But if you really want to get fancy... you could use propositional logic/boolean algebra.
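
For the fancy version, a minimal hypothetical sketch: conditions become boolean predicates over the vote’s (possibly overlapping) set of categories, and the first matching rule picks the delegate.

    # Hypothetical sketch: delegation conditions as boolean predicates
    # over the set of categories assigned to a vote.
    rules = [
        (lambda cats: "foreign policy" in cats and "economic policy" not in cats, "B"),
        (lambda cats: "economic policy" in cats, "C"),
    ]

    def resolve(categories):
        """Return the delegate of the first rule whose condition holds."""
        for condition, delegate in rules:
            if condition(categories):
                return delegate
        return None  # no rule matched: the voter votes directly

    print(resolve({"foreign policy"}))                     # B
    print(resolve({"foreign policy", "economic policy"}))  # C

As a side effect, ordering the rules and taking the first match is also one way to break the ambiguity of overlapping categories discussed above.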

Epistemological dive

 
Note: I wrote this piece before the two posts presenting a simple model of learning using Bayesian inference. There is significant overlap, and conclusions are stated without complete explanations.
 
 

I attended an informal talk on climate change recently, after which I had several discussions regarding the scientific process and the foundations of knowledge (in science).

One question was: is scientific knowledge inductive or deductive? Well, the scientific method requires deductive inference to establish the logical consequences of a theory in order to make predictions. But the justification of theories, the method by which a theory is provisionally accepted or discarded, is inductive. In the language of Bayes, theory confirmation/invalidation occurs by updating theory posteriors inductively (P(H|E)), whereas the probability of evidence conditioned on theories (P(E|H)) is derived deductively.
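
To make this concrete, here is a minimal sketch of a single inductive update, with made-up numbers for two hypothetical theories H1 and H2; the likelihoods P(E|H) are what each theory deduces, and Bayes’ rule does the inductive part.

    # Minimal sketch of one Bayesian update (all numbers made up).
    # Likelihoods P(E|H) are deduced from each theory; the posteriors
    # P(H|E) are then obtained inductively via Bayes' rule.

    priors = {"H1": 0.5, "H2": 0.5}        # P(H)
    likelihoods = {"H1": 0.9, "H2": 0.3}   # P(E|H) for some observed evidence E

    p_evidence = sum(priors[h] * likelihoods[h] for h in priors)  # P(E)
    posteriors = {h: priors[h] * likelihoods[h] / p_evidence for h in priors}
    print(posteriors)  # {'H1': 0.75, 'H2': 0.25}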

So, although deduction plays a part in establishing the logical consequences of theories in the form of testable predictions, the nature of the knowledge, or rather, the process by which that knowledge is gained, is fundamentally inductive.

What does this say about the foundations of knowledge in science? If scientific knowledge were deductive, we could simply say that its foundations are axiomatic. We could also talk about incompleteness and other interesting things. But if as we have stated this knowledge is inductive, what are its foundations? Is induction a valid procedure, and what are its endpoints?

This is a very deep subject; trying to go all the way to the bottom is why I have titled this post an epistemological dive. I’m not going to give it a thorough treatment here, but I’ll briefly state my position and what I argued in discussion that day.

The way I see it, the foundations of scientific knowledge are the postulates of probability theory (as derived, for example, by Bernardo-Smith or Cox) together with Occam’s razor. In fact, given that most people are already aware of probability theory, I would say that the best single answer for the foundation of knowledge, in the sense of the part we are least aware of, is Occam’s razor. I will give a brief example of this, borrowed from a talk by Shane Legg on machine super intelligence.

Let’s consider a minimal example of a scientific process. An agent is placed in an environment and must form theories whose predictions correctly match the agent’s observations. Although minimal, this description accounts for the fundamental elements of science. There is one missing element, and that is a specification of how the agent forms theories, but for now we will use our own intuition, as if we were the agent.

For this minimal example we will say that the agent observes a sequence of numbers which its environment produces. Thus, the agent’s observations are the sequence, and it must form a theory which correctly describes past observations and predicts future ones. Let’s imagine this is what happens as time goes forward, beginning with

1

For the moment there is only one data point, so it seems impossible to form a theory in a principled way.

1,3

Among others, two theories could be proposed here, odd numbers and powers of 3, with corresponding predictions of 5 and 9:

f(n) = 2n – 1

f(n) = 3^(n-1)

the observations continue:

1,3,5

The powers of three theory is ruled out due to its incorrect prediction of 9, while the odd number theory’s prediction of 5 was correct.

1,3,5,7

The odd number theory has described all observations and made correct predictions. At this point our agent would be pretty confident that the next observation will be 9.

1,3,5,7,57

What?! That really threw the agent off; it was very confident that the next item would be 9. But it turned out to be 57. As the builder of this small universe I’ll let you know the correct theory, call it theory_57:

f(n) = 2n – 1 + 2(n-1)(n-2)(n-3)(n-4)

which, if you check, correctly describes all the numbers in the sequence of observations. If the 5th observation had instead been 9, our odd number theory would have been correct again, and we would have stayed with it. So depending on this 5th observation:

9 => f(n) = 2n-1

57 => f(n) = 2n – 1 + 2(n-1)(n-2)(n-3)(n-4)

Although we only list two items, the list is actually infinite because there are an infinite number of theories that correctly predict the observations up until the 4th result. In fact, and here is the key, there are an infinite number of theories that correctly predict any number of observations! But let us restrict the discussion to only the two seen above.
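
As a quick check, here is a sketch verifying that the two theories, taken directly from the formulas above, agree on the first four observations and diverge only at n = 5:

    # The two candidate theories from the text.
    def odd(n):
        return 2 * n - 1

    def theory_57(n):
        return 2 * n - 1 + 2 * (n - 1) * (n - 2) * (n - 3) * (n - 4)

    print([odd(n) for n in range(1, 6)])        # [1, 3, 5, 7, 9]
    print([theory_57(n) for n in range(1, 6)])  # [1, 3, 5, 7, 57]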

What our intuition tells us is that no reasonable agent would believe in theory_57 after the fourth observation, even though it is just as compatible with the observations as the odd number theory. Our intuition strongly asserts that the odd number theory is the correct theory for that data. But how can we justify that on the basis of induction, if they make the same predictions (i.e. they have the same P(E|H))?

The key is that our intuition, in fact our intelligence in general, has a built-in simplicity bias. We strongly favor the odd number theory because it is the simplest theory that fits the facts. Hence induction, including our everyday intuitions and the scientific method, is founded upon Occam’s razor as a way to discriminate between equally supported theories.

Without this or some more specific bias (in machine learning we would call this inductive bias), induction would be impossible, as there would be too many theories to entertain. Occam’s razor is the most generally applicable bias; it is a prerequisite for any kind of induction, in science or anywhere else.
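
As a very crude sketch of what such a bias could look like computationally (crude because I am using the length of a formula’s source text as a stand-in for descriptive complexity), one can weight theories that fit the data equally well by a simplicity prior:

    # Crude sketch: discriminating between theories with equal P(E|H)
    # via a simplicity prior. Formula length is a very rough stand-in
    # for descriptive complexity.

    theories = {
        "odd numbers": "2*n - 1",
        "theory_57":   "2*n - 1 + 2*(n-1)*(n-2)*(n-3)*(n-4)",
    }

    observations = [1, 3, 5, 7]

    def fits(expr):
        """Does the formula reproduce every observation so far?"""
        return all(eval(expr, {"n": i}) == obs
                   for i, obs in enumerate(observations, start=1))

    def prior(expr):
        """Shorter description => higher prior."""
        return 2.0 ** -len(expr)

    # Both theories fit, so only the prior separates them:
    for name, expr in theories.items():
        print(name, fits(expr), prior(expr))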

Plurality voting as preferential voting with incomplete information

In this post I will compare two voting systems, plurality voting (aka simple majority) and preferential voting (Condorcet method), under some general assumptions including equal voter intent. First I will consider an abstract example where voter preference information is incomplete. Then I will present a concrete example with complete preference information.

Consider a simple example of preferential voting, defined by the following

1) Single winner

2) Three candidates, A, B and C

3) The number of first choice votes for A is a, for B is b, for C is c

4) a > b and a > c

We have no further information as to voter preferences beyond the first choice. We model this as equal preference between the two other candidates. If we describe a ballot as

(first choice, second choice, third choice)

then we have that

a ballots contain (A, B/C)

b ballots contain (B, A/C)

c ballots contain (C, A/B)

where for example B/C represents a lack of preference between B and C.

We will now determine the Condorcet winner for this vote. According to Wikipedia, the Condorcet winner is

the candidate whom voters prefer to each other candidate, when compared to them one at a time

Calculating the winner is a matter of iterating over all the pairwise combinations and determining each net pairwise preference. There are six possible combinations, which reduce to three due to symmetry. The net pairwise preferences are therefore

A vs B: a – b + 0 + 0 = a – b

A vs C: a – c + 0 + 0 = a – c

B vs C: b – c + 0 + 0 = b – c

The zeroes are due to the fact that there are no preferences beyond the first choice. These net results represent the net preference of one candidate over another, given all votes. A positive net value in A vs B means that A is preferred to B by a majority of voters. A negative net value means the opposite. These values can be visualized in a preference matrix

      A     B     C
A     –     a-b   a-c
B     b-a   –     b-c
C     c-a   c-b   –

We have that a > b and a > c, and the relationship between b and c is unspecified. This yields

      A      B      C
A     –      WIN    WIN
B     LOSS   –      ?
C     LOSS   ?      –

So the Condorcet winner is A, irrespective of the entries marked with ‘?’ and the exact values of a, b and c.

But recall that a > b and a > c. This implies that under plurality voting (where the ballots only record one choice) the same voter intent would also yield A as the winner.

In conclusion, preferential voting reduces to plurality voting when there is insufficient information as to voters’ preferences[1]. Or in other words, plurality voting is an approximation to preferential voting, because the preference information necessary to produce the correct[2] result is unavailable.

Let’s see an example where this information is available and how it produces different results for the two voting systems. Say we have voters with the following preferences

10 voters with preference (A, B, C)

8 voters with preference (B, C, A)

6 voters with preference (C, B, A)

Under plurality voting, this would result in A winning the election, as A would have more votes than B or C. Now let’s calculate the Condorcet winner:

A vs B: 10 – 8 – 6 = -4

A vs C: 10 – 8 – 6 = -4

B vs C: 10 + 8 – 6 = 12

The matrix is

      A     B     C
A     –     -4    -4
B     4     –     12
C     4     -12   –

which yields

      A      B      C
A     –      LOSS   LOSS
B     WIN    –      WIN
C     WIN    LOSS   –

So the Condorcet winner is B, while in plurality voting it was A. The interpretation is that the extra information available allowed a better choice to be made, improving upon the approximation of plurality voting.
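
To make the computation reproducible, here is a minimal sketch (function names are mine) that derives both winners from the ballots above:

    # Sketch: plurality and Condorcet winners from ranked ballots.
    # Each ballot is (count, ranking), with rankings ordered best-first.

    ballots = [(10, ("A", "B", "C")),
               (8,  ("B", "C", "A")),
               (6,  ("C", "B", "A"))]

    candidates = {"A", "B", "C"}

    def plurality_winner():
        """Winner counting only first choices."""
        counts = {c: 0 for c in candidates}
        for count, ranking in ballots:
            counts[ranking[0]] += count
        return max(counts, key=counts.get)

    def net(x, y):
        """Net number of voters preferring x to y."""
        return sum(count if ranking.index(x) < ranking.index(y) else -count
                   for count, ranking in ballots)

    def condorcet_winner():
        """The candidate preferred pairwise to every other, if one exists."""
        for c in candidates:
            if all(net(c, other) > 0 for other in candidates - {c}):
                return c
        return None

    print(plurality_winner())            # A
    print(net("A", "B"), net("B", "C"))  # -4 12
    print(condorcet_winner())            # B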


[1] Under the given assumptions, of course.

[2] Note that when using better or more correct I am assuming that the Condorcet winner, that is, the result of all the pairwise comparisons, is fundamentally better than a simple majority. This assumption seems intuitively correct: a candidate that is pairwise preferred to all other candidates should be the winner.

“Human level AI” is a misleading concept

I’ve used and will probably continue to use the term human level AI to refer to a level of machine intelligence that is competent enough to carry out tasks that humans carry out; in particular, those tasks that we consider evidence of intelligence when done by a human. But I’ve just realised that it’s a misleading term. Here’s why.

Let’s try to make the notion of level more precise. We have the universal intelligence measure, as well as AIQ, as precise mathematical constructs for this idea. If you recall, these two constructs are based on the idea of summing over the environments in which an agent succeeds as an indicator of intelligence. This sum is weighted by simplicity. Agents that are competent in a greater number of environments are said to be more intelligent, and the simplicity bias favors those agents that do so by virtue of a general capacity, rather than a sum of narrow specialisations.
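
For reference, Legg and Hutter’s universal intelligence measure is roughly of the form

Υ(π) = Σ_μ 2^(-K(μ)) · V(π, μ)

where the sum runs over computable environments μ, K(μ) is the Kolmogorov complexity of μ (the simplicity weighting), and V(π, μ) is the expected value that agent π achieves in environment μ. Exact notation aside, the structure, a simplicity-weighted sum over environments, is what matters here.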

Anyway, the point is that beyond the simplicity bias, such a notion of level does not require success in any given environment; rather it is the sum that counts, not which environments take part in that sum. Hence, two agents of equal intelligence could be competent in very different sets of environments.

Therefore, a human level AI could correspond to any agent whose sum over environments is equal to that of a human. But these environments could be widely different. This means that a human level AI could easily be incompetent at tasks that humans carry out, even though competence at those tasks is precisely the original, intuitive meaning of human level AI. Human level AI is therefore misleading, because intuitively it means one thing, but precisely means something different.

What human level AI really means is artificial human intelligence: not just the level of intelligence, but also a specification of which environments the intelligence must succeed in. It’s the usual anthropocentric bias that incorrectly discards the generality of the level specification.

Hence we should really say artificial human intelligence, although I’d understand accusations of nitpicking!