When Shane Legg posted about AIQ recently, I asked him to comment on Hibbard’s paper, which raises an objection about the apparently counter-intuitive consequences of the universal intelligence measure. Hibbard notes that:
Given arbitrarily small ε > 0, total credit for all but a finite number of environments is less than ε. That is, total credit for all environments greater than some level C of complexity is less than ε, whereas credit for a single simple environment will be much greater than ε. This is not the way we judge human intelligence.
In other words, an agent that succeeds in a number of very complex environments is considered less intelligent than another agent that succeeds in one very simple one, and that seems wrong.
But the important thing about the universal intelligence measure is that it is a matter of expectation, of capacity. It is not obtained by simply counting the environments the agent succeeds at, but by weighting each environment by the probability that it occurs (according to a Solomonoff prior). So the measure is really an indicator of the expected degree of success of an agent if all we know about the environments it may find itself in is that they are distributed according to the simplicity prior.
This ingredient of expectation, together with the assumed prior, is what necessitates a simplicity-biased generality to obtain a high score.
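To make the weighting concrete, here is a minimal sketch of Hibbard's two agents. It assumes each environment contributes 2^-K to the score when the agent succeeds (reward 1) and nothing when it fails (reward 0). The complexities K below are made-up illustrative numbers, since Kolmogorov complexity is not computable, and `weighted_score` is a hypothetical helper, not part of AIQ.

```python
from fractions import Fraction

def weighted_score(complexities):
    """Sum the simplicity weights 2^-K of the environments an agent succeeds at.

    Assumption: success earns reward 1 and failure earns reward 0, so only
    the environments the agent succeeds at contribute to the total.
    """
    return sum(Fraction(1, 2 ** k) for k in complexities)

# Agent A succeeds at a single simple environment (illustrative complexity: 2 bits).
score_a = weighted_score([2])

# Agent B succeeds at 1000 distinct complex environments (illustrative complexity: 20 bits each).
score_b = weighted_score([20] * 1000)

print(score_a)            # 1/4
print(float(score_b))     # roughly 0.00095
print(score_a > score_b)  # True
```

Under these assumptions the single simple environment outweighs the thousand complex ones by a wide margin, which is exactly the effect Hibbard objects to and the effect the expectation reading accounts for.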
Legg suggests that the source of the dissonance Hibbard points out is a hidden assumption in our everyday intuition about intelligence. In practice we measure the intelligence of generally intelligent agents (people), that is, agents that succeed in environments by virtue of general ability rather than narrow, domain-specific ability. We therefore take success at a complex task as strong evidence of the ability to succeed at less demanding ones. This hidden assumption is inapplicable to the two agents described, and hence our intuition goes wrong.
Note that if we analyze the example above in terms of what cannot be done (i.e., the simplest environment the agent fails at), the dissonance seems to go away: it is intuitively acceptable that an agent that fails at a trivial task is stupid. This just inverts the “less-than-or-equal-to assumption” above. In this case, the assumption that an agent cannot succeed at tasks more difficult than one it has failed at is again wrong, but it yields results that correspond better with the universal measure.
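The "simplest failure" view can be sketched in a few lines. The environments and complexity levels here are hypothetical, one environment per level, chosen only to mirror the two agents in the example; `simplest_failure` is an illustrative helper of my own.

```python
def simplest_failure(results):
    """Return the lowest complexity at which the agent fails, or None if it never fails.

    `results` maps a (hypothetical) environment complexity to True for
    success and False for failure.
    """
    failed = [k for k, ok in results.items() if not ok]
    return min(failed) if failed else None

# Agent A succeeds only at the simplest environment.
agent_a = {2: True, 3: False, 10: False, 20: False}
# Agent B succeeds only at the complex environments.
agent_b = {2: False, 3: False, 10: True, 20: True}

print(simplest_failure(agent_a))  # 3
print(simplest_failure(agent_b))  # 2
```

Ranking agents by how late their first failure occurs puts agent A ahead of agent B, the same ordering the universal measure gives for this toy case.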
Following a different route, Hibbard proposes (if I understood his paper correctly) a measure that screens out non-general agents by requiring them to succeed not at a single environment but at a whole class of environments, parametrized by complexity, E_f(m):
The intelligence of an agent p is measured as the greatest m such that p learns to predict all e ∈ E_f(m)
Presumably this prevents one-trick-pony agents from reaching intelligence scores that would seem unintuitive; such scores are not possible because success is required in all the environments of the class. My reading is that the argument can be extended to conclude that an agent that succeeds at a class of a given complexity is guaranteed to succeed at all complexities below it, although I have not seen this stated explicitly in the paper.
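A toy sketch of this class-based scoring follows. It is my own illustration, not Hibbard's construction: the classes are small hand-written sets, "learns to predict" is reduced to a yes/no predicate, and the classes are assumed to be nested (each class contains the ones below it), which is the assumption under which the guarantee discussed above holds. All names are hypothetical.

```python
def hierarchy_score(agent_succeeds, classes):
    """Greatest m such that the agent succeeds at every environment in classes[m].

    Returns 0 when the agent fails at least one environment in every class.
    `classes` maps a level m to a set of environment names.
    """
    passed = [m for m, envs in classes.items()
              if all(agent_succeeds(e) for e in envs)]
    return max(passed, default=0)

# Hypothetical nested classes: level m contains everything from the levels below it.
classes = {
    1: {"e1"},
    2: {"e1", "e2"},
    3: {"e1", "e2", "e3"},
}

def one_trick_pony(env):
    # Succeeds only at the single most complex environment.
    return env == "e3"

def generalist(env):
    # Succeeds at the two simpler environments but not the hardest one.
    return env in {"e1", "e2"}

print(hierarchy_score(one_trick_pony, classes))  # 0
print(hierarchy_score(generalist, classes))      # 2
```

Because every class in this sketch includes the simpler environments, the one-trick pony never clears a single level, while an agent that clears level 2 has necessarily cleared level 1 as well.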
Note: in Hibbard's measure, complexity is given by the number of steps of computation that define the evolution of the environment.