Brain machine interfaces

The thought control phase feedback loop

I recently attended a talk on brain-machine interfaces (BMI) given by José M. Carmena from Berkeley. Carmena described his current research into thought control of computers via brain-implanted electrode arrays. The subjects were macaque monkeys, and the task was controlling a cursor, directing it towards a target and “grasping” it. Successful execution of the task triggered a reward for the monkey: juice.

Initially, the monkey was subjected to the task through motor manipulation of a control stick. This phase allowed the monkey to learn the task, as well as to identify motor neurons that could serve as stable output for the thought control phase. Additionally, a simple linear regression model was trained on those neurons in order to decode spike output into action commands for the cursor. In the later stage, the control stick was removed, and the monkeys controlled the cursor directly through thought. After several days of learning the monkeys achieved a high success rate, even at the start of each trial. This suggests that single-day learning was being consolidated into longer term motor memory, as occurs for normal everyday motor tasks. The typical example for humans is riding a bicycle, a skill that once learned persists.
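As a rough illustration of the decoding step, a linear decoder of this kind can be fit by ordinary least squares. Everything below (neuron count, firing-rate statistics, the velocity target) is a made-up stand-in for the actual experimental setup, not Carmena's method:

```python
import numpy as np

# Hypothetical sketch: decode 2-D cursor velocity from neural firing rates
# with a linear model fit by least squares.
rng = np.random.default_rng(0)

n_samples, n_neurons = 200, 10
true_W = rng.normal(size=(n_neurons, 2))  # unknown neuron-to-velocity mapping
rates = rng.poisson(lam=5.0, size=(n_samples, n_neurons)).astype(float)
velocity = rates @ true_W + rng.normal(scale=0.1, size=(n_samples, 2))

# Fit the decoder: velocity ≈ rates @ W
W, *_ = np.linalg.lstsq(rates, velocity, rcond=None)

# Decode a new spike-count vector into a cursor velocity command
new_rates = rng.poisson(lam=5.0, size=n_neurons).astype(float)
decoded_velocity = new_rates @ W
print(decoded_velocity.shape)  # (2,)
```

With enough training samples the fitted `W` closely recovers the underlying mapping; the point is how little machinery the decoding side needs.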

The brain’s natural plasticity is what allows it to learn to direct a BMI in just days. Neural rewiring occurs in order to progressively adapt the input-output feedback loop: the monkey observes the cursor on the screen (the input) and manipulates it through thought (the output).

Of course, the task at hand was relatively simple, and performance was lower than through natural motor control of the stick. But still, it is remarkable that with a relatively crude interface (sampling a handful of neurons) and a simple regression model, the task is learned to high accuracy. It’s the brain that is doing most of the heavy lifting, suggesting that there is great potential for brain computer communication.

Performance increases and training time declines as motor memory consolidates

So what lies ahead? Firstly, better physical interfacing with the brain. Current approaches are limited to crude, short-lived electrode arrays that sample perhaps hundreds of neurons in a very invasive way. Secondly, better feedback loops. One of the reasons the monkey’s performance is higher in the control stick phase of the task is that it obtains very valuable feedback, not just from the cursor on the screen, but from its own body: the position and orientation of its arm and hand. All this feedback is missing in the thought control phase. So one way forward is to not just connect to neurons for output, but also to inject input as motor feedback.

But things don’t stop there. Injecting input into the brain need not only be a matter of feedback for control, but can serve as a way to input sensory information in general. In one experiment, a rat was subjected to artificial stimuli corresponding to a virtual object, via directly stimulating the neurons responsible for processing touch information from its whiskers (yes, its whiskers, you read that right). So one could say that the rat “felt” the touch of a non-existent object. Taking the idea even further, one could conceivably create entirely new senses, not linked to a specific part of the body, in order to sense surroundings in ways that do not correspond to anything our bodies currently perceive.

This last line of thought is what I found most interesting, although I have to say it was not stated explicitly by Carmena in his talk; it is speculation extrapolating the idea of virtual objects beyond the current senses. But this is just one possibility. In general, brain-computer communication opens up a lot of possibilities, many of which currently belong to the realm of science fiction. But as the speaker said, things that are being experimented with today, like thought control, were the stuff of science fiction twenty years ago.

How I hacked a gaming site

A while back somebody showed me a casual gaming site where people play each other at a variety of card games. No money, you just play for points and to gain ranking among the thousands of players present there. The site offers a limited amount of daily free play time, beyond which you have to pay to get a subscription. I gave it a shot and played for a bit.

But then my AI instinct started thinking about how to make a computer play cards. Could I write a program that would play perfectly? And could I hack the system to obtain the stream of game data necessary for an automated player to participate in real, live games? It seemed an interesting technical challenge for a small project.

Having decided on the particular card game to play, I had to figure out how to make the computer play it. I figured that with appropriate gameplay data I could implement a minimax-type algorithm with a heuristic evaluation function that would do a decent job. So that’s what I did, with a little tuning to get the heuristic right and some basic probability judgements as to what cards could be dealt.
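The minimax part can be sketched generically; the toy game, evaluation function, and depth below are stand-in assumptions, not the actual card game or heuristic:

```python
def minimax(state, depth, maximizing, get_moves, apply_move, evaluate):
    """Plain minimax with a depth cutoff and a heuristic evaluation function."""
    moves = get_moves(state)
    if depth == 0 or not moves:
        return evaluate(state), None
    best_move = None
    if maximizing:
        best = float('-inf')
        for move in moves:
            score, _ = minimax(apply_move(state, move), depth - 1, False,
                               get_moves, apply_move, evaluate)
            if score > best:
                best, best_move = score, move
    else:
        best = float('inf')
        for move in moves:
            score, _ = minimax(apply_move(state, move), depth - 1, True,
                               get_moves, apply_move, evaluate)
            if score < best:
                best, best_move = score, move
    return best, best_move

# Toy usage: the state is a running total, each player adds 1-3,
# and the "heuristic" is simply the total reached at the cutoff.
score, move = minimax(0, 2, True,
                      get_moves=lambda s: [1, 2, 3],
                      apply_move=lambda s, m: s + m,
                      evaluate=lambda s: s)
# The maximizer adds 3, the minimizer then adds 1: score 4, first move 3.
```

In the real thing the evaluation function encodes card-game knowledge and the move generator accounts for the probabilities of unseen cards.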

Minimax algorithm for tic-tac-toe (Sussex university)

Now came the hard part: getting the program to read in the game information as it’s being played. The program has to be in sync, seeing everything that happens in order to make its moves. A human just looks at the screen and sees the cards dealt and played, the points, etc., but a program sitting on your computer can’t do that easily.

In fact, the visual route is one of three I considered. What it entails is capturing screenshots of the gaming area continuously, and then using sophisticated image recognition algorithms to detect dealt cards, played cards, scores and other visual components. It’s tricky and computationally expensive in real time. The game I selected plays very quickly, people make their moves in seconds, so it would be almost impossible in a mini project.

The second route I thought of was capturing the network traffic exchanged with the gaming site. Since all the game information has to be transmitted to the game client (in the browser), you could in theory monitor the connection and sniff out what you need. I brought up a browser console and looked at the requests and returned info. Not good. It seemed that the traffic was encoded with some scheme precisely to make it unreadable. Not something to be reverse engineered in a couple of days.

Now, something I haven’t said until now is that this gaming site uses Java applets embedded in the web browser to create the gaming client interface, i.e. the gaming window where you see cards, players, make your moves, etc. I inspected the HTML and found references to the code files (jar files). What about decompiling said code and reading the reverse engineered source to see what was going on?

Compiling (Monash university)

I ran a decompiler and started looking at the reconstructed package structure and source classes. It was a lot, and some names were cryptic, but I managed to spot some files and packages that had meaningful names. Tracing calls between 20 or so files narrowed down the search to a candidate file that looked promising: its methods represented meaningful game events, and its parameters and attributes appeared to contain the desired game state information.

But then what? Even if I could trace the code to the important locations, how could I get it to talk to my program? After all, that code could not be modified; it would come from the gaming site’s server. And even if there was some way to locally alter the client, the code would not compile after a flaky decompiling process.

There didn’t seem to be an easy way around this… except for using a tracing tool based on code instrumentation. BTrace allows you to inject code into an external JVM process at methods which you specify via type signatures. The code you can inject is subject to strong restrictions and requires you to use a special API to access the host code’s parameters and state. It’s cumbersome, but it works well. You connect to an external VM and it does its magic.

Simple BTrace script (Lennart Schedin)

What I had to do, then, was inject the key methods with code to access the important information. I could output this data as text, and then pipe it to my program which would parse the input, synchronize game state, and run its minimax algorithm. The result would be a set of possible moves and scores, along with a search tree so that you could review the computer’s thinking.

All that was left was trying it out. Which meant connecting to the gaming site, launching the tracing tool piped to the game program, and making its moves manually in the browser. I could have done a bit more work to get the program to play directly, but the main interesting bits were done.

So, how did it do? Well, the whole setup had a 90%+ winning rate. Not surprising given the card data that it streamed, combined with minimax. After a few days of free playing, the rating shot up very quickly, and could easily have reached the top spot out of approximately 8000 players.

Wrapping up, it was an interesting technical challenge for a small project.

You’re probably thinking: isn’t using a computer bot cheating? Of course it’s cheating; that’s why I call this a hack, even though it’s just the game client that got hacked. The site should plug these holes so that people can’t do this to win systematically, especially considering that it’s a paid service. Although I suspect that anybody setting up something like this (more likely nobody) is doing it for the challenge and shelves the solution as soon as it’s working.

Note: feel free to contact me for details if you think your site is the victim!

Liquid democracy and loop detection

Liquid democracy features transitive delegation: delegates themselves can delegate their votes. If voter A delegates his vote to voter B and voter B delegates his vote to voter C, then voter C casts three votes, his own plus those of A and B[1]. Or if you want to be more precise, voter C’s vote has a weight of 3, as if voters A and B had made the same choice as C.

Transitive delegation

Transitivity raises an important issue: the possibility of cycles, or loops, in the voting graph. A cycle will prevent votes entering it from being counted correctly; those votes will traverse the cycle forever. So a system supporting liquid democracy must include a mechanism to handle this possibility.

The first feature such a system must include is a loop detection procedure as part of the tallying algorithm. When running the liquid tally, the algorithm must not crash or hang due to the existence of loops. The loop must be detected and the traversal aborted. With simple caching such a liquid tally algorithm runs in time linear in the number of voters. The algorithm is pretty trivial so I won’t describe it.
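Still, a minimal sketch of such a tally, with loop detection and caching, might look as follows (the data structures and names are my own illustration, not any particular system's):

```python
def liquid_tally(delegations, choices):
    """Liquid tally with loop detection.

    delegations: voter -> delegate, for voters who delegate
    choices:     voter -> option, for voters who vote directly
    Returns option -> total weight; votes trapped in a cycle are discarded.
    """
    resolved = {}  # cache: voter -> final option, or None if lost to a cycle

    def resolve(voter):
        path, seen = [], set()
        v = voter
        while True:
            if v in resolved:           # already computed: reuse the cache
                result = resolved[v]
                break
            if v in choices:            # reached a direct voter
                result = choices[v]
                break
            if v in seen:               # cycle detected: these votes are lost
                result = None
                break
            seen.add(v)
            path.append(v)
            v = delegations[v]
        for p in path:                  # cache every voter along the path,
            resolved[p] = result        # giving overall linear running time
        return result

    tally = {}
    for voter in set(delegations) | set(choices):
        option = resolve(voter)
        if option is not None:
            tally[option] = tally.get(option, 0) + 1
    return tally

# A -> B -> C, C votes 'yes' (weight 3); D and E form a cycle, so their
# votes are discarded.
result = liquid_tally({'A': 'B', 'B': 'C', 'D': 'E', 'E': 'D'}, {'C': 'yes'})
print(result)  # {'yes': 3}
```

The cache is what makes the whole tally linear: each voter's chain is walked at most once.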

Adding loop detection to the tally ensures it completes correctly, but there are still undesirable outcomes. Let’s assume the scheme is single unconditional transitive delegation. This means that a delegation vote is unique (rather than multiple), transitive, and does not depend on vote categorisation (whatever mechanism is chosen for determining categories). Under this scheme, votes that reach a cycle in the voting graph will be lost. Although this is not fatal for the tally, it is undesirable: those voters whose choice was discarded will not be represented.

Oh oh, a loop!

So, a second mechanism must be added. We can call this vote-time loop detection. This is a mechanism that detects cycles just as they are formed, and prevents them from being established. A feedback message instructs the voter that his/her choice is invalid as it would create a loop; the user must make another selection. How does this detection algorithm work? Essentially in the same way as it does in the tally algorithm, except that it only checks for loops starting from delegation choices just as they are made by a user, on demand.
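The on-demand check reduces to walking the existing delegation chain from the proposed delegate and seeing whether it leads back to the voter. A minimal sketch (hypothetical names; `delegations` maps each voter to their current delegate):

```python
def would_create_loop(delegations, voter, delegate):
    """Check, at vote time, whether adding voter -> delegate closes a cycle.

    Follow the existing delegation chain from the proposed delegate; if it
    leads back to the voter, the new edge would create a loop.
    """
    v = delegate
    while v in delegations:
        if v == voter:
            return True
        v = delegations[v]
    return v == voter  # also catches direct self-delegation

# With A -> B -> C already in place, C delegating to A would close the loop
print(would_create_loop({'A': 'B', 'B': 'C'}, 'C', 'A'))  # True
print(would_create_loop({'A': 'B', 'B': 'C'}, 'C', 'D'))  # False
```

If the check returns `True`, the system rejects the delegation and asks the user for another selection.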

My simple liquid tool app features vote-time loop detection and marks cycles in red so that the user makes the relevant correction. In fact the algorithm is a bit more complicated than necessary because it allows the loop to be established, and must keep track of subsequent changes that revalidate the voting graph.

Finally, some extra care must be taken to ensure that vote-time loop detection is accurate, as stale data may result in inaccuracies. This is especially important in the face of concurrent modification of the voting graph as users progressively emit votes. But the problem is still pretty trivial in this scenario, things can get somewhat more complicated with more sophisticated delegation schemes.


[1] In this post I am not distinguishing voters from delegates; all voters play the double role of voter and delegate. Any voter can delegate to any other voter. This type of scheme is only applicable with public voting, where delegate accountability does not require special treatment.

Are intelligence and goals practically coupled?

There are many definitions of intelligence[1][2]. Most of them include as a central element the capacity to achieve goals, together with an ingredient of generality to distinguish it from narrowly applicable abilities. In these definitions, the goals themselves are left unspecified; their content has no bearing on whether something is considered intelligence or not. In other words, intelligence and goals are decoupled, or orthogonal.

However, definitions are just… definitions. The only requirement for a definition to be valid is logical consistency. Whether it applies to the real world as a useful concept is another matter altogether.

This brings us to consider whether, in practice, intelligence and goals are independent or not. Not only empirically, which is a question of observing existing cases of intelligences and their associated goal content, but also physically. In other words, whether intelligence and goals are constrained to correlate in physically realizable intelligences that do not yet exist. The main constraint that a physically realizable intelligence is subjected to is a limit to computational resources[3].

So, in practice, is it possible to build an intelligence with arbitrary goals? And if not, what constraints are imposed on these goals, and how do these constraints come about?

I will stop here as I think it’s not yet possible to think rigorously about these questions, although I think the questions themselves are well defined and relevant (e.g. for matters of AI safety). Here is some related reading:

Bostrom [2012] – Motivation and instrumental rationality in advanced artificial agents

Lesswrong – General purpose intelligence: arguing the Orthogonality thesis

Lesswrong – Muehlhauser-Goertzel Dialogue, Part 1

[1] Legg [2006] – A collection of definitions of intelligence

[2] I have considered intelligence from a naturalistic standpoint as an optimization process that arose in living beings to counter entropy through behavior

[3] With unlimited computational resources one could instantiate a model like AIXI where goals are not coupled and perhaps answer the question immediately

Learning and the subject-object distinction

Previously I presented a model where learning is impossible. In this post I want to emphasize

…not only must the phenomenon be learnable, but additionally the learning agent must incorporate a bias to exploit existing regularity. Without such a bias, the learner cannot penalize complex “noisy” hypotheses that fit the data.

In the model, the environment (phenomenon) that gives rise to observations in the form of the binary sequence of ‘0’ and ‘1’ is left unspecified. Nothing is said about how the environment evolves, whether it is deterministic or stochastic, or whether it follows a certain rule or not. The model I presented is thus completely orthogonal to the nature of the environment. And yet

Whatever the sequence of events, the learning agent does not gain any knowledge about the future from the past, learning is impossible.

So a property of the learning agent, the subject, makes learning impossible irrespective of the environment, the object. This property of the subject is the belief that all sequences of observations are equally likely, that is, the lack of a priori bias favoring any of the outcomes[1]. Even if the object were completely predictable, the subject would be unable to learn. Learning imposes constraints on both, hence the subject-object distinction.

To drive this point home, we could specify any environment and note how the conclusions regarding the model would not change. I hinted at this by presenting an example of the environment’s evolution that began with 111. Now assume the environment is such that it produces ‘1’ indefinitely in a completely deterministic and predictable way. You could interpret this as the classical example in philosophical treatments of induction: ‘1’ means that the sun rises the next day[2], and ‘0’ means that the sun does not rise. But again, this would make no difference, the learner would never catch on to this regularity.

Conversely, specifying an unlearnable environment will not do the learner any good either, of course. In fact, the astute reader will have realized that the learning agent’s prior corresponds exactly to the belief that the sequence of ‘0’ and ‘1’ is the result of a series of flips of a fair coin. And of course, given this assumption about the coin, previous coin flips do not yield any information that serves to make predictions about future coin flips; the environment has no structure to be learned.
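This can be made concrete with a small brute-force computation: under a uniform prior over all binary sequences (the fair-coin belief), the predictive probability of the next symbol never moves from 1/2, whatever has been observed. The sequence length below is an arbitrary choice:

```python
from itertools import product

# Uniform prior over all binary sequences of length n: every sequence is
# equally likely a priori.
n = 8
sequences = [''.join(bits) for bits in product('01', repeat=n)]  # all 2^n

def predictive(prefix):
    """P(next symbol is '1' | observed prefix) under the uniform prior."""
    consistent = [s for s in sequences if s.startswith(prefix)]
    ones = [s for s in consistent if s[len(prefix)] == '1']
    return len(ones) / len(consistent)

# Even after seeing seven '1's in a row, the prediction is still 1/2:
print(predictive('1111111'))  # 0.5
```

No matter which prefix has been observed, exactly half of the consistent sequences continue with '1', so the past never informs the future.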

Most problems of bayesian inference include, in the problem statement, a description of the environment that is automatically used as the agent’s prior knowledge, or at least a starting point, as in the typical case of drawing balls from an urn. This prior knowledge is incomplete of course, as no inference would be necessary otherwise. But in these cases the subject-object distinction is not so apparent; analysis of the agent’s learning performance assumes, of course, that the problem definition is true!

However, the subject-object distinction is more important when asking what model of inference applies to the scientific investigation of nature and the problem of induction. This is because in these models, there is no problem definition, prior knowledge is genuinely prior to any experience.

Pending questions: What problem definition applies to inductive inference in science? What happens when extending our model to cases with infinite observations/theories? Does learning logically require bias, and if so, what bias is universally appropriate and intuitively acceptable?[3]

[1] In fact, not only must there be bias, but it must be a bias that exploits structure. Altering the distribution such that it favors more ‘1’ in the sequence irrespective of previous observations is a bias, but does not allow learning. This is another important distinction, the entropy-learnability distinction.

[2] Or after 24 hours if you want to be picky about tautologies

[3] Another more technical question is, can prior knowledge in inference problems always be recast as a bias over theories that are deterministic predictions over entire sequences of possible events? (as we saw when noting that the binary sequence model is equivalent to a repeated coin flip scenario) If so, what property of these distributions allows learning?