Google does Knowledge Representation

Knowledge Representation is one of the main problems in AI. The aim is to encode knowledge formally to allow automated querying and reasoning about it, in effect allowing software agents to “know” things about the world. The exact representation may vary, but the language is usually based on some kind of logic. One fairly intuitive representation is the network, as in semantic networks, which represent concepts as nodes and relationships as links between them. Here is some knowledge expressed in this way (image from Wikipedia),

where some simple facts about animals are expressed as nodes and links.
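
To make this concrete, here is a minimal sketch of such a network in Python, with two toy inference functions that walk the links; the specific facts and relation names are my own illustration rather than the contents of the figure.

```python
# A toy semantic network: concepts are nodes, labelled links are relations.
# Facts are stored as (subject, relation, object) triples; the animal facts
# below are illustrative, not copied from the figure.
facts = {
    ("cat", "is-a", "mammal"),
    ("bear", "is-a", "mammal"),
    ("mammal", "is-a", "animal"),
    ("fish", "is-a", "animal"),
    ("mammal", "has", "vertebrae"),
    ("cat", "has", "fur"),
}

def parents(concept):
    """Concepts reachable from `concept` via a direct 'is-a' link."""
    return [o for s, r, o in facts if s == concept and r == "is-a"]

def is_a(concept, category):
    """Follow 'is-a' links transitively, e.g. cat -> mammal -> animal."""
    return (concept, "is-a", category) in facts or any(
        is_a(p, category) for p in parents(concept))

def has(concept, attribute):
    """A property holds if stated directly or inherited from a parent concept."""
    return (concept, "has", attribute) in facts or any(
        has(p, attribute) for p in parents(concept))

print(is_a("cat", "animal"))    # True, via mammal
print(has("cat", "vertebrae"))  # True, inherited from mammal
```

Even this toy version shows the appeal: a handful of links supports simple inheritance-style reasoning that plain keyword matching cannot do.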

Bringing knowledge representation to the world wide web has been a recurring ambition for many years now. This initiative is known as the Semantic Web; the aim is to provide the currently free-form information on the web in structured-data equivalents that would allow computers to extract and reason about it. Here’s a quote by Tim Berners-Lee:

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.
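
To make the idea of structured-data equivalents a bit more concrete: the data model the Semantic Web effort standardized on is RDF, in which knowledge is published as subject-predicate-object triples that agents can query (e.g. with SPARQL). Here is a minimal sketch using the rdflib Python library, with made-up URIs and property names:

```python
# What a "structured-data equivalent" looks like under the RDF data model:
# subject-predicate-object triples, queryable with SPARQL. Uses the rdflib
# library; the URIs and property names are made up for illustration.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")

g = Graph()
g.add((EX.Cat, RDF.type, EX.Mammal))
g.add((EX.Cat, EX.covering, EX.Fur))
g.add((EX.Cat, EX.commonName, Literal("cat")))

# A query a software agent could run against such data.
results = g.query("""
    SELECT ?thing WHERE { ?thing a <http://example.org/Mammal> . }
""")
for row in results:
    print(row.thing)  # -> http://example.org/Cat
```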

The Semantic Web initiative has resulted in many standards to support this vision. To date, however, the vision has not been realized, and it does not seem like it will be any time soon. Which brings us to Google’s Knowledge Graph. Note the term graph, which has the clear signature of a semantic-network-like approach to capturing knowledge. What Google is trying here is to bring some of the benefits of knowledge representation to search, without having to realize the Semantic Web vision in all its glory. Instead of waiting for the entire web to go semantic, Google will internally maintain a smaller-scale equivalent of web knowledge and use that instead. This also bypasses another of the main problems of the Semantic Web vision: expecting the “wild” web to maintain quality, coherent data. Maintaining that quality is certainly more feasible with an internal, controlled knowledge base.

So what we can expect is for Google to incrementally augment its traditional search with its Knowledge Graph. The core of search will still be keyword-based and statistical, but it will begin to recognize concepts in the data retrieved through the keyword approach, in effect going from semantic-free strings of letters (words) to concepts. Once these concepts are recognized, the Knowledge Graph can be exploited to return more detailed search results, and to present them better.
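
As a rough cartoon of that pipeline (and not a claim about how Google actually implements it), the augmentation might look something like the sketch below: keyword retrieval is unchanged, but when the query maps onto a known concept, facts attached to that concept are pulled from the graph and added to the results. The entities and fields are purely illustrative.

```python
# A cartoon of the pipeline described above, not Google's actual implementation:
# keyword retrieval stays as-is, but a recognized concept triggers a lookup in
# the knowledge graph, whose facts can then enrich the result page.
knowledge_graph = {
    "taj mahal": {
        "type": "Monument",
        "location": "Agra, India",
        "construction": "1632-1653",
    },
}

def keyword_search(query):
    # Stand-in for the traditional statistical, keyword-based ranking.
    return [f"Page matching '{query}' #{i}" for i in range(1, 4)]

def recognize_concept(query):
    # Naive concept recognition: exact match against known graph nodes.
    key = query.lower().strip()
    return key if key in knowledge_graph else None

def search(query):
    results = {"links": keyword_search(query)}
    concept = recognize_concept(query)
    if concept is not None:
        # The string was mapped to a concept: attach its structured facts.
        results["knowledge_panel"] = knowledge_graph[concept]
    return results

print(search("Taj Mahal"))
```

The point of the sketch is only the division of labor: the statistical machinery keeps doing the retrieval, and the graph is consulted once a string has been resolved to a thing.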

Will it be a substantial improvement, or just another gimmick that won’t go anywhere? Many “Google killers” have come and gone over the years, some of them promising to revolutionize search by going beyond keywords. That this has never worked is evidence of how surprisingly effective and robust keyword-based search is. But Google knows this better than anybody, and the key word in my previous paragraph is incremental.

Photorealistic graphics, a factor of 2000 away?

In this talk, Tim Sweeney discusses the long-term future of computing and its impact on society, and specifically that of 3D graphics technology. It’s something I’ve asked myself before: how far away are photorealistic realtime 3D graphics? When will games and computer-generated graphics in movies be realistic to the point of being indistinguishable from the real thing?

The route Sweeney follows to answer this question is to first ask what the limits of our perception are, in particular of our vision (eyes plus visual cortex). Any improvements beyond these limits cannot be perceived, so they establish a definitive upper bound on realism. And this bound can be determined fairly convincingly thanks to well-established knowledge of our biology and sensory apparatus. A few numbers are given, for example for the maximum resolution and frame rate we can perceive.

These numbers, together with somewhat less objective intuitions about which graphics approximations (such as lighting algorithms) are necessary to achieve realism, yield a factor of roughly 2000 times today’s hardware performance to reach the target.
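
I don’t have the talk’s exact figures at hand, but the arithmetic behind an estimate of this kind is easy to sketch. Every number below is an illustrative assumption of mine (eye-limited resolution and frame rate, a per-pixel shading budget, today’s GPU throughput), chosen only to show how the orders of magnitude combine:

```python
# Back-of-the-envelope arithmetic behind a "factor over today's hardware"
# estimate. All numbers are illustrative assumptions, not Sweeney's figures.
target_pixels      = 8000 * 4000   # assumed eye-limited resolution
target_fps         = 72            # assumed frame rate beyond which we notice no difference
shading_ops_per_px = 10_000_000    # assumed budget for "realistic" lighting, e.g. full path tracing

required_ops_per_s = target_pixels * target_fps * shading_ops_per_px  # ~2.3e16

todays_ops_per_s = 1.2e13          # assumed throughput of a high-end GPU (~12 TFLOPS)

print(required_ops_per_s / todays_ops_per_s)  # ~1.9e3, i.e. on the order of 2000x
```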

Sweeney then rightly notes that some aspects of presenting realistic graphics are not just a matter of simulating physical processes that are completely understood, but rather require theoretical advances. The main one is artificial intelligence, which is necessary to realistically simulate characters’ animation and even behavior. Still, the term photorealistic doesn’t necessarily cover aspects like these.

Getting close…

Summing up, I’d say the 2000x figure is his rough answer, and it seems to me about right. How does this translate into years? You could use Moore’s law, but I suspect that would underestimate the potential improvements in graphics performance which could advance faster thanks to the particular parallel nature of rendering. So, I’d venture circa 2020 to get a nice round figure.