Google does Knowledge Representation

Knowledge Representation is one of the main problems in AI. The aim is to encode knowledge formally to allow automated querying and reasoning about such knowledge, in effect allowing software agents to “know” things about the world. The exact representation may vary, but the language is usually based on some kind of logic. One fairly intuitive case of such a representation is the network, such as in semantic networks. These represent concepts as nodes of the network, and relationships as links. Here is some knowledge expressed in this way (image from Wikipedia)

where some simple facts about animals are expressed as nodes and links.
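To make the nodes-and-links idea concrete, here is a minimal sketch of a semantic network in Python. The facts, the relation names (`is_a`, `has`, `lives_in`) and the helper functions are illustrative assumptions, not a reproduction of any particular system. The sketch stores each link as a (subject, relation, object) triple and answers queries by following `is_a` links, so that a cat inherits what is known about mammals.

```python
from collections import defaultdict

# Illustrative facts about animals, one (subject, relation, object) triple per link.
TRIPLES = [
    ("cat", "is_a", "mammal"),
    ("bear", "is_a", "mammal"),
    ("whale", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
    ("fish", "is_a", "animal"),
    ("mammal", "has", "vertebra"),
    ("cat", "has", "fur"),
    ("bear", "has", "fur"),
    ("whale", "lives_in", "water"),
    ("fish", "lives_in", "water"),
]


def build_network(triples):
    """Index the triples as node -> relation -> set of linked nodes."""
    network = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        network[subject][relation].add(obj)
    return network


def ancestors(network, node):
    """Return every concept reachable from `node` by following is_a links."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in network.get(current, {}).get("is_a", ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def query(network, node, relation):
    """Values of `relation` for `node`, including those inherited via is_a."""
    result = set(network.get(node, {}).get(relation, ()))
    for ancestor in ancestors(network, node):
        result |= set(network.get(ancestor, {}).get(relation, ()))
    return result


network = build_network(TRIPLES)
print(query(network, "cat", "has"))
```

The query for what a cat "has" combines the cat's own link (fur) with the one inherited from mammal (vertebra), which is the simplest form of the automated reasoning mentioned above.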

Bringing knowledge representation to the World Wide Web has been a recurring ambition for many years. This initiative is known as the Semantic Web; its aim is to provide structured-data equivalents of the web's currently free-form information, so that computers can extract it and reason about it. Here’s a quote by Tim Berners-Lee:

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.

The Semantic Web initiative has produced many standards to support this vision. To date, however, the vision has not been realized, and it does not seem likely to be any time soon. Which brings us to Google’s Knowledge Graph. Note the term graph, which has the clear signature of a semantic-network-like approach to capturing knowledge. What Google is trying to do here is bring some of the benefits of knowledge representation to search without having to realize the Semantic Web vision in all its glory. Instead of waiting for the entire web to go semantic, Google will internally maintain a smaller-scale equivalent of web knowledge and use that instead. This also bypasses another of the main problems of the Semantic Web vision: expecting the “wild” web to maintain high-quality, coherent data. That is certainly more feasible with an internal, controlled knowledge base.

So what we can expect is for Google to incrementally augment its traditional search with its knowledge graph. The core of search will still be keyword-based and statistical, but it will begin to recognize concepts in the data retrieved through its keyword approach, in effect going from semantics-free strings of letters (words) to concepts. Once these concepts are recognized, the knowledge graph can be exploited both to return more detailed search results and to present them better.
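As a rough illustration of this incremental approach, the sketch below layers concept recognition on top of a plain keyword search. It is a toy under stated assumptions and says nothing about how Google actually implements this: the `KNOWLEDGE_GRAPH` entries, the `ALIASES` table, the sample `DOCUMENTS` and all function names are hypothetical, and "recognition" is reduced to simple substring matching.

```python
# Hypothetical miniature knowledge graph: concept id -> facts about the concept.
KNOWLEDGE_GRAPH = {
    "tim_berners_lee": {"label": "Tim Berners-Lee", "type": "person"},
    "semantic_web": {"label": "Semantic Web", "type": "initiative"},
}

# Hypothetical surface strings that refer to each concept.
ALIASES = {
    "tim berners-lee": "tim_berners_lee",
    "berners-lee": "tim_berners_lee",
    "semantic web": "semantic_web",
}

# Stand-in for the indexed web.
DOCUMENTS = [
    "Tim Berners-Lee proposed the Semantic Web",
    "Keyword search ranks pages statistically",
]


def keyword_search(query, documents):
    """Core step: return documents containing every query term as a substring."""
    terms = query.lower().split()
    return [doc for doc in documents if all(t in doc.lower() for t in terms)]


def recognize_concepts(text):
    """Map strings to concepts by alias lookup, trying longer aliases first."""
    lowered = text.lower()
    found = []
    for alias in sorted(ALIASES, key=len, reverse=True):
        concept_id = ALIASES[alias]
        if alias in lowered and concept_id not in found:
            found.append(concept_id)
    return found


def search(query, documents):
    """Keyword search first, then augment the hits with knowledge graph cards."""
    hits = keyword_search(query, documents)
    concept_ids = []
    for text in [query] + hits:
        for concept_id in recognize_concepts(text):
            if concept_id not in concept_ids:
                concept_ids.append(concept_id)
    cards = [KNOWLEDGE_GRAPH[concept_id] for concept_id in concept_ids]
    return {"hits": hits, "cards": cards}


result = search("berners-lee", DOCUMENTS)
print(result["hits"])
print([card["label"] for card in result["cards"]])
```

The design point is that the keyword step is left untouched: if no concept is recognized, the function simply returns the ordinary hits with an empty list of cards, so the knowledge graph can only add to the results.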

Will it be a substantial improvement, or just another gimmick that won’t go anywhere? Many “Google killers” have come and gone over the years, some of them promising to revolutionize search by going beyond keywords. That none of them has succeeded is evidence of how surprisingly effective and robust keyword-based search is. But Google knows this better than anybody, and the key word in my previous paragraph is incremental.