Suppose you want to learn about the python snake, so you go to Google and type ‘python’ as the search query. Some of the results will be about the programming language ‘Python’. Good as Google is, it can’t tell precisely which pages are about the snake. It does use many algorithms to find relationships between words in a document and to group related results together, which is why you get suggestions from Google when you enter ‘python’ as the search query.
This problem can be solved only if computers can understand human language. Even though natural language processing (NLP) algorithms are employed in many popular search engines, they still cannot reliably resolve this kind of ambiguity.
This is where the semantic web comes in: a web in which computers can understand relationships between words and objects. This edition covers the idea behind semantic networks and how to implement them.
RDF
Resource Description Framework (RDF) is a data model standardized by the World Wide Web Consortium (W3C) for enabling semantic networks. We can say that the idea had its origin in the concept of tagging. Here is a sample RDF description:
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description rdf:about="http://aasisvinayak.com/">
<dc:title>Aasis Vinayak</dc:title>
<dc:publisher>aasisvinayak.com</dc:publisher>
</rdf:Description>
</rdf:RDF>
This says that ‘Aasis Vinayak’ is the title of the page at that URL and that its publisher is aasisvinayak.com. It is a very simple case, but it conveys the information to the computer (or search engine crawler) in a semantic way.
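To see what a crawler could get out of that description, here is a minimal sketch that reads it with Python's standard `xml.etree.ElementTree` module and lists the statements it contains. The function name `extract_statements` is just an illustrative choice, and the code only handles flat `rdf:Description` blocks like the one above, not the full RDF/XML syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The sample description from the article, unchanged.
RDF_XML = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://aasisvinayak.com/">
    <dc:title>Aasis Vinayak</dc:title>
    <dc:publisher>aasisvinayak.com</dc:publisher>
  </rdf:Description>
</rdf:RDF>"""


def extract_statements(rdf_xml):
    """Return (subject, predicate, object) tuples from flat RDF/XML.

    Hypothetical helper: it assumes every property is a direct child of
    rdf:Description and holds plain text.
    """
    root = ET.fromstring(rdf_xml)
    statements = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes tags as "{namespace}local"; joining the
            # two parts gives the full predicate URI.
            namespace, local = prop.tag[1:].split("}")
            value = (prop.text or "").strip()
            statements.append((subject, namespace + local, value))
    return statements


statements = extract_statements(RDF_XML)
for statement in statements:
    print(statement)
```

Each tuple reads as "this page, has this property, with this value", which is exactly the kind of fact a search engine can store and compare across pages.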
RDFa
RDF is only a data model; to embed it in web pages we use RDFa (Resource Description Framework in Attributes). RDFa relies on two main concepts:
CURIE: You might know that URI stands for Uniform Resource Identifier. The most common kind of URI is the URL (Uniform Resource Locator), which is the address of a web page. You may have noticed that many URLs are very long and hard to remember (like this one – http://techblog.aasisvinayak.com/a-twitter-application-using-java-and-swing-tutorial/ ). In RDFa, we use an abbreviated form of URI called a Compact URI (CURIE):
foaf:name
where the first part (the prefix, here ‘foaf’) stands for another URI, and the part after the colon is appended to that URI.
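Expanding a CURIE is just a lookup plus a string join. Here is a minimal sketch, assuming the prefix table is kept in a plain dictionary; `PREFIXES` and `expand_curie` are names made up for this illustration, not part of any RDFa library.

```python
# Hypothetical prefix table: each prefix stands for a full URI.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def expand_curie(curie, prefixes):
    """Turn 'prefix:reference' into a full URI using the prefix table."""
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"cannot expand {curie!r}: unknown or missing prefix")
    # The reference is appended to the URI that the prefix stands for.
    return prefixes[prefix] + reference


print(expand_curie("foaf:name", PREFIXES))
```

So ‘foaf:name’ is only a shorter way of writing http://xmlns.com/foaf/0.1/name.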
N3 notation or Notation3: RDFa statements are triples, which means we divide all the ‘ideas and concepts’ into three parts: subject, predicate and object (just like we did in 1st grade!). Notation3 is a compact text format for writing such triples. Let’s take an example:
Vinayak loves GNU
Here ‘Vinayak’ is the subject, ‘loves’ is the predicate and ‘GNU’ is the object.
In N3, this is expressed as:
@prefix pref: <http://aasisvinayak.com/vocabulary#> .
<#vinayak> pref:loves <#GNU> .
@prefix declares ‘pref’ as shorthand for the given URI, so pref:loves expands to http://aasisvinayak.com/vocabulary#loves. Every statement ends with a ‘.’ (period). Here is another example expressed in tree form:
Vocabulary
In the previous example, we have seen the use of a ‘vocabulary’. A vocabulary defines the terms used to describe a particular subject or object. Say, if ‘vinayak’ is linked to a value through the ‘name’ term of a vocabulary, it means that the value is its name. Here is a more concrete example:
@prefix pref: <http://xmlns.com/foaf/0.1/> .
<#vinayak> pref:name "Aasis Vinayak" .
This means that ‘vinayak’ has a name and that the complete name is Aasis Vinayak.
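If you keep statements as (subject, predicate, object) tuples in a program, writing them out in N3 is mostly string formatting: a prefix name followed by a colon and the URI in angle brackets, then one line per statement. Here is a minimal sketch; the `Triple` type and the `to_n3` function are hypothetical names for this illustration, and the code covers only the small subset of N3 used in this article.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str          # a URI reference such as "#vinayak"
    predicate: str        # a CURIE such as "pref:name"
    obj: str              # a URI reference or a literal value
    literal: bool = False  # True when obj is plain text, not a URI


def to_n3(prefixes, triples):
    """Serialize prefixes and triples as N3 text (small subset only)."""
    lines = [f"@prefix {name}: <{uri}> ." for name, uri in prefixes.items()]
    for triple in triples:
        if triple.literal:
            # Literals go in double quotes; escape backslashes and quotes.
            escaped = triple.obj.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            # URI references go in angle brackets.
            obj = f"<{triple.obj}>"
        lines.append(f"<{triple.subject}> {triple.predicate} {obj} .")
    return "\n".join(lines)


document = to_n3(
    {"pref": "http://xmlns.com/foaf/0.1/"},
    [Triple("#vinayak", "pref:name", "Aasis Vinayak", literal=True)],
)
print(document)
```

Note the difference between the two kinds of object: a name is a piece of text, so it is quoted, while something like ‘GNU’ in the earlier example is a resource and goes in angle brackets.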
How to implement this?
We have seen the syntax for writing RDF statements; now let’s see how to put this into practice. For this I’m going to use the previous example itself. In XHTML, I can write:
<body xmlns:foaf="http://xmlns.com/foaf/0.1/">
<span about="#vinayak" property="foaf:name">
Aasis Vinayak
</span>
</body>
This represents the same thing. If we change the human-readable part to:
<body xmlns:foaf="http://xmlns.com/foaf/0.1/">
<span about="#vinayak" property="foaf:name">
Aasis Vinayak PG
</span>
</body>
The machine will still understand the idea – that ‘Aasis Vinayak PG’ is the full name.
There is another attribute called ‘typeof’ which can be used to specify what kind of subject or object we are dealing with:
<body xmlns:foaf="http://xmlns.com/foaf/0.1/">
<span about="#vinayak" typeof="foaf:Person" property="foaf:name">
Aasis Vinayak
</span>
</body>
This means that the subject is also a person. In short, ‘Aasis Vinayak’ is the ‘name of a person’. In this way, a computer will ‘understand’ the idea.
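To make ‘understand’ a little more concrete, here is a minimal sketch of how a crawler might pull those statements out of the markup, using Python's standard `html.parser` module. The class name `RdfaSketchParser` is made up for this illustration. It is far from a real RDFa processor: it assumes every annotated element carries its own ‘about’ attribute, has no nested elements, and it leaves CURIEs unexpanded.

```python
from html.parser import HTMLParser

# The typeof example from the article, unchanged.
FOAF_PAGE = """<body xmlns:foaf="http://xmlns.com/foaf/0.1/">
<span about="#vinayak" typeof="foaf:Person" property="foaf:name">
Aasis Vinayak
</span>
</body>"""


class RdfaSketchParser(HTMLParser):
    """Collects (subject, predicate, object) tuples from simple RDFa."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._pending = None  # (subject, property) waiting for its text
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        subject = attributes.get("about")
        if subject is None:
            return  # simplification: ignore elements without 'about'
        if "typeof" in attributes:
            self.triples.append((subject, "rdf:type", attributes["typeof"]))
        if "property" in attributes:
            self._pending = (subject, attributes["property"])
            self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumes no nested elements, so the next end tag closes the span.
        if self._pending is not None:
            subject, prop = self._pending
            self.triples.append((subject, prop, "".join(self._text).strip()))
            self._pending = None


parser = RdfaSketchParser()
parser.feed(FOAF_PAGE)
parser.close()
for triple in parser.triples:
    print(triple)
```

The human-readable text between the tags becomes the value of ‘foaf:name’, while ‘typeof’ contributes a separate statement about what kind of thing ‘#vinayak’ is.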
Let’s go one step further by using a ‘friend-of-a-friend’ (FOAF) relationship. Say,
Jane knows Mac
(I assume that you know how to divide this sentence into subject, predicate and object.) In RDFa, this can be expressed in the following way:
<body xmlns:foaf="http://xmlns.com/foaf/0.1/">
<span about="#jane" typeof="foaf:Person" property="foaf:name">
Jane Blah
</span>
<span about="#mac" typeof="foaf:Person" property="foaf:name">
Mac Blah Blah
</span>
<span about="#jane" rel="foaf:knows" resource="#mac">
Jane knows Mac
</span>
</body>
Here you can see that we have given the full names of two people, and in the third span we linked them using the relationship ‘foaf:knows’. Now, the computer ‘knows’ that ‘Jane knows Mac’.
Now imagine doing this on every page: Google could then tell precisely which pages talk about the python snake!
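Once the statements are extracted, answering a question such as "whom does Jane know?" is a simple lookup. Here is a minimal sketch, assuming the markup above has already been turned into the tuples listed in `TRIPLES` (written out by hand here); `build_index` and `names_known_by` are hypothetical helper names.

```python
from collections import defaultdict

# The statements the markup above expresses, written out by hand.
TRIPLES = [
    ("#jane", "rdf:type", "foaf:Person"),
    ("#jane", "foaf:name", "Jane Blah"),
    ("#mac", "rdf:type", "foaf:Person"),
    ("#mac", "foaf:name", "Mac Blah Blah"),
    ("#jane", "foaf:knows", "#mac"),
]


def build_index(triples):
    """Group objects by subject and then by predicate for quick lookup."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        index[subject][predicate].append(obj)
    return index


def names_known_by(index, subject):
    """Follow foaf:knows links from subject and collect foaf:name values."""
    names = []
    for friend in index[subject]["foaf:knows"]:
        names.extend(index[friend]["foaf:name"])
    return names


index = build_index(TRIPLES)
print(names_known_by(index, "#jane"))
```

The lookup goes one way only: the markup says that Jane knows Mac, not that Mac knows Jane, so asking the same question about ‘#mac’ returns nothing.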



