Buzz’s Blog: On Web 3.0 and the Semantic Web

Jul 29 2009   2:17AM GMT

The Semantic Web: RDF and SPARQL, part 4

Roger King

This posting is a continuation of the previous posting. We are discussing RDF, the “triples” language that is serving as a cornerstone of the Semantic Web effort. In this posting, we will look at SPARQL, the web language designed to search data that has been specified as RDF triples. The goal of the Semantic Web is to partly automate the searching of the Web, by using RDF to capture deeper semantics of information and SPARQL to query that information. This is in comparison to today’s technology, which does not allow us to do much more than search for individual words in the text of webpages.

From the last posting.

Here is a piece of the RDF code from the previous posting:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:zx="http://www.someurl.org/zx/">

  <rdf:Description rdf:about="http://www.awebsite.org/index.html">

    <zx:created-by>http://www.anotherurl.org/buzz</zx:created-by>

  </rdf:Description>

</rdf:RDF>

This can be interpreted as: the webpage at www.awebsite.org/index.html was created by Buzz.
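To make the "RDF is just triples" reading concrete, here is a minimal Python sketch that pulls the triple out of the RDF/XML using only the standard library. It is an illustration, not a full RDF parser: it assumes the simple rdf:Description layout shown above, the function name extract_triples is made up for this example, and the embedded copy of the RDF uses the standard xmlns attribute and straight quotes so that an XML parser will accept it.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A copy of the RDF/XML example, written so an XML parser accepts it.
RDF_XML = """
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:zx="http://www.someurl.org/zx/">
  <rdf:Description rdf:about="http://www.awebsite.org/index.html">
    <zx:created-by>http://www.anotherurl.org/buzz</zx:created-by>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for simple rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree stores a qualified tag as "{namespace}localname";
            # joining the two parts gives the full address of the predicate.
            namespace, local_name = prop.tag[1:].split("}")
            obj = (prop.text or "").strip()
            triples.append((subject, namespace + local_name, obj))
    return triples


triples = extract_triples(RDF_XML)
for triple in triples:
    print(triple)
```

The single tuple it prints has the page address as the subject, the zx term as the predicate, and Buzz's address as the object.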

Another representation of RDF-based information: 3 triples.

We see from the above that RDF simply represents triples. We could simplify it even more as:

http://www.awebsite.org/index.html was created by Buzz

Part of the reason that the original RDF code above is so much more complex is that the full syntax lets us specify that we are using terms that are defined at specific web addresses. This allows people to use standardized terms and greatly enhances the specificity of an RDF specification. The full syntax also allows us to reference pieces of information that reside on the Web. (See the previous three postings, 1, 2, 3.)
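As a rough illustration of what "terms defined at specific web addresses" means in practice, the sketch below expands a short name such as zx:created-by into a full address by looking up its prefix. The prefix table is copied from the RDF example; the helper name expand is hypothetical and exists only for this posting.

```python
# Prefix table taken from the RDF example; "expand" is a made-up helper name.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "zx": "http://www.someurl.org/zx/",
}


def expand(qualified_name, prefixes=PREFIXES):
    """Turn 'prefix:term' into the full web address that identifies the term."""
    prefix, _, term = qualified_name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + term


full_name = expand("zx:created-by")
print(full_name)
```

Two documents that both expand their term to the same address are, by construction, talking about the same term, which is where the added precision comes from.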

Before we launch into a SPARQL example, we need to make an important distinction between syntax and semantics. The code above is written in a particular syntax for RDF, one that uses XML. We note that because syntax needs to be very precise, it tends to be verbose. This can cause syntax to obscure the conceptual simplicity of the underlying semantics, or meaning.

But this isn’t the only way to specify RDF triples. Let’s look at some information that is much simpler, and at the same time, let’s look at using a different syntax for specifying RDF-like triples. Here are three triples:

<http://awebsite.org/> was-created-by "Buzz"

<http://awebsite.org/> was-created-by "Suzy"

<http://anotherwebsite.org/> was-created-by "Alice"

This is a very simple program. It consists of two triples that say that a website named awebsite was created by Buzz and Suzy, and another triple that says that Alice created a website called anotherwebsite. We are not saying that was-created-by is a widely used term; it may have been invented only for this particular RDF specification, and its meaning would therefore not be precise. We can only interpret it from our general understanding of English words. We also have no idea who these people Buzz and Suzy and Alice are, and we have no other information about them.
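One way to see how little there is to these triples is to read them into plain Python tuples. The sketch below assumes the informal layout used in this posting (an address in angle brackets, a bare term, a quoted name); it is not a parser for any official RDF syntax, and the names TRIPLE_PATTERN and parse_triples are invented for the example.

```python
import re

TRIPLE_TEXT = """
<http://awebsite.org/> was-created-by "Buzz"
<http://awebsite.org/> was-created-by "Suzy"
<http://anotherwebsite.org/> was-created-by "Alice"
"""

# <subject> predicate "object", the informal layout used in this posting.
TRIPLE_PATTERN = re.compile(r'<\s*(\S+?)\s*>\s+(\S+)\s+"([^"]*)"')


def parse_triples(text):
    """Return a list of (subject, predicate, object) tuples, skipping blank lines."""
    triples = []
    for line in text.splitlines():
        match = TRIPLE_PATTERN.fullmatch(line.strip())
        if match:
            triples.append(match.groups())
    return triples


parsed = parse_triples(TRIPLE_TEXT)
for subject, predicate, obj in parsed:
    print(subject, predicate, obj)
```

Each tuple carries exactly the three parts of a triple and nothing more, which matches the point that we know nothing else about Buzz, Suzy, or Alice.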

SPARQL: searching triples distributed across the Web.

Now, here is a piece of code:

prefix website1: <http://awebsite.org/>
SELECT ?x
WHERE
{ website1: was-created-by ?x }

We’re getting very close to real SPARQL, by the way, and if you know SQL, you can see the strong similarity. But syntax is not our issue here. We’re trying to look at concepts.

This code will find the creators of http://awebsite.org. You could imagine that there are actually many thousands of these triples, and that they tell us who built a large number of different websites. Now, we see the power of this query. It will search through all of these triples and find the two of interest to us, and then pluck off the names of the creators.
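The idea behind the query can be sketched in a few lines of Python: hold some parts of a triple fixed, leave one part as a variable, and collect whatever the variable matches. This is only a toy stand-in for what a SPARQL engine does, written under the assumption that triples are plain tuples and that variables are strings starting with a question mark; the function name select is hypothetical.

```python
TRIPLES = [
    ("http://awebsite.org/", "was-created-by", "Buzz"),
    ("http://awebsite.org/", "was-created-by", "Suzy"),
    ("http://anotherwebsite.org/", "was-created-by", "Alice"),
]


def select(variable, pattern, triples):
    """Return the values bound to `variable` by every triple matching `pattern`."""
    results = []
    for triple in triples:
        bindings = {}
        for wanted, actual in zip(pattern, triple):
            if wanted.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if bindings.setdefault(wanted, actual) != actual:
                    break
            elif wanted != actual:
                break
        else:
            # Reached only when every part of the pattern matched.
            if variable in bindings:
                results.append(bindings[variable])
    return results


creators = select("?x", ("http://awebsite.org/", "was-created-by", "?x"), TRIPLES)
print(creators)
```

Moving the variable to a different position asks a different question: a pattern of ("?site", "was-created-by", "Alice") would ask which sites Alice created.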

In fact, these triples could be distributed all around the Web, and we could imagine a search engine taking this query and running it everywhere on the Web where was-created-by triples are stored, and then having it bring back all the creators of awebsite, even if there are a hundred developers, and even if these names are spread around the Internet.
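Here is a small sketch of that distributed picture, with in-memory lists standing in for triple stores at different places on the Web. The store names and the function creators_of are made up for illustration, and a real system would contact remote servers rather than loop over a dictionary.

```python
# Hypothetical stand-ins for triple stores hosted at different places on the Web.
STORES = {
    "store-a": [("http://awebsite.org/", "was-created-by", "Buzz")],
    "store-b": [
        ("http://awebsite.org/", "was-created-by", "Suzy"),
        ("http://anotherwebsite.org/", "was-created-by", "Alice"),
    ],
    # The same fact may be recorded in more than one place.
    "store-c": [("http://awebsite.org/", "was-created-by", "Buzz")],
}


def creators_of(site, stores):
    """Ask every store for the creators of `site` and merge the answers."""
    found = []
    for triples in stores.values():
        for subject, predicate, obj in triples:
            if subject == site and predicate == "was-created-by":
                found.append(obj)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(found))


all_creators = creators_of("http://awebsite.org/", STORES)
print(all_creators)
```

The merge step matters because the same triple can turn up in several stores, so the combined answer should list each creator once.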

Next, the bigger issue.

In the next posting, we’ll look more closely at SPARQL. One thing we will consider is why it does look so much like SQL. There is a powerful reason for this that has to do with searching information in general.

