<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Buzz’s Blog: On Web 3.0 and the Semantic Web &#187; triples</title>
	<atom:link href="http://itknowledgeexchange.techtarget.com/semantic-web/tag/triples/feed/" rel="self" type="application/rss+xml" />
	<link>http://itknowledgeexchange.techtarget.com/semantic-web</link>
	<description>Defining the necessary skills for future software professionals</description>
	<lastBuildDate>Sun, 16 Dec 2012 04:42:23 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>Web privacy and the Vast Machine: You ain&#8217;t seen nothing yet.</title>
		<link>http://itknowledgeexchange.techtarget.com/semantic-web/web-privacy-and-the-vast-machine-you-aint-seen-nothing-yet/</link>
		<comments>http://itknowledgeexchange.techtarget.com/semantic-web/web-privacy-and-the-vast-machine-you-aint-seen-nothing-yet/#comments</comments>
		<pubDate>Tue, 01 Dec 2009 03:29:22 +0000</pubDate>
		<dc:creator>Roger King</dc:creator>
				<category><![CDATA[assertions]]></category>
		<category><![CDATA[inferences]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[SPARQL]]></category>
		<category><![CDATA[triples]]></category>
		<category><![CDATA[Web 2.0]]></category>
		<category><![CDATA[Web 3.0]]></category>

		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/semantic-web/web-privacy-and-the-vast-machine-you-aint-seen-nothing-yet/</guid>
		<description><![CDATA[This blog is dedicated to the Semantic Web and Web 2.0/3.0 technology. In this posting, we consider privacy and the Semantic Web. The Traveler had it easy. There is a series of three science fiction novels by John Twelve Hawks. They concern a “Traveler” who battles the “Vast Machine”, which is a global grid of [...]]]></description>
				<content:encoded><![CDATA[<p>This blog is dedicated to the Semantic Web and Web 2.0/3.0 technology.  In this posting, we consider privacy and the Semantic Web.</p>
<p><strong>The Traveler had it easy.</strong></p>
<p>There is a series of three science fiction novels by John Twelve Hawks.  They concern a “Traveler” who battles the “Vast Machine”, which is a global grid of security cameras, governmental and corporate databases, and computers that collect information on people, track them, and manipulate society.  They are very popular novels.</p>
<p>But these books are not all that imaginative. </p>
<p>Why not?  If and when the Semantic Web ever emerges (please see previous postings of this blog), there will be a lot more than security camera footage and passive database systems out there.  In his books, Twelve Hawks describes programmers working for the Vast Machine who pull information out of databases and plant information in databases, and who somehow locate and integrate information from many sources.  It’s not clear how they do it.</p>
<p>The problem is tractability.  Extracting the meaning of data (its “semantics”) is extremely difficult, and given today’s Web, it is a highly manual, painstaking, and ultimately intractable problem.  Twelve Hawks’ Vast Machine isn’t all that much of a threat.</p>
<p><strong>Consider, however, the emerging Semantic Web.</strong></p>
<p>The whole idea of the Semantic Web, on the other hand, is to make databases proactive, to let them announce their content by using globally accepted standards.  In this blog, we have looked at one proposed standard, called <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-1/">RDF</a>, which is based on “triples” that interrelate information, and a Web-hopping query language called <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-4/">SPARQL</a> that can concatenate triples that define information at diverse, independently-created websites &#8211; thus inferring new information.  We’ve looked at the <a href="http://itknowledgeexchange.techtarget.com/semantic-web/a-real-world-look-at-the-semantic-web/">beginnings </a>of this technology as it is taking form on the Web.</p>
<p>In other words, it might not be long at all before the least of our problems would be dastardly hackers who break into databases and pluck information &#8211; because the finding, integrating, and interpreting of data from highly divergent sources will become, in large part, automatic.</p>
<p>It will make the intractable quite tractable.</p>
<p><strong>Okay, I confess&#8230;<br />
</strong><br />
It is not as simple as that, of course, and I am grossly overstating the danger.  Presumably, private databases belonging to corporations and governments will not be loaded up with this sort of semantic metadata and placed on the open Web.  And the sorts of inferences that can be made by unifying metadata from multiple sites will be fairly low-level, leaving a lot of difficult work for any Vast Machine that wants to manipulate our every move and thought.</p>
<p><strong>Still&#8230;<br />
</strong><br />
It is true, though, that the potential for misuse will increase sharply.  There will indeed be many isolated instances where innocently posted information from two or more sites will be automatically linked together because of uniformly-specified metadata.  If one triple at one site has data marked up as “People OWN Kinds-of-StampCollections”, and another site says that “Kinds-of-StampCollections HAVE Certain-Values”, a thief who knows little about philately might learn that Bob owns stamps from the Southern Confederacy, and that stamps from the Southern Confederacy are worth hundreds of thousands of dollars&#8230;</p>
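<p>To make that linking step concrete, here is a minimal Python sketch.  The two site lists, the OWNS and HAVE-VALUE terms, and the link function are all made up for illustration; a real system would work on published RDF rather than hand-written tuples.</p>

```python
# Hypothetical triples published independently by two unrelated sites.
site_a = [("Bob", "OWNS", "ConfederateStamps")]
site_b = [("ConfederateStamps", "HAVE-VALUE", "hundreds of thousands of dollars")]


def link(first, second):
    """Chain triples whose object in `first` is a subject in `second`."""
    linked = []
    for subj, pred, obj in first:
        for subj2, pred2, obj2 in second:
            if obj == subj2:
                linked.append((subj, pred, obj, pred2, obj2))
    return linked


for row in link(site_a, site_b):
    print(" ".join(row))
```

<p>Neither site says anything sensitive on its own.  The link only appears because both sites happen to use the same term for the stamp collection.</p>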
<p>Just a thought for the next sci-fi writer.</p>
]]></content:encoded>
			<wfw:commentRss>http://itknowledgeexchange.techtarget.com/semantic-web/web-privacy-and-the-vast-machine-you-aint-seen-nothing-yet/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Dynamic pages, hidden data, and inferred information: the danger of scale.</title>
		<link>http://itknowledgeexchange.techtarget.com/semantic-web/dynamic-pages-hidden-data-and-infered-information-the-danger-of-scale/</link>
		<comments>http://itknowledgeexchange.techtarget.com/semantic-web/dynamic-pages-hidden-data-and-infered-information-the-danger-of-scale/#comments</comments>
		<pubDate>Fri, 18 Sep 2009 03:02:28 +0000</pubDate>
		<dc:creator>Roger King</dc:creator>
				<category><![CDATA[assertions]]></category>
		<category><![CDATA[databases]]></category>
		<category><![CDATA[dynamic pages]]></category>
		<category><![CDATA[hidden web content]]></category>
		<category><![CDATA[inferences]]></category>
		<category><![CDATA[next generation search engines]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[smart search engines]]></category>
		<category><![CDATA[static pages]]></category>
		<category><![CDATA[triples]]></category>

		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/semantic-web/dynamic-pages-hidden-data-and-infered-information-the-danger-of-scale/</guid>
		<description><![CDATA[The good and bad sides of the powerful Semantic Web. So what happens when the Semantic Web is here? It’s supposed to largely automate the process of searching the Web by allowing us to attach machine-readable assertions (perhaps by using RDF) to information posted on the Web. Then, instead of us poor flailing humans having [...]]]></description>
				<content:encoded><![CDATA[<p><strong>The good and bad sides of the powerful Semantic Web.<br />
</strong><br />
So what happens when the <a href="http://itknowledgeexchange.techtarget.com/semantic-web/what-do-we-mean-by-semantic-web/">Semantic Web</a> is here?  It’s supposed to largely automate the process of searching the Web by allowing us to attach machine-readable assertions (perhaps by using <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-1/">RDF</a>) to information posted on the Web.  Then, instead of us poor flailing humans having to painstakingly chase down countless URLs until we get what we want, smart search engines would be able to find precisely what we want in a single shot.</p>
<p>There is an obvious danger to all of this.  The new Web will scale, in both good ways and bad.  I am certainly not the first person to point out that the smarter the Web, the easier it will be for software to peruse the Web and dig up personal information about us.  There will be software that carefully crafts spam email ads that target our vulnerabilities and our preferences.  Websites will dynamically create webpages that target us individually, as well.  When we shop online, when we read news, when we make social connections online, the Web will be disarmingly efficient and effective, and this leaves lots of room for fraud and manipulation.</p>
<p>This is already happening to a significant degree, and most of us are aware of it.</p>
<p><strong>The no-longer-hidden database factor.<br />
</strong><br />
There is something more subtle about all of this, however.  One of the most difficult things to do with traditional Web technology is to expose the content of databases to Web visitors.  That’s because the pages that deliver up content pulled from databases are highly dynamic in nature, and so it is very hard for web designers to make search engines (like Google) find and index the content of these databases.  There are simple and somewhat effective things web designers can do, like creating static pages that contain terms that are meant to draw web visitors to their sites.  These pages are not “destination” pages; rather, they exist only as a way of advertising the information  contained in databases. </p>
<p>In the future, RDF assertions (and other machine-readable content) will be added to websites, and they will serve as far more effective draws.</p>
<p>But what about privacy?  Will web designers inadvertently facilitate fraud and identity theft by enabling the automatic cross-referencing of detailed information existing in databases that have been built and deployed on the Web in isolation?  This capability is at the heart of the Semantic Web effort.  Information that right now can only be obtained by individual users manipulating individual web interfaces will be discoverable by smart search engines.  </p>
<p><strong>The real problem: it will scale.<br />
</strong><br />
This is a big deal.  It’s not just that previously hidden information will now be <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-revealing-hidden-data/">discoverable</a>.  Because standardized terms and assertions will be used to describe information in databases, smart search engines will be able to automatically interrelate data from otherwise unrelated database systems. When information from multiple places is integrated, new information is effectively created.  </p>
<p>For a moment, let’s forget about databases and look at a simple example of information that might be stored statically in two websites.  Here is an example adapted from the previous posting of this blog:</p>
<p>Assertion 1: Joe <em><strong>is</strong></em> tall for an athlete.<br />
Assertion 2: Tall athletes <em><strong>should try out for</strong></em> basketball.</p>
<p>A new inference: Joe <em><strong>should try out for</strong></em> basketball.  </p>
<p>The point here is that this new inference can be derived automatically, without the intervention of a human being.</p>
<p>We noted in the previous posting that the information about Joe and the information about basketball might be on different websites.  These websites could easily have been built independently.  But a key notion, the semantics of the word “tall” in the context of basketball, is what allows this information to be automatically integrated.  Another site might point out that Timmy is tall for a kindergarten student, but this would not trigger the suggestion that Timmy try out for the NBA.</p>
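<p>Here is a small Python sketch of why the context attached to “tall” matters.  The sports: and kindergarten: prefixes, the rule format, and the infer function are assumptions invented for this example, not part of any standard.</p>

```python
# Each term carries the (hypothetical) namespace it was defined in.
assertions = [
    ("Joe", "is", "sports:tall"),
    ("Timmy", "is", "kindergarten:tall"),
]
rules = [
    # (required object, inferred predicate, inferred object)
    ("sports:tall", "should try out for", "basketball"),
]


def infer(assertions, rules):
    """Apply each rule only when the namespaced term matches exactly."""
    inferred = []
    for subj, _pred, obj in assertions:
        for required, new_pred, new_obj in rules:
            if obj == required:
                inferred.append((subj, new_pred, new_obj))
    return inferred


print(infer(assertions, rules))
```

<p>Only Joe triggers the rule.  Timmy is “tall” in a different namespace, so the two terms never match.</p>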
<p>Now, let’s get back to database systems, these things that can contain countless terabytes of personal information.  Perhaps there is a database at one site containing information about many thousands of athletes.  Perhaps there are hundreds or thousands of such sites.  The Semantic Web would allow us to find tall athletes without having to know in advance what databases around the world have this sort of data inside them, data that previously could only have been extracted through tedious, time-consuming human/computer interaction.  Now, a high school counselor or a sports agent looking for new clients can be far more effective at their jobs.</p>
<p>Or, maybe it’s a drug company matching potential customers up with expensive drugs targeted toward specific diseases, or toward people who might have vague symptoms of various diseases, and who might be easily convinced they are sick.  Or a con artist looking to scam elderly people who are likely to have dementia.  </p>
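<p>As a rough sketch of what searching many sources at once might look like, consider the Python below.  The source names, their contents, and the find_subjects function are hypothetical; real databases would sit behind their own query interfaces.</p>

```python
# Hypothetical sources; in practice each would be a separate site or database.
sources = {
    "school-db": [("Joe", "is", "sports:tall"), ("Ann", "is", "sports:fast")],
    "club-db": [("Maria", "is", "sports:tall")],
    "news-site": [("Timmy", "is", "kindergarten:tall")],
}


def find_subjects(sources, predicate, obj):
    """Return (source, subject) pairs for every triple matching predicate and object."""
    hits = []
    for name, triples in sources.items():
        for subj, pred, o in triples:
            if pred == predicate and o == obj:
                hits.append((name, subj))
    return hits


print(find_subjects(sources, "is", "sports:tall"))
```

<p>The caller never names a particular database.  Every source that uses the shared term is searched the same way, which is exactly what makes the approach scale.</p>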
<p>Or &#8211; well, get it?  The Semantic Web will <em><strong>scale</strong></em> because it will have access to huge databases, and not just a world wide web of static pages.  That’s the danger.</p>
]]></content:encoded>
			<wfw:commentRss>http://itknowledgeexchange.techtarget.com/semantic-web/dynamic-pages-hidden-data-and-infered-information-the-danger-of-scale/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Real-World Look at the Semantic Web, part 2</title>
		<link>http://itknowledgeexchange.techtarget.com/semantic-web/real-world-look-at-the-semantic-web-part-2/</link>
		<comments>http://itknowledgeexchange.techtarget.com/semantic-web/real-world-look-at-the-semantic-web-part-2/#comments</comments>
		<pubDate>Wed, 09 Sep 2009 18:02:18 +0000</pubDate>
		<dc:creator>Roger King</dc:creator>
				<category><![CDATA[assertions]]></category>
		<category><![CDATA[inferences]]></category>
		<category><![CDATA[information]]></category>
		<category><![CDATA[namespaces]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[SPARQL]]></category>
		<category><![CDATA[triples]]></category>
		<category><![CDATA[URI's]]></category>
		<category><![CDATA[wikis]]></category>

		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/semantic-web/real-world-look-at-the-semantic-web-part-2/</guid>
		<description><![CDATA[This blog is dedicated to the study of emerging Web technology, in particular, ongoing research and development aimed at building software tools that will underlie the emerging Semantic Web. Last time, we looked at DBpedia, something that a former graduate student at my university, Greg Ziebold, pointed me toward. The Semantic MediaWiki. In this posting, [...]]]></description>
				<content:encoded><![CDATA[<p>This blog is dedicated to the study of emerging Web technology, in particular, ongoing research and development aimed at building software tools that will underlie the emerging <a href="http://itknowledgeexchange.techtarget.com/semantic-web/tag/language-syntax-and-semantics/">Semantic Web</a>.  Last time, we looked at <a href="http://wiki.dbpedia.org/About">DBpedia</a>, something that a former graduate student at my university, Greg Ziebold, pointed me toward.  </p>
<p><strong>The Semantic MediaWiki.</strong></p>
<p>In this posting, we look at the Semantic MediaWiki, something else that Greg told me about.  It is an extension of MediaWiki, the application that the Wikipedia is built on.  You can learn all about it at the <a href="http://semantic-mediawiki.org/wiki/Help:Introduction_to_Semantic_MediaWiki">Semantic MediaWiki website</a>.  The idea behind Semantic MediaWiki is to provide a more powerful wiki tool, namely one that supports more than just human-readable things like text and images.  </p>
<p><strong>RDF and namespaces: creating machine-readable, web-based information.</strong></p>
<p>The idea is to allow entries in wikis that contain machine-readable information, so that searching can be performed in a largely automatic fashion.  Specifically, the Semantic MediaWiki allows users to export information from a wiki in <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-1/">RDF</a> format.  An RDF specification consists of “triples” that form “assertions”.  Consider the following:</p>
<p>Assertion 1: Joe <em><strong>is</strong></em> tall.<br />
Assertion 2: Tall People <em><strong>should try out for</strong></em> Basketball.</p>
<p>The idea is for terms in triples (“Joe”, “tall”, “is”, “Tall People”, etc.) to be taken from predefined and globally accessible <a href="http://itknowledgeexchange.techtarget.com/semantic-web/namespaces-and-the-semantic-web/">namespaces.</a>  This would ensure that everyone who uses a given term (like “tall” or “should try out for”) will have the same meaning in mind.  In this way, rather than having to painfully search for information that pertains to Tall People, for example, a smart search engine could do the searching for us.</p>
<p><strong>Building locally, growing globally.<br />
</strong><br />
There is more to this.  These namespaces can be available on the Web, and RDF statements can point to the relevant namespaces.  This means that software searching the Web, and processing these triples, can easily find the relevant namespaces.  </p>
<p>Also, the things on the left and right sides of a triple (like “Joe” and “tall”) can themselves be <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-2/">Web-based resources.</a>  This means that information scattered around the Web can be interconnected &#8211; but all the work can be done locally.  No one has to manually integrate millions of websites.  The job can be done little by little, in a quiet way, as people start to store their information in an RDF-compatible fashion.</p>
<p>This is how the Semantic Web will scale.  Everyone will use shared namespaces and shared protocols like RDF.  This will, in essence, turn the Web into one big website that can be searched in a partly automatic fashion.</p>
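<p>As a rough illustration of how a short term can point back to a shared namespace on the Web, here is a Python sketch.  The prefixes and the example.org addresses are placeholders invented for this posting, not real vocabularies.</p>

```python
# Hypothetical namespace prefixes; the URLs are placeholders, not real vocabularies.
PREFIXES = {
    "people": "http://example.org/people#",
    "bb": "http://example.org/basketball#",
}


def expand(term):
    """Turn a prefixed name such as 'bb:tall' into a full web address."""
    prefix, _, local = term.partition(":")
    return PREFIXES[prefix] + local


triple = ("people:Joe", "bb:is", "bb:tall")
print(tuple(expand(t) for t in triple))
```

<p>Two sites that both expand “bb:tall” to the same address are, by construction, talking about the same notion of tall.</p>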
<p><strong>SPARQL: querying RDF-based information.</strong></p>
<p>How will we interrelate data scattered around the Web?</p>
<p>There is a query language out there, called <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-1/">SPARQL</a>, that can be used to search the Web.  SPARQL can follow RDF connections around the globe.  How is this done?  It has to do with being able to “infer” new things.  Consider a fact that can  be automatically deduced from the two assertions above:</p>
<p>A new inference: Joe <em><strong>should try out for</strong></em> Basketball.  </p>
<p>Assertion 1 could be on a server in Detroit, and assertion 2 could be on a server in Miami, and SPARQL could do the job of making the leap that leads to the new inference.</p>
<p>This means that we could figure out what Joe should be doing right now without having to find the two pieces of information manually (the fact that he is tall, and that tall people should play basketball), and  without having to make the inference ourselves.  </p>
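<p>The sketch below is a toy stand-in, written in Python, for the kind of pattern matching a query engine performs over triples.  It is not SPARQL itself.  The detroit and miami lists, the match and query functions, and the convention that names starting with a question mark are variables are all assumptions made for illustration, and the example uses “tall” as the shared term in both assertions to keep things simple.</p>

```python
# Hypothetical triples held on two different servers.
detroit = [("Joe", "is", "tall")]
miami = [("tall", "should try out for", "Basketball")]


def match(pattern, triple, bindings):
    """Match one pattern against one triple; terms starting with '?' are variables."""
    new = dict(bindings)
    for p, value in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or reject if it is already bound to something else.
            if new.setdefault(p, value) != value:
                return None
        elif p != value:
            return None
    return new


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    results = [{}]
    for pattern in patterns:
        next_results = []
        for bindings in results:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_results.append(extended)
        results = next_results
    return results


merged = detroit + miami
patterns = [("?who", "is", "?trait"), ("?trait", "should try out for", "?sport")]
for b in query(patterns, merged):
    print(b["?who"], "should try out for", b["?sport"])
```

<p>The second pattern reuses the variable bound by the first, and that shared variable is what carries the result from one server's triple to the other's.</p>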
<p>This is a big deal.  This sort of automation is what the Semantic Web is all about.</p>
<p><strong>So what do real people do with the Semantic MediaWiki?  We’ll look at this next.</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://itknowledgeexchange.techtarget.com/semantic-web/real-world-look-at-the-semantic-web-part-2/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A Real-World Look at the Semantic Web, part 1</title>
		<link>http://itknowledgeexchange.techtarget.com/semantic-web/a-real-world-look-at-the-semantic-web/</link>
		<comments>http://itknowledgeexchange.techtarget.com/semantic-web/a-real-world-look-at-the-semantic-web/#comments</comments>
		<pubDate>Mon, 31 Aug 2009 03:40:04 +0000</pubDate>
		<dc:creator>Roger King</dc:creator>
				<category><![CDATA[assertions]]></category>
		<category><![CDATA[databases]]></category>
		<category><![CDATA[inferences]]></category>
		<category><![CDATA[information]]></category>
		<category><![CDATA[knowledge]]></category>
		<category><![CDATA[namespaces]]></category>
		<category><![CDATA[ontologies]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[SPARQL]]></category>
		<category><![CDATA[triples]]></category>
		<category><![CDATA[wikis]]></category>

		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/semantic-web/a-real-world-look-at-the-semantic-web/</guid>
		<description><![CDATA[This blog is dedicated to the study of emerging Web technology, in particular, ongoing research and development aimed at building software tools that will underlie the emerging Semantic Web. In this posting, we look at a little-known website that has the potential of setting the pace for the developers of the Semantic Web. DBpedia. It’s [...]]]></description>
				<content:encoded><![CDATA[<p>This blog is dedicated to the study of emerging Web technology, in particular, ongoing research and development aimed at building software tools that will underlie the emerging <a href="http://itknowledgeexchange.techtarget.com/semantic-web/tag/language-syntax-and-semantics/">Semantic Web</a>.  In this posting, we look at a little-known website that has the potential of setting the pace for the developers of the Semantic Web.</p>
<p><strong>DBpedia.</strong></p>
<p>It’s called <a href="http://wiki.dbpedia.org/About">DBpedia</a>.  A former graduate student at my university, Greg Ziebold, pointed me toward it.  The goal of the DBpedia is to transform data from the Wikipedia into a chunk of the Semantic Web.  To do this, DBpedia is using <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-1/">RDF</a> technology, something we have discussed in past postings of this blog.  Behind RDF is an extremely simple concept, but one that has proven extremely powerful and versatile.</p>
<p>The general idea is to break knowledge up into “triples” that describe relationships between pieces of information.  These triples can be chained together to discover new relationships.  And, importantly, triples must make use of widely shared sets of terminology, called <a href="http://itknowledgeexchange.techtarget.com/semantic-web/namespaces-and-the-semantic-web/">namespaces</a>, in order for knowledge from different places on the Web to be properly chained together.</p>
<p><strong>RDF, triples, assertions, and inferences.</strong></p>
<p>A thorough example can be found in a <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-2/">previous posting</a> of this blog.</p>
<p>Here is a very simple example of triples (also known as “assertions”) and how they can be put together into “inferences”.</p>
<p>Assertion 1: Joe <em><strong>is</strong></em> tall.<br />
Assertion 2: Tall People <em><strong>should try out for</strong></em> Basketball.<br />
A new inference: Joe <em><strong>should try out for</strong></em> Basketball.</p>
<p>Keep in mind that we would want to make sure that the words used in these assertions have precise, global meanings.   We might take the terms in these two assertions from a basketball namespace, one that would carefully dictate exactly what “tall” means in the basketball world.  Certainly, it would be quite different from the meaning of “tall” in a kindergarten namespace.</p>
<p><strong>More on DBpedia.</strong></p>
<p>There’s a fancy word for sets of triples that use namespaces and represent various areas of knowledge.  They are called “<a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-5/">ontologies</a>”, taken from the term used by philosophers to argue about the existence of various things, like God.  The DBpedia is essentially a vast ontology, formed from triples and namespaces.  Most of the knowledge defined by this ontology comes from the Wikipedia.  The folks behind the DBpedia have been given direct access to the flow of information into the Wikipedia, so that the DBpedia can stay current.</p>
<p>One way to look at the DBpedia is that it takes the Wikipedia and reforms it into something that can be searched far more effectively.  Right now, to search the Wikipedia, most of us simply type in terms (either into Google/Yahoo or into the Wikipedia search page).  We try various terms and follow links inside the Wikipedia until we find what we think we are looking for.  With the DBpedia, users can search with <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-4/">SPARQL</a>, a language based on the structure of SQL and engineered specifically for searching large bases of triples.  SPARQL allows us to traverse networks that consist of triples linked by inferences.</p>
<p>That way, if we were a coach looking for promising candidates for our team, we would use SPARQL to make the connection between Joe being tall and the fact that tall people should try out for basketball.  This is clearly much faster and more accurate than googling things like “tall”, “basketball”, etc, until we happened to find Joe in one of the web pages that pop up.</p>
<p>The DBpedia website, by the way, claims to have a triple base that consists of 274 million RDF triples.</p>
<p><em><strong>More on this in the next posting.<br />
</strong></em></p>
]]></content:encoded>
			<wfw:commentRss>http://itknowledgeexchange.techtarget.com/semantic-web/a-real-world-look-at-the-semantic-web/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Semantic Web: RDF and SPARQL, part 5</title>
		<link>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-5/</link>
		<comments>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-5/#comments</comments>
		<pubDate>Wed, 05 Aug 2009 21:39:01 +0000</pubDate>
		<dc:creator>Roger King</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[information]]></category>
		<category><![CDATA[knowledge]]></category>
		<category><![CDATA[ontologies]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[the Semantic Web]]></category>
		<category><![CDATA[triples]]></category>

		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-5/</guid>
		<description><![CDATA[This posting is a continuation of the previous posting. We are discussing RDF, the &#8220;triples&#8221; language that is serving as a cornerstone of the Semantic Web effort. The goal of the Semantic Web is to partly automate the searching of the Web, by using RDF to capture deeper semantics of information and SPARQL to query [...]]]></description>
				<content:encoded><![CDATA[<p>This posting is a continuation of the previous posting. We are discussing RDF, the &#8220;triples&#8221; language that is serving as a cornerstone of the Semantic Web effort. The goal of the Semantic Web is to partly automate the searching of the Web, by using RDF to capture deeper semantics of information and SPARQL to query that information. This is in comparison to today&#8217;s search engine technology, which does not allow us to do much more than search for individual words in the text of webpages.</p>
<p>Let&#8217;s step back for a moment.</p>
<p>Just how universal is this notion of RDF-style triples? Will we ever have something substantially more useful, more powerful in the semantics it can express?</p>
<p><strong>Data, Information, Knowledge, and Ontologies.</strong></p>
<p>Academic and industrial researchers in computing like to trivialize big words. Let&#8217;s briefly look at the problem. &#8220;Data&#8221; is an old word, and most of us have a sense that virtually anything stored digitally can be considered data. This includes applications and other pieces of software, too. If you back up some applications to free up space on your hard drive, you&#8217;ve just turned applications into data, right?</p>
<p>&#8220;Information&#8221; is a word that came into play when researchers wanted something that was smarter than data. The word was broader, and vaguer, but information was essentially data that was ready to be used by interactive users. If I pull down a page from the Encyclopedia Britannica site, it&#8217;s filled with information. </p>
<p>Then, there were demands for an even richer word, one that suggests data that is beyond information, stuff that is rich in semantics that can be easily extracted. That word was &#8220;knowledge&#8221;. Often, knowledge was data or information that had been interconnected, turned into trees or graphs. Traversing the links in the structure told us how various things were interrelated, thereby exposing powerful semantics. The Web in a sense is knowledge. I can follow links between pages to discover how various pages on the Web are interrelated. I can follow connections on the Britannica site to connect a scientific discovery to the story of the discoverer&#8217;s life.</p>
<p>Here&#8217;s something significant. This blog and all its postings are related to new web technology, such as the Semantic Web. Our central concern has been the partial automation of the searching of the Web, so that users aren&#8217;t limited to typing words into Google and getting back stuff no richer than pages that happen to have these words in them. As it turns out, the term &#8220;knowledge&#8221; dates way back before the days of the Web, but back then, our notion of what it meant to be knowledge and not just data or information was pretty much the same as it is now. Knowledge can be processed by programs, thereby automating the task of finding the right knowledge and applying it to our problem domain.</p>
<p>Then came &#8220;ontology&#8221;. This is a relatively new word, but it&#8217;s perhaps the most embarrassing. The word, until recently, was reserved for philosophers to use. An ontological argument is an argument about the existence of something. Over the centuries, one common subject of ontological discussions has been the existence of God.</p>
<p>Hmm.</p>
<p><strong>The same old, same old.</strong></p>
<p>Flash forward to the Internet age: Computer researchers use the term to refer to a precise specification of the objects and properties (of these objects) in some well studied domain. I guess the idea is to suggest that we can capture the true nature of the existence of some domain.</p>
<p>These domains could be large, like banking, health insurance, or the stock market. Laying out all of the objects involved in one of these is a daunting task. Consider an insurance claim and all of its properties: type of claim, provider of medical service, patient name, etc., and then imagine laying this all out for insurance policies, underwriting tables, actuarial data, etc. To include all of the objects and properties involved in building software for an insurance company would lead us to thousands of interconnected terms. Triples, in other words.</p>
<p>Or our ontology could be the specification of a pencil object, which has properties like being made of wood and graphite and metal, of having yellow paint and a little pink eraser. Triples like this:</p>
<p><strong>The pencil</strong> <em>has a</em> <strong>pink eraser.</strong><br /><strong>The pencil</strong> <em>is painted</em> <strong>yellow</strong>.</p>
<p>This characterizes the nature of the challenge we have taken on in our efforts to build ontologies. We take on the problems of scale, not the problems involved in really capturing, in some formal fashion, the nature of the world around us. We build gigantic, but very simple, models of the things that concern us in the software world.</p>
<p>We have trivialized this term, ontology. In fact, for the most part, we&#8217;re simply referring to the same old, same old modeling construct: triples. Yes, that simple tool called RDF can be used to build a vast &#8220;ontology&#8221;. </p>
<p>There is something about the nature of triples that has conquered computing. It is a concept that, as we have seen in previous postings of this blog, underlies object-oriented data structures. It predates object-oriented languages, going back to the early days of AI and the attempts to model the real world. </p>
<p><strong>So, what is an ontology?</strong></p>
<p>An ontology is supposed to be the end of the Semantic Web rainbow: our ability to fully automate the specification and searching of the real world. But the next time some computer person tries to impress you by tossing this term at you, remember to just shake your head and say &#8220;Quit being a puff toad. You&#8217;re just talking about triples.&#8221;</p>
<!-- wpms-network-global-inserts -->]]></content:encoded>
			<wfw:commentRss>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-5/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Semantic Web: RDF and SPARQL, part 4</title>
		<link>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-4/</link>
		<comments>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-4/#comments</comments>
		<pubDate>Wed, 29 Jul 2009 02:17:34 +0000</pubDate>
		<dc:creator>Roger King</dc:creator>
				<category><![CDATA[RDF]]></category>
		<category><![CDATA[SPARQL]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[the Semantic Web]]></category>
		<category><![CDATA[triples]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-4/</guid>
		<description><![CDATA[This posting is a continuation of the previous posting. We are discussing RDF, the &#8220;triples&#8221; language that is serving as a cornerstone of the Semantic Web effort. In this posting, we will look at SPARQL, the web language designed to search data that has been specified as RDF triples. The goal of the Semantic Web [...]]]></description>
				<content:encoded><![CDATA[<p>This posting is a continuation of the previous posting. We are discussing RDF, the &#8220;triples&#8221; language that is serving as a cornerstone of the Semantic Web effort. In this posting, we will look at SPARQL, the web language designed to search data that has been specified as RDF triples. The goal of the Semantic Web is to partly automate the searching of the Web, by using RDF to capture deeper semantics of information and SPARQL to query that information. This is in comparison to today&#8217;s technology, which does not allow us to do much more than search for individual words in the text of webpages.</p>
<p><strong>From the last posting.</strong></p>
<p>Here is a piece of the RDF code from the previous posting:</p>
<p>&lt;rdf:RDF</p>
<p>xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;<br />
xmlns:zx=&quot;http://www.someurl.org/zx/&quot;&gt;</p>
<p>&lt;rdf:Description</p>
<p>rdf:about=&quot;http://www.awebsite.org/index.html&quot;&gt;</p>
<p>&lt;zx:created-by&gt;http://www.anotherurl.org/buzz&lt;/zx:created-by&gt;</p>
<p>&lt;/rdf:Description&gt;</p>
<p>&lt;/rdf:RDF&gt;</p>
<p>This can be interpreted as: the webpage at awebsite.org/index.html was created by Buzz.</p>
<p><strong>Another representation of RDF-based information: 3 triples.</strong></p>
<p>We see from the above that RDF simply represents triples. We could simplify it even more as:</p>
<p>http://awebsite.org/index.html was created by Buzz</p>
<p>Part of the reason that the original RDF code above is so much more complex is that the full syntax lets us specify that we are using terms that are defined at specific web addresses. This allows people to use standardized terms and greatly enhances the specificity of an RDF specification. The full syntax also allows us to reference pieces of information that reside on the Web. (See the previous three postings, <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-1/">1</a>, <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-2/">2</a>, <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-3/">3</a>.)</p>
<p>Before we launch into a SPARQL example, we need to make an important distinction between syntax and semantics. The code above is written in a particular syntax for RDF, one that uses XML. We note that because syntax needs to be very precise, it tends to be verbose. This can cause syntax to obscure the conceptual simplicity of the underlying semantics, or meaning.</p>
<p>But this isn&#8217;t the only way to specify RDF triples. Let&#8217;s look at some information that is much simpler, and at the same time, let&#8217;s look at using a different syntax for specifying RDF-like triples. Here are three triples:</p>
<p>&lt;http://awebsite.org/&gt; was-created-by &quot;Buzz&quot;</p>
<p>&lt;http://awebsite.org/&gt; was-created-by &quot;Suzy&quot;</p>
<p>&lt;http://anotherwebsite.org/&gt; was-created-by &quot;Alice&quot;</p>
<p>This is a very simple program. It consists of two triples that say that a website named awebsite was created by Buzz and Suzy, and another triple that says that Alice created a website called anotherwebsite. We are not saying that was-created-by is a widely used term; it may have been invented only for this particular RDF specification, and its meaning would therefore not be precise. We can only interpret it from our general understanding of English words. We also have no idea who these people Buzz and Suzy and Alice are, and we have no other information about them.</p>
<p><strong>SPARQL: searching triples distributed across the Web.</strong></p>
<p>Now, here is a piece of code:</p>
<p>prefix website1: &lt;http://awebsite.org/&gt;<br />
SELECT ?x<br />
WHERE<br />
{ website1: was-created-by ?x }</p>
<p>We&#8217;re getting very close to real SPARQL, by the way, and if you know SQL, you can see the extreme similarity. But syntax is not our issue here. We&#8217;re trying to look at concepts.</p>
<p>This code will find the creators of http://awebsite.org. You could imagine that there are actually many thousands of these triples, and that they tell us who built a large number of different websites. Now, we see the power of this query. It will search through all of these triples and find the two of interest to us, and then pluck off the names of the creators.</p>
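<p>To make the idea concrete, here is a minimal sketch in Python of what such a query does conceptually. It is not how a real SPARQL engine works; the list named TRIPLES and the functions select and creators_of are hypothetical names made up for this illustration, and the data is just the three triples from above.</p>

```python
# Illustrative triples from this posting, stored as (subject, predicate, object).
TRIPLES = [
    ("http://awebsite.org/", "was-created-by", "Buzz"),
    ("http://awebsite.org/", "was-created-by", "Suzy"),
    ("http://anotherwebsite.org/", "was-created-by", "Alice"),
]


def select(triples, subject=None, predicate=None, obj=None):
    """Return the triples that match a pattern; None acts like a variable such as ?x."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def creators_of(triples, site):
    """Pluck the object off every was-created-by triple about the given site."""
    matches = select(triples, subject=site, predicate="was-created-by")
    return [o for (_s, _p, o) in matches]


print(creators_of(TRIPLES, "http://awebsite.org/"))  # prints ['Buzz', 'Suzy']
```

<p>The pattern fixes the subject and the relationship and leaves the third part open, which is the role ?x plays in the query.</p>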
<p>In fact, these triples could be distributed all around the Web, and we could imagine a search engine taking this query and running it everywhere on the Web where was-created-by triples are stored, and then having it bring back all the creators of awebsite, even if there are a hundred developers, and even if these names are spread around the Internet.</p>
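<p>As a rough sketch of that distributed case, the Python below runs one pattern over several separate collections of triples and merges the answers. The two lists SOURCE_A and SOURCE_B are hypothetical stand-ins for triples published at different places on the Web, and creators_across is a name invented for this example; nothing here actually fetches data over the network.</p>

```python
# Hypothetical stand-ins for triple collections published at different sites.
SOURCE_A = [
    ("http://awebsite.org/", "was-created-by", "Buzz"),
]
SOURCE_B = [
    ("http://awebsite.org/", "was-created-by", "Suzy"),
    ("http://anotherwebsite.org/", "was-created-by", "Alice"),
]


def creators_across(sources, site):
    """Run the same pattern against every source and merge the distinct answers."""
    found = []
    for triples in sources:
        for subject, predicate, obj in triples:
            is_match = subject == site and predicate == "was-created-by"
            if is_match and obj not in found:
                found.append(obj)
    return found


print(creators_across([SOURCE_A, SOURCE_B], "http://awebsite.org/"))  # prints ['Buzz', 'Suzy']
```

<p>The query itself does not change as more sources are added; only the list of places to look grows.</p>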
<p><strong>Next, the bigger issue.</strong></p>
<p>In the next posting, we&#8217;ll look more closely at SPARQL. One thing we will consider is why it does look so much like SQL. There is a powerful reason for this that has to do with searching information in general.</p>
<!-- wpms-network-global-inserts -->]]></content:encoded>
			<wfw:commentRss>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-4/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Semantic Web: RDF and SPARQL, part 3</title>
		<link>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-3/</link>
		<comments>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-3/#comments</comments>
		<pubDate>Sat, 18 Jul 2009 21:57:21 +0000</pubDate>
		<dc:creator>Roger King</dc:creator>
				<category><![CDATA[RDF]]></category>
		<category><![CDATA[SPARQL]]></category>
		<category><![CDATA[the Semantic Web]]></category>
		<category><![CDATA[triples]]></category>

		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-3/</guid>
		<description><![CDATA[This posting is a continuation of the previous posting. We are discussing RDF, the &#8220;triples&#8221; language that is serving as a cornerstone of the Semantic Web effort. In the previous two postings, we looked at RDF, which is an excellent example of solid software technology: It serves an important purpose. It is easy to use. [...]]]></description>
				<content:encoded><![CDATA[<p>This posting is a continuation of the previous posting. We are discussing RDF, the &#8220;triples&#8221; language that is serving as a cornerstone of the Semantic Web effort.</p>
<p>In the previous two postings, we looked at RDF, which is an excellent example of solid software technology: It serves an important purpose. It is easy to use. And, even if you don&#8217;t write any RDF yourself, it is easy to understand what it does, and therefore, how it will impact your life.</p>
<p>RDF, in its simple, quiet way, allows us to interconnect any resources that exist on the Web, and at the same time, make use of standardized terminologies. This provides a highly flexible and semantically expressive way of building the new Semantic Web.</p>
<p><strong>SPARQL: what is it?</strong></p>
<p>RDF is great stuff, but it&#8217;s only half the story. If knowledge on the emerging Semantic Web is going to be glued together into RDF triples, how will that information be searched? It doesn&#8217;t do any good to have a book that will solve all your problems if you can&#8217;t read it or search through it.</p>
<p>SPARQL is a recursive acronym: it stands for SPARQL Protocol And RDF Query Language, and it is pronounced &#8220;sparkle&#8221;. Interestingly, when something is called a &#8220;query&#8221; language, we start thinking in terms of SQL, that largely <a href="http://itknowledgeexchange.techtarget.com/semantic-web/sql-and-xml-declarative-is-exciting/">declarative</a> relational language that is the core of almost all successful relational database management systems. Indeed, SQL has served as the model for SPARQL, as we will see in a later blog posting about <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-difference-between-web-2-and-the-semantic-web/">XQuery</a>, the language for searching <a href="http://itknowledgeexchange.techtarget.com/semantic-web/xml-and-its-powerful-children/">XML-based data</a>.</p>
<p><strong>A blast from the past.</strong></p>
<p>There&#8217;s something about triples that we should look at before moving on. It has to do with the fact that triples are also known as &#8220;assertions&#8221;, and that assertions can be chained together to make &#8220;inferences&#8221;. Here are two triples/assertions, specified very informally: THE BALL is ORANGE. ORANGE is an UGLY COLOR. The inference we can make is THE BALL is an UGLY COLOR.</p>
<p>Or, getting back to the Web and RDF, below are two triples specified in RDF; the first one comes from the <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-2/">previous posting of this blog.</a></p>
<p>&lt;rdf:RDF</p>
<p>xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;<br />
xmlns:zx=&quot;http://www.someurl.org/zx/&quot;&gt;</p>
<p>&lt;rdf:Description</p>
<p>rdf:about=&quot;http://www.awebsite.org/index.html&quot;&gt;</p>
<p>&lt;zx:created-by&gt;http://www.anotherurl.org/buzz&lt;/zx:created-by&gt;</p>
<p>&lt;/rdf:Description&gt;</p>
<p>&lt;/rdf:RDF&gt;</p>
<p>This first one can be interpreted as: the webpage at awebsite.org/index.html was created by Buzz.</p>
<p>Here is the second RDF triple:</p>
<p>&lt;rdf:RDF</p>
<p>xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;<br />
xmlns:zx=&quot;http://www.someurl.org/zx/&quot;&gt;</p>
<p>&lt;rdf:Description</p>
<p>rdf:about=&quot;http://www.anotherurl.org/buzz&quot;&gt;</p>
<p>&lt;zx:is&gt;http://www.yetanotherurl.org/professor&lt;/zx:is&gt;</p>
<p>&lt;/rdf:Description&gt;</p>
<p>&lt;/rdf:RDF&gt;</p>
<p>This one can be interpreted as Buzz is the guy described at yetanotherurl.org/professor.</p>
<p>We can chain them together to deduce that the guy who built the page at awebsite.org/index.html is Buzz the professor.</p>
<p>This is an inference.</p>
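<p>The chaining step can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the two triples are written as plain tuples with the namespace prefix dropped from the relationship names, and the function name chain is made up for this example rather than taken from any RDF tool.</p>

```python
PAGE = "http://www.awebsite.org/index.html"
BUZZ = "http://www.anotherurl.org/buzz"
PROFESSOR = "http://www.yetanotherurl.org/professor"

# The two assertions from this posting, as (subject, predicate, object).
TRIPLES = [
    (PAGE, "created-by", BUZZ),
    (BUZZ, "is", PROFESSOR),
]


def chain(triples, first_predicate, second_predicate):
    """Join two assertions when the object of the first is the subject of the second."""
    inferred = []
    for s1, p1, o1 in triples:
        if p1 != first_predicate:
            continue
        for s2, p2, o2 in triples:
            if p2 == second_predicate and s2 == o1:
                # (start, shared middle term, end) of the two-step chain
                inferred.append((s1, o1, o2))
    return inferred


for page, person, description in chain(TRIPLES, "created-by", "is"):
    print(page, "was created by", person, "who is", description)
```

<p>The join happens on the shared term: the URL for Buzz is the value of the first triple and the resource being described in the second.</p>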
<p>The point is that if you take a bunch of RDF statements and chain them together, you get what looks a lot like an object-oriented graph of related objects, somewhat like you see in Java.  In a sense, RDF takes an object representation and breaks it down into triples.  There&#8217;s really nothing new in RDF, other than the fact that any part of an RDF assertion (triple) can be something found on the Web.</p>
<p><strong>Back to SPARQL.</strong></p>
<p>So, what is SPARQL?  It is a language that can be used to traverse graphs that consist of RDF triples that are chained together into an object network.</p>
<p>We will look at some SPARQL code in the next posting.</p>
<!-- wpms-network-global-inserts -->]]></content:encoded>
			<wfw:commentRss>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Semantic Web: RDF and SPARQL, part 2</title>
		<link>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-2/</link>
		<comments>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-2/#comments</comments>
		<pubDate>Fri, 10 Jul 2009 02:35:47 +0000</pubDate>
		<dc:creator>Roger King</dc:creator>
				<category><![CDATA[RDF]]></category>
		<category><![CDATA[the Semantic Web]]></category>
		<category><![CDATA[triples]]></category>
		<category><![CDATA[URI's]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-2/</guid>
		<description><![CDATA[This posting is a continuation of the previous posting. We are discussing RDF, the &#8220;triples&#8221; language that is serving as a cornerstone of the Semantic Web effort. In the previous posting, we looked at a simple RDF program, which creates a relationship between a web-based resource and the term &#8220;funstuff&#8221;; the relationship is called &#8220;topic&#8221;, [...]]]></description>
				<content:encoded><![CDATA[<p>This posting is a continuation of the previous posting. We are discussing RDF, the &#8220;triples&#8221; language that is serving as a cornerstone of the Semantic Web effort. In the previous posting, we looked at a simple RDF program, which creates a relationship between a web-based resource and the term &#8220;funstuff&#8221;; the relationship is called &#8220;topic&#8221;, thus telling us that the resource located at the given URL is something fun.</p>
<p><strong>RDF and URI&#8217;s.</strong></p>
<p>One interesting fact is that, although we only used URI&#8217;s for two parts of the RDF triple embedded in this RDF program, we could have used URI&#8217;s for all three pieces of the triple. Thus, the program from the previous blog posting (immediately below) might be changed to look like the second program below, which now has two triples in it:</p>
<p>&lt;rdf:RDF</p>
<p>xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;<br />
xmlns:zx=&quot;http://www.someurl.org/zx/&quot;&gt;</p>
<p>&lt;rdf:Description</p>
<p>rdf:about=&quot;http://www.awebsite.org/index.html&quot;&gt;<br />
&lt;zx:topic&gt;funstuff&lt;/zx:topic&gt;</p>
<p>&lt;/rdf:Description&gt;</p>
<p>&lt;/rdf:RDF&gt;</p>
<p>&#8212;&#8212;&#8212;&#8212;-</p>
<p>&lt;rdf:RDF</p>
<p>xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;<br />
xmlns:zx=&quot;http://www.someurl.org/zx/&quot;&gt;</p>
<p>&lt;rdf:Description</p>
<p>rdf:about=&quot;http://www.awebsite.org/index.html&quot;&gt;<br />
&lt;zx:topic&gt;funstuff&lt;/zx:topic&gt;</p>
<p>&lt;zx:created-by&gt;http://www.anotherurl.org/buzz&lt;/zx:created-by&gt;</p>
<p>&lt;/rdf:Description&gt;</p>
<p>&lt;/rdf:RDF&gt;</p>
<p><strong>RDF and decentralized information.</strong></p>
<p>As a reminder, the triple expressed in the first program can be stated as:</p>
<p><span>www.awebsite.org</span>/index.html &lt;<strong>topic</strong>&gt; funstuff</p>
<p>So, what did we add in the second program?  There is a new triple that has been added.  It can be roughly stated as:</p>
<p><span>www.awebsite.org</span><span>/index.html &lt;<strong>created-by</strong>&gt; </span><span>http://www.anotherurl.org/buzz</span></p>
<p>In other words, our vocabulary defined at <span>http://www.someurl.org/zx</span> apparently has another standardized term called &#8220;created-by&#8221;.  The added triple in our second program says that the resource found at <span>www.awebsite.org</span>/index.html was created by someone who is identified by the url <span>http://www.anotherurl.org/buzz.</span></p>
<p>We see that the value in the first triple, which concerns the &#8220;topic&#8221; of our resource, consists of a character string, but the value in the second triple, which concerns the &#8220;created-by&#8221; of our resource, is actually a URL.</p>
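<p>One way to picture the difference between the two kinds of values is the small Python sketch below. It assumes a deliberately crude test, namely that anything with an http or https scheme and a host name counts as a URL; the function names looks_like_uri and object_kinds are invented for this illustration, and real RDF tools distinguish the two cases from the syntax rather than by guessing.</p>

```python
from urllib.parse import urlparse


def looks_like_uri(value):
    """Crude check: an http or https scheme plus a host name counts as a URL."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# The two triples from the second program, as (subject, predicate, object).
TRIPLES = [
    ("http://www.awebsite.org/index.html", "topic", "funstuff"),
    ("http://www.awebsite.org/index.html", "created-by", "http://www.anotherurl.org/buzz"),
]


def object_kinds(triples):
    """Label the value of each triple as a plain string or a URL."""
    return {
        predicate: ("uri" if looks_like_uri(obj) else "literal")
        for (_subject, predicate, obj) in triples
    }


print(object_kinds(TRIPLES))  # prints {'topic': 'literal', 'created-by': 'uri'}
```

<p>A value that is a URL can itself be the resource described by further triples, while a character string is a dead end.</p>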
<p>This is big.  It shows us that all three parts of a triple in RDF can be URI&#8217;s, and they can be distributed around the Internet.  This means that the information embedded in the triple is highly decentralized.</p>
<p><strong>The bottom line</strong></p>
<p>This illustrates the power of RDF.  It can be used to express information which is not controlled in any centralized fashion.  RDF is thus the glue that can be used to bring diverse pieces of information together.  And it can use standardized, shared terminologies to precisely dictate the semantics of the triples in RDF programs.  In our example, the resource is defined by one URI, the kind of relationship is defined by another URI, and the value of that relationship is defined by yet another URI.</p>
<p>We will continue this in the next posting.</p>
<!-- wpms-network-global-inserts -->]]></content:encoded>
			<wfw:commentRss>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Semantic Web: RDF and SPARQL, part 1</title>
		<link>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-1/</link>
		<comments>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-1/#comments</comments>
		<pubDate>Fri, 03 Jul 2009 04:28:34 +0000</pubDate>
		<dc:creator>Roger King</dc:creator>
				<category><![CDATA[automating Web searches]]></category>
		<category><![CDATA[namespaces]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[SPARQL]]></category>
		<category><![CDATA[the Semantic Web]]></category>
		<category><![CDATA[triples]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-1/</guid>
		<description><![CDATA[This blog is dedicated to advanced and emerging Web technology.  Each posting is meant to be understandable and informative on its own, but the blog as a whole tells a continuing story. The Semantic Web. In this posting, we will focus on the Semantic Web, which is a global effort at radically improving our ability [...]]]></description>
				<content:encoded><![CDATA[<p>This blog is dedicated to advanced and emerging Web technology.  Each posting is meant to be understandable and informative on its own, but the blog as a whole tells a continuing story.</p>
<p><strong>The Semantic Web.</strong></p>
<p>In this posting, we will focus on the Semantic Web, which is a global effort at radically improving our ability to search the Web.</p>
<p>Currently, to search the web, we type keywords into a search engine like Google, which then searches its vast index of webpages for pages that have these keywords in them. Because this sort of search is very low-level, and not at all tied to the true meaning or purpose of the information stored in webpages, searching is painfully iterative and interactive.  A user must chase down countless URLs returned by a search engine to see if any of them are relevant.  Quite frequently, they are not.  And so, the user must refine the set of keywords and try again.  It might take many attempts before a satisfactory result is obtained.</p>
<p>One of the primary goals of the Semantic Web is to automate the process of searching the Web.  There are two stages to this.  First, people who post information on the Web must capture knowledge about the meaning of their information; this knowledge is commonly called &#8220;metadata&#8221;.  The metadata is then stored with the posted information.</p>
<p>The second stage happens when users search the Web.  Rather than using the low level keyword search approach, the search is at least partly automated.  The iterative process is sharply reduced by employing a smart search engine that knows how to find relevant information by searching for metadata that pertains precisely to whatever it is that the user is seeking.</p>
<p><strong>The bottom line</strong>.</p>
<p>The goal?</p>
<p>The Semantic Web would be able to ease the burden of searching for information, as well as find vast stores of &#8220;<a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-revealing-hidden-data/">hidden data</a>&#8221; that reside in databases that are accessible via webpages, but whose contents right now are not seen by search engines.</p>
<p>Ultimately, we would want the Web to be entirely searchable by software, without any humans guiding the process.  This would be the true Semantic Web.</p>
<p><strong>Namespaces and triples</strong>.</p>
<p>In past postings of this blog, we have discussed a handful of key approaches to implementing the Semantic Web.  One idea is to tag information with standardized sets of terminology called &#8220;<a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-dublin-core-and-the-metadata-object-description-schema-a-look-at-namespaces/">namespaces</a>&#8221;.</p>
<p>We have also looked at the idea of embedding these tags in things called &#8220;<a href="http://itknowledgeexchange.techtarget.com/semantic-web/what-do-we-mean-by-semantic-web/">triples</a>&#8221;.  In this posting, we look at this concept more closely and consider an existing language that would allow people to specify these triples.</p>
<p><strong>RDF and SPARQL.</strong></p>
<p>The most well-known standard for specifying triples is RDF, which stands for the Resource Description Framework.  SPARQL is a query language, heavily influenced by <a href="http://itknowledgeexchange.techtarget.com/semantic-web/sql-and-xml-declarative-is-exciting/">SQL</a>, that can be used to search data that has been structured using RDF.</p>
<p>This is the first of a series of blog postings in which we will first look at RDF, and then at SPARQL.  Then, we&#8217;ll consider the big issue: will RDF and SPARQL enable the development of the true Semantic Web?</p>
<p><strong>RDF. </strong></p>
<p>So, what is RDF?  At its highest level, RDF is used to describe anything that can be found on the Web.  RDF has an XML syntax; in other words, RDF can be written as an XML program, using a set of predefined &#8220;element&#8221; and &#8220;attribute&#8221; tags.   (<a href="http://itknowledgeexchange.techtarget.com/semantic-web/xml-and-its-powerful-children/">XML and XML languages</a> were discussed in an earlier posting of this blog, as was <a href="http://itknowledgeexchange.techtarget.com/semantic-web/sql-and-xml-declarative-is-exciting/">XML and declarative languages</a>.)</p>
<p>We might remember that on its own, XML is impotent.  It is not in itself a programming language.  It is simply a language standard for taking a set of tags and using them as &#8220;elements&#8221; and &#8220;attributes&#8221; in declarative, data-intensive languages.  A good example is <a href="http://itknowledgeexchange.techtarget.com/semantic-web/tag/smil/">SMIL</a>, which is used to define multimedia presentations.</p>
<p>Here is a fragment in RDF, using its XML syntax.  Note that XML languages are embedded languages, with opening tags of the form &lt;tag&gt; and closing tags of the form &lt;/tag&gt;.</p>
<p>&lt;rdf:RDF</p>
<p>xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;<br />
xmlns:zx=&quot;http://www.someurl.org/zx/&quot;&gt;</p>
<p>&lt;rdf:Description</p>
<p>rdf:about=&quot;http://www.awebsite.org/index.html&quot;&gt;<br />
&lt;zx:topic&gt;funstuff&lt;/zx:topic&gt;</p>
<p>&lt;/rdf:Description&gt;</p>
<p>&lt;/rdf:RDF&gt;</p>
<p>This looks complicated, but it&#8217;s not.  This simple example illustrates the power of RDF.  It uses a set of standardized RDF-specific tags, and the second line of code tells us where these tags come from: the w3.org site, which contains a vast store of information about advanced web technology.  In other words, we can go to w3.org to find the precise definition of RDF specific tags.</p>
<p>RDF is engineered to also use other sets of tags, in particular, domain-specific tags.  In this example, these tags come from a (non-existing) url called someurl.org.  The tags themselves are prefaced with &#8220;zx:&#8221; in the rest of the code, so we know which tags are native RDF and which come from a domain-specific set of tags  (called a <a href="http://itknowledgeexchange.techtarget.com/semantic-web/namespaces-and-the-semantic-web/">namespace</a>).</p>
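<p>The role of a prefix such as &#8220;zx:&#8221; can be sketched in Python as a simple lookup: the prefix is shorthand for the web address given in the namespace statement, and a prefixed tag stands for that address with the tag name appended. The table NAMESPACES and the function expand are hypothetical names used only for this illustration.</p>

```python
# Prefix table taken from the two namespace statements in the fragment above.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "zx": "http://www.someurl.org/zx/",
}


def expand(qname, namespaces=NAMESPACES):
    """Turn a prefixed tag such as zx:topic into the full address it stands for."""
    prefix, _colon, local = qname.partition(":")
    if not local:
        raise ValueError("expected a name of the form prefix:local, got " + repr(qname))
    if prefix not in namespaces:
        raise KeyError("unknown prefix: " + prefix)
    return namespaces[prefix] + local


print(expand("zx:topic"))         # prints http://www.someurl.org/zx/topic
print(expand("rdf:Description"))  # prints http://www.w3.org/1999/02/22-rdf-syntax-ns#Description
```

<p>Two tags with the same short name but different prefixes expand to different addresses, which is what keeps terms from different namespaces from colliding.</p>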
<p>The xml &#8220;element&#8221; called Description is an RDF-specific tag that tells us we are giving the description of some resource on the Web, namely one at a (non-existing) website called awebsite.org.</p>
<p>The whole piece of code is one triple: It says that the topic of the resource at www.awebsite.org/index.html is funstuff.  Here it is as a triple, with all the xml syntax and the namespace information removed:</p>
<p><em>www.awebsite.org/index.html</em> &lt;<strong>topic</strong>&gt; <em>funstuff</em>.</p>
<p>Let&#8217;s overview this again.  RDF is an XML language, so it uses the syntax of XML.  One of the primary concepts in XML is that of an &#8220;element&#8221;, and Description is an XML element, one defined in the RDF standard.  The piece of code begins with two namespace statements, one telling us which RDF specification we are using, and the second telling us that we will also be using some tags from another, domain-specific specification, which includes the tag &#8220;topic&#8221;.  Then there is the guts of the triple, telling us that we are listing the topic of a Web-resident resource.</p>
<p>More on this in the next posting&#8230;</p>
<!-- wpms-network-global-inserts -->]]></content:encoded>
			<wfw:commentRss>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-rdf-and-sparql-part-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Semantic Web: revealing hidden data.</title>
		<link>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-revealing-hidden-data/</link>
		<comments>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-revealing-hidden-data/#comments</comments>
		<pubDate>Mon, 11 May 2009 03:07:55 +0000</pubDate>
		<dc:creator>Roger King</dc:creator>
				<category><![CDATA[databases]]></category>
		<category><![CDATA[DB2]]></category>
		<category><![CDATA[dynamic pages]]></category>
		<category><![CDATA[hidden web content]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[namespaces]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[static pages]]></category>
		<category><![CDATA[the Semantic Web]]></category>
		<category><![CDATA[triples]]></category>

		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-revealing-hidden-data/</guid>
		<description><![CDATA[The Hidden Web. The Semantic Web &#8211; a primary topic of this continuing blog series &#8211; will help us search the web with greater ease. One of the things it will (hopefully) do is expose a vast sea of information that is currently invisible to our web browsers. In fact, some say that right now, [...]]]></description>
				<content:encoded><![CDATA[<p><strong>The Hidden Web.</strong></p>
<p>The <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-difference-between-web-2-and-the-semantic-web/" target="_blank">Semantic Web</a> &#8211; a primary topic of this continuing blog series &#8211; will help us search the web with greater ease. One of the things it will (hopefully) do is expose a vast sea of information that is currently invisible to our web browsers. In fact, some say that right now, we can see less than 1% of what&#8217;s out there. I cannot vouch for this number, but I can say that what we cannot see right now includes large volumes of extremely valuable data.</p>
<p>Perhaps you have heard of the mysterious &#8220;Hidden Web&#8221;? So, what is this stuff and where is it?</p>
<p><strong>Forms, Databases, and Interactive Interfaces.</strong></p>
<p>The Hidden Web refers to data that is out there on the web, publicly accessible &#8211; but only via webpage interfaces that are opaque to the indexing software of search engines like Google.</p>
<p>Let&#8217;s step back for a moment. </p>
<p>The way search engines work, in case you don&#8217;t know, is by constantly searching the web, looking for new webpages. When a new page is found, it is added to the search engine&#8217;s index, meaning that now, when people search the web with Google, they might get the URL for that page in their search results.</p>
<p>The important thing to note is that the primary source of information that Google uses when it indexes a page is the page itself. What words are on it?</p>
<p>This sounds great for static webpages that are stored as-is on websites and delivered as-is to the Google user. </p>
<p>But suppose we want Google to find dynamic pages? A typical dynamic page has content that isn&#8217;t known until an interactive user types some words into a web<em> &#8220;</em>form&#8221;. A web form is a page where the browser user fills in blanks and then lets the browser send the completed page back to the server. There, the information in the form is used to select other information, which is plugged into a &#8220;dynamically&#8221; created page that is sent to the client machine and viewed by the browser user.</p>
<p>So, I might visit Amazon. I navigate to their search page, which is a form, and I type in the title of the book I want. That information goes back to the server. A description of this book, including its cost, is plugged into a dynamically created page, which is then downloaded to my machine so that I can read the material with my browser.</p>
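<p>Here is a rough sketch, in Python, of what happens on the server side of such a form. This is an illustration only and has nothing to do with Amazon&#8217;s real systems; the catalogue, the book, its price, and the function name <code>render_result_page</code> are all hypothetical.</p>

```python
# Hypothetical catalogue standing in for the server-side database.
CATALOGUE = {
    "example book": {"title": "Example Book", "price": "9.99"},
}

def render_result_page(form_fields):
    """Build the text of a result page from the submitted form fields."""
    wanted = form_fields.get("title", "").strip().lower()
    book = CATALOGUE.get(wanted)
    if book is None:
        return "No book found for: " + wanted
    return book["title"] + " costs $" + book["price"]

# The page text below does not exist anywhere until a form is submitted.
print(render_result_page({"title": "Example Book"}))
```

<p>A crawler that never fills in the form never triggers this function, so it never sees the text that the function produces.</p>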
<p><strong>Indexing Dynamic Pages.</strong></p>
<p>So, if I have information that is not sitting in static pages, how can I get Google to index this information? There are multiple ways. For example, if the primary job of your website is to create large volumes of dynamically created pages, you might want to create a special directory page for your site &#8211; a static page &#8211; loaded with all the right words, and that contains links to the pages and forms you want the user to discover.</p>
<p>On the future Semantic Web, you might want to make sure that those magic words come at least in part from globally accessible <a href="http://itknowledgeexchange.techtarget.com/semantic-web/namespaces-and-the-semantic-web/" target="_blank">namespaces</a>, so that people who are using next-generation browsers, and who will be using these namespaces as a source of search keywords, will find your static page. As we have discussed, namespaces will provide us with detailed sets of terms, which will be tied to specific domains. This will make the search for static pages far more efficient than it is now.</p>
<p>As an example, a namespace concerning books might have words like <em>ISBN-10</em> and <em>ISBN-13</em>. If the web designer uses these terms to describe static pages about books, and if the user of the browser can specify that they are looking for ISBN numbers, the browser will have a much more detailed idea of what is meant by those 10- and 13-digit numbers the user types in. </p>
<p>Here&#8217;s the critical part. Right now, Amazon lets you search by these numbers on their specialized web form page, but imagine if you could at any time tell your browser to look for ISBN numbers on whatever webpages it searches.</p>
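<p>As a small illustration of why it helps to know that a number is an ISBN, here is a Python sketch that picks ISBNs out of arbitrary page text using the standard ISBN-10 and ISBN-13 check digit rules. The function names and the sample text are my own inventions, and a real browser or search engine would need to be far more careful.</p>

```python
import re

def is_isbn10(candidate):
    """Check the ISBN-10 check digit: weights 10 down to 1, sum divisible by 11."""
    if not re.fullmatch(r"[0-9]{9}[0-9X]", candidate):
        return False
    values = [10 if ch == "X" else int(ch) for ch in candidate]
    total = sum(weight * value for weight, value in zip(range(10, 0, -1), values))
    return total % 11 == 0

def is_isbn13(candidate):
    """Check the ISBN-13 check digit: weights alternate 1 and 3, sum divisible by 10."""
    if not re.fullmatch(r"[0-9]{13}", candidate):
        return False
    total = sum(int(ch) * (3 if pos % 2 else 1) for pos, ch in enumerate(candidate))
    return total % 10 == 0

def find_isbns(page_text):
    """Return (kind, digits) pairs for every token in the text that passes a check."""
    found = []
    for token in re.findall(r"[0-9][0-9X-]{8,16}", page_text):
        digits = token.replace("-", "")
        if is_isbn10(digits):
            found.append(("ISBN-10", digits))
        elif is_isbn13(digits):
            found.append(("ISBN-13", digits))
    return found

# Hypothetical page text containing an order number and two ISBNs.
sample = "Order 1234567890 ships ISBN 0-306-40615-2 and 978-0-306-40615-7."
print(find_isbns(sample))
```

<p>The order number is ten digits long, but it fails the check digit test, so only the two real ISBNs are reported. Knowing the meaning of a term lets software discard look-alikes that a plain keyword match would accept.</p>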
<p>An example of a namespace that is used to describe documents on the web is the <a href="http://itknowledgeexchange.techtarget.com/semantic-web/the-dublin-core-and-the-metadata-object-description-schema-a-look-at-namespaces/" target="_blank">Dublin Core</a>, by the way.</p>
<p>So, that&#8217;s one way to make your dynamic pages somewhat visible. Create a web page that is static and leads to the pages you want users to see, and to make it all the more powerful, use terms from a globally accepted namespace like the Dublin Core. This is something that is already partly doable. The Dublin Core, along with other namespaces, is in wide use.</p>
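<p>Here is one way this could look in practice, sketched in Python. The namespace URI and the fifteen element names are those of the Dublin Core element set; the directory page being described, its field values, and the function name <code>describe</code> are hypothetical.</p>

```python
# Namespace URI of the Dublin Core element set.
DC = "http://purl.org/dc/elements/1.1/"

# The fifteen element names defined in that namespace.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

def describe(page_fields):
    """Turn local field names into full Dublin Core term URIs, rejecting unknown ones."""
    described = {}
    for name, value in page_fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError("not a Dublin Core element: " + name)
        described[DC + name] = value
    return described

# Hypothetical directory page for a site whose content is generated dynamically.
directory_page = describe({
    "title": "Book catalogue directory",
    "subject": "books",
    "identifier": "http://example.org/catalogue/",
})
for term, value in sorted(directory_page.items()):
    print(term, "=", value)
```

<p>Because every term is a full URI drawn from a shared namespace, two sites that both say &#8220;title&#8221; can be known to mean the same thing by it.</p>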
<p><strong>Where Does that Information Come From?</strong></p>
<p>Is there a better way, though? This technique will only point users to our static web directory, which will then enable interactive users to find our web forms. The users must then use our forms to get detailed data. Could the searching for dynamic pages be made more automatic?</p>
<p>Well, where does data in dynamic pages come from? Often from large databases built with such database management systems as Oracle, SQL Server, MySQL, PostgreSQL, and DB2. This is why some folks conjecture that the amount of information in the Hidden Web is vastly bigger than the web we see today. Databases can be BIG.</p>
<p>Imagine that all the information on the ancient Pharaohs, genetic diseases, investments, philosophy, and countless other topics is sitting inside databases that right now are only accessible via web forms. Right now, we Google keywords like &#8220;pharaoh&#8221; and the first things we see are static, highly condensed Wikipedia pages, and perhaps some static pages posted by museums and academics.</p>
<p><strong>What Will the Semantic Web Do?</strong></p>
<p>A primary challenge for the Semantic Web will be to let us ask for information and know that the search space includes information tucked away in databases dotted all around the globe. </p>
<p>This is a very complex problem. Right now, we need a human sitting at the keyboard of the client machine to navigate to the correct URL and then type terms into a web form. In the future, web designers will need ways of capturing information about what is contained in databases, and of specifying that information in a fashion that browsers can access. And this information will have to be very detailed, sometimes very intricate. </p>
<p>The browser will also have to take information specified by the user and match it up with the information that describes databases on the web. This means that we will need some automatic way to search databases without a user interactively and incrementally screening tens or hundreds or thousands of URLs. In an earlier blog posting in this series we described one possible technique called &#8220;<a href="http://itknowledgeexchange.techtarget.com/semantic-web/what-do-we-mean-by-semantic-web/" target="_blank">triples</a>&#8221; that might, combined with namespaces, provide a partial solution to this problem.</p>
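<p>To give a feel for how triples and namespaces might work together, here is a minimal sketch in Python. It uses plain tuples rather than any real RDF library, and the namespace <code>http://example.org/ns/book#</code>, the database row, and the function names are all hypothetical.</p>

```python
# Hypothetical namespace prefix for book terms; not a published vocabulary.
BOOK = "http://example.org/ns/book#"

# One hypothetical database row describing a book.
ROWS = [
    {"id": "book/1", "title": "Example Book", "isbn13": "9780306406157"},
]

def row_to_triples(row):
    """Express each column of a row as a (subject, predicate, object) triple."""
    subject = "http://example.org/" + row["id"]
    return [(subject, BOOK + column, value)
            for column, value in row.items() if column != "id"]

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every part of the pattern that is not None."""
    return [t for t in triples
            if subject in (None, t[0])
            and predicate in (None, t[1])
            and obj in (None, t[2])]

TRIPLES = [t for row in ROWS for t in row_to_triples(row)]
hits = match(TRIPLES, predicate=BOOK + "isbn13", obj="9780306406157")
print(hits)
```

<p>The query names the predicate by its namespace term, so software can find the book by its ISBN without a human ever visiting a form.</p>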
<p>We will look at this again, more closely, in a future blog posting.</p>
<p><br class="final-break" /></p>
<!-- wpms-network-global-inserts -->]]></content:encoded>
			<wfw:commentRss>http://itknowledgeexchange.techtarget.com/semantic-web/the-semantic-web-revealing-hidden-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
