XML archives - Buzz’s Blog: On Web 3.0 and the Semantic Web


Oct 3 2009   9:12PM GMT

Multimedia: The Problem of Subtle Semantics



Posted by: Roger “Buzz” King
3D animation, 3D modeling, advanced Web apps, automating Web searches, blob data, continuous data, databases, information, Multimedia, rich internet apps, Semantic Web, smart search engines, tagging, Text, Web 2.0, Web 3.0, web applications, Web development, Web development frameworks, XML

The challenge of the Semantic Web.

We’ve looked at the emerging Semantic Web technology in the previous postings of this blog. The idea is to have a far, far smarter Web, one where the process of finding and interpreting and making use of far flung information can be largely automated. This is in sharp contrast with today’s Web, where these things have to be done in a painful, extremely time-consuming fashion.

So that is the key challenge. It has to do with searching the kinds of information that are important to us in our daily lives. This information, as it turns out, is very difficult to process automatically. Why is this?

The complexity of modern multimedia.

I teach a very basic 3D animation class to mostly computer science students. We use Maya, arguably the most popular 3D animation application, one that is used in the making of many animated features. The interesting thing about animation is that it is truly multimedia. It can give us a lot of insight into what we need the new Web to do for us.

That’s because the number and diversity of applications that are used for drawing, documenting, modeling, animating, motion capture, texturing, video rendering, video editing, video conversion and compression, sound editing, in even small projects, can be very impressive. Correspondingly, the wide variety and complexity of media formats involved in an animation project can be overwhelming.

What happens in an animation project? The workflow might begin with vector storyboard drawings to break the story down into scenes. In a typical animation project, 3D models in a variety of proprietary formats are used. Models must be transformed as they are exported from one application and imported into the next. Multiple video renders of animated models are made, and they must be edited together, along with multiple sound files. Multiple video and audio formats might be used. 2D images are used for textures; photographs of butterfly wings can be used to make an animated butterfly very realistic, and a checkerboard image made with Photoshop can be used to make a linoleum floor. And along the way, a variety of note-taking, screen capture, and conferencing software might be used to facilitate group communication.

There is also a heavy focus on reuse in an animation project. Building every model, editing every texture, creating every environment and background, recording every sound from scratch is frequently intractable. If existing assets cannot be tailored and reused, the project would be far too expensive and time consuming, and would demand too wide a variety of professionals to always be available. This raises the multimedia stakes, as assets of widely differing forms must be constantly reconfigured and used in concert in new ways.

But what’s the real problem? We aren’t all trying to produce complex animated videos. But very interestingly, in our everyday lives we essentially face the animator’s challenge when we try to find and use information on the Web. That’s because we’re often looking for things whose meaning, whose interpretation, demands focused human thought. We are looking not for business data, but for pieces of media, and the problem is that today, most of our searching has to be based on tags or brief textual descriptions that are associated with pieces of media, and not on the true meaning of the media itself.

The needs of the business world are not our needs.

It’s the subjective nature of media assets - this is what is at the heart of the problem facing us. Existing technology for searching the web is based on keywords and very short pieces of text.

There is other technology, though, under active development, stuff that serves as the information storage backbone of most commercial websites. It’s the technology that has for decades been used in-house (not on the Web) by businesses when they process large databases. But this stuff was designed to handle traditional business data forms, like integers, character strings, real numbers, dates, timestamps, and full text.

There is more, though. All of the major database management systems, along with tools for building and searching advanced websites are being retrofitted (or in some cases, built from the ground up) to manage more than keywords and text, more than standard business data.

But up to now, the focus has not been on supporting the kinds of information you and I are most interested in. The focus has been on extending database and Web technology to support XML documents, as well as more complex data objects, like those inside a Java program, and other forms of data found inside programs. This includes arrays and lists and short pieces of textual data, like the names of diseases.

In other words, we’ve been busy extending our support of the business world, so they can store complex business data in databases and make that information processable over the Web. You and I have largely been left out.

Finally, we are attacking our needs.

But there are now many ongoing efforts to extend database and Web technology to make it useful to us. The new focus is on supporting blob and continuous media like images, video, and audio. This is extremely hard to do.

Why? Because the strongest means by which we deduce the meaning of business data is by looking at its internal structure and the terms that are used to describe that structure. A relational table named Prescriptions, with character attributes Patient Name, Doctor’s Name, and Medication, and with a numeric attribute Dosage, is pretty easy to interpret.

But what do we do with a photograph, which is just a grid of pixels with no internal structure? Or a long series of images, along with a sound track, put together to form a piece of video?
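To make that contrast concrete, here is a minimal sketch in Python, using only the standard sqlite3 module. The column names and types are my own guesses at what such a Prescriptions table might look like, and the tiny "photograph" is made-up data; neither comes from a real system.

```python
import sqlite3

# Hypothetical schema mirroring the Prescriptions table described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Prescriptions (
        patient_name TEXT,
        doctor_name  TEXT,
        medication   TEXT,
        dosage       REAL
    )
    """
)

# The database can describe its own structure: column names and declared types.
# Each row of PRAGMA table_info is (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Prescriptions)")]
print(columns)

# A photograph, by contrast, is only a grid of numbers
# (here an invented 2x3 grayscale image).
photo = [
    [12, 200, 31],
    [90, 14, 255],
]
print(len(photo), "rows x", len(photo[0]), "columns of pixel values, with no names attached")
```

The table tells a program what each value is supposed to be; the pixel grid offers nothing comparable, which is the heart of the difficulty.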

The U.S. military has been pumping money into image processing for several decades, and so all is not lost. There is a vast body of mathematical research and software development that allows us to write programs that can find a particular face in a crowd and search satellite photos for airplane runways. But in general, we cannot at this time write a program that can process an arbitrary photo or video clip and tell us what it means. That means we can’t quickly search vast media databases for useful pieces of information.

The goal behind the Semantic Web effort is to build a new generation of websites whose information can be searched automatically, and where information from multiple sites can be automatically integrated. To do this with numeric and character based data is quite doable. But when it comes to multimedia, like images and sound and video and 3D models and engineering designs, well, we have a long way to go. The meaning - in other words, the semantics - of these forms of data is complex and subtle, and highly dependent upon an individual’s interpretation of that media.

So, we see that we have only just begun our journey to create the new Web.

Jul 29 2009   2:17AM GMT

The Semantic Web: RDF and SPARQL, part 4



Posted by: Roger “Buzz” King
the Semantic Web, RDF, SPARQL, SQL, triples, XML

This posting is a continuation of the previous posting. We are discussing RDF, the “triples” language that is serving as a cornerstone of the Semantic Web effort. In this posting, we will look at SPARQL, the web language designed to search data that has been specified as RDF triples. The goal of the Semantic Web is to partly automate the searching of the Web, by using RDF to capture deeper semantics of information and SPARQL to query that information. This is in comparison to today’s technology, which does not allow us to do much more than search for individual words in the text of webpages.

From the last posting.

Here is a piece of the RDF code from the previous posting:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:zx="http://www.someurl.org/zx/">

  <rdf:Description rdf:about="http://www.awebsite.org/index.html">
    <zx:created-by>http://www.anotherurl.org/buzz</zx:created-by>
  </rdf:Description>

</rdf:RDF>

This can be interpreted as: the webpage at awebsite.org was created by Buzz.

Another representation of RDF-based information: 3 triples.

We see from the above that RDF simply represents triples. We could simplify it even more as:

 http://awebsite.org/index.html was created by Buzz

Part of the reason that the original RDF code above is so much more complex is that the full syntax lets us specify that we are using terms that are defined at specific web addresses. This allows people to use standardized terms and greatly enhances the specificity of an RDF specification. The full syntax also allows us to reference pieces of information that reside on the Web. (See the previous three postings, 1, 2, 3.)

Before we launch into a SPARQL example, we need to make an important distinction between syntax and semantics. The code above is written in a particular syntax for RDF, one that uses XML. We note that because syntax needs to be very precise, it tends to be verbose. This can cause syntax to obscure the conceptual simplicity of the underlying semantics, or meaning.

But this isn’t the only way to specify RDF triples. Let’s look at some information that is much simpler, and at the same time, let’s look at using a different syntax for specifying RDF-like triples. Here are three triples:

<http://awebsite.org/> was-created-by "Buzz"

<http://awebsite.org/> was-created-by "Suzy"

<http://anotherwebsite.org/> was-created-by "Alice"

This is a very simple program. It consists of two triples that say that a website named awebsite was created by Buzz and Suzy, and another triple that says that Alice created a website called anotherwebsite. We are not saying that was-created-by is a widely used term; it may have been invented only for this particular RDF specification, and its meaning would therefore not be precise. We can only interpret it from our general understanding of English words. We also have no idea who these people Buzz and Suzy and Alice are, and we have no other information about them.

SPARQL: searching triples distributed across the Web.

Now, here is a piece of code:

PREFIX website1: <http://awebsite.org/>
SELECT ?x
WHERE
{ website1: was-created-by ?x }

We’re getting very close to real SPARQL, by the way, and if you know SQL, you can see the extreme similarity. But syntax is not our issue here. We’re trying to look at concepts.

This code will find the creators of http://awebsite.org. You could imagine that there are actually many thousands of these triples, and that they tell us who built a large number of different websites. Now, we see the power of this query. It will search through all of these triples and find the two of interest to us, and then pluck off the names of the creators.
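For readers who think better in code, here is a rough Python sketch of what such a query does conceptually. It is not how a SPARQL engine is actually built; the tuple representation and the function name select_objects are my own inventions for illustration.

```python
# Triples as (subject, predicate, object) tuples, copied from the example above.
TRIPLES = [
    ("http://awebsite.org/", "was-created-by", "Buzz"),
    ("http://awebsite.org/", "was-created-by", "Suzy"),
    ("http://anotherwebsite.org/", "was-created-by", "Alice"),
]


def select_objects(triples, subject, predicate):
    """Return the object of every triple that matches subject and predicate.

    This plays the role of the ?x variable in the query: the subject and
    predicate are fixed, and whatever sits in the third position is returned.
    """
    return [o for (s, p, o) in triples if s == subject and p == predicate]


creators = select_objects(TRIPLES, "http://awebsite.org/", "was-created-by")
print(creators)  # ['Buzz', 'Suzy']
```

The triple about Alice is skipped because its subject does not match, which is exactly the filtering the query performs.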

In fact, these triples could be distributed all around the Web, and we could imagine a search engine taking this query and running it everywhere on the Web where was-created-by triples are stored, and then having it bring back all the creators of awebsite, even if there are a hundred developers, and even if these names are spread around the Internet.
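The distributed case can be sketched the same way. In the Python below, three in-memory lists stand in for triple stores held at different sites; a real system would reach them over the network, which this sketch leaves out entirely. The store names and the function query_everywhere are hypothetical.

```python
# Three hypothetical triple stores, standing in for data held at different sites.
STORE_A = [("http://awebsite.org/", "was-created-by", "Buzz")]
STORE_B = [
    ("http://awebsite.org/", "was-created-by", "Suzy"),
    ("http://anotherwebsite.org/", "was-created-by", "Alice"),
]
STORE_C = [("http://awebsite.org/", "was-created-by", "Buzz")]  # repeats a known fact


def query_everywhere(stores, subject, predicate):
    """Run the same pattern against every store and merge the answers."""
    found = []
    for store in stores:
        for s, p, o in store:
            if s == subject and p == predicate and o not in found:
                found.append(o)  # keep first-seen order, skip duplicates
    return found


all_creators = query_everywhere(
    [STORE_A, STORE_B, STORE_C], "http://awebsite.org/", "was-created-by"
)
print(all_creators)  # ['Buzz', 'Suzy']
```

Because every store uses the same triple shape, the answers can be merged without any site-specific translation, and that is what makes this kind of search plausible at Web scale.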

Next, the bigger issue.

In the next posting, we’ll look more closely at SPARQL. One thing we will consider is why it does look so much like SQL. There is a powerful reason for this that has to do with searching information in general.



Jul 10 2009   2:35AM GMT

The Semantic Web: RDF and SPARQL, part 2



Posted by: Roger “Buzz” King
the Semantic Web, RDF, triples, XML, URI's

This posting is a continuation of the previous posting. We are discussing RDF, the “triples” language that is serving as a cornerstone of the Semantic Web effort. In the previous posting, we looked at a simple RDF program, which creates a relationship between a web-based resource and the term “funstuff”; the relationship is called “topic”, thus telling us that the resource located at the given URL is something fun.

RDF and URI’s.

One interesting fact is that, although we only used URI’s for two parts of the RDF triple embedded in this RDF program, we could have used URI’s for all three pieces of the triple. Thus, the program from the previous blog posting (immediately below) might be changed to look like the second program below, which now has two triples in it:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:zx="http://www.someurl.org/zx/">

  <rdf:Description rdf:about="http://www.awebsite.org/index.html">
    <zx:topic>funstuff</zx:topic>
  </rdf:Description>

</rdf:RDF>

————-

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:zx="http://www.someurl.org/zx/">

  <rdf:Description rdf:about="http://www.awebsite.org/index.html">
    <zx:topic>funstuff</zx:topic>
    <zx:created-by>http://www.anotherurl.org/buzz</zx:created-by>
  </rdf:Description>

</rdf:RDF>

RDF and decentralized information.

As a reminder, the triple expressed in the first program can be stated as:

www.awebsite.org/index.html <topic> funstuff

So, what did we add in the second program?  There is a new triple that has been added.  It can be roughly stated as:

www.awebsite.org/index.html <created-by> http://www.anotherurl.org/buzz

In other words, our vocabulary defined at http://www.someurl.org/zx apparently has another standardized term called “created-by”.  The added triple in our second program says that the resource found at www.awebsite.org/index.html was created by someone who is identified by the URL http://www.anotherurl.org/buzz.

We see that the value in the first triple, which concerns the “topic” of our resource, consists of a character string, but the value in the second triple, which concerns the “created-by” of our resource, is actually a URL.

This is big.  It shows us that all three parts of a triple in RDF can be URI’s, and they can be distributed around the Internet.  This means that the information embedded in the triple is highly decentralized.
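Here is a small Python sketch, using only the standard library's XML parser, that pulls both triples out of the second program and reports whether each value is a plain string or a URI. Two caveats: the test for "is this a URI" is a crude simplification of my own, and standard RDF/XML would normally mark a URI-valued object with an rdf:resource attribute rather than putting it in the element's text, as this posting's simplified form does.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The second program from above, written with the standard xmlns attribute
# and straight quotes so that an XML parser accepts it.
DOC = """
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:zx="http://www.someurl.org/zx/">
  <rdf:Description rdf:about="http://www.awebsite.org/index.html">
    <zx:topic>funstuff</zx:topic>
    <zx:created-by>http://www.anotherurl.org/buzz</zx:created-by>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(xml_text):
    """Yield (subject, predicate, object, kind) for each property element."""
    root = ET.fromstring(xml_text)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree spells a namespaced tag as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            value = (prop.text or "").strip()
            # Simplification: treat any value that looks like a URL as a URI.
            kind = "uri" if value.startswith(("http://", "https://")) else "literal"
            yield subject, namespace + local, value, kind


triples = list(extract_triples(DOC))
for triple in triples:
    print(triple)
```

Notice that the predicate comes out as a full URI as well, built from the zx namespace plus the tag name, so in the second triple all three parts are URIs.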

The bottom line

This illustrates the power of RDF.  It can be used to express information which is not controlled in any centralized fashion.  RDF is thus the glue that can be used to bring diverse pieces of information together.  And it can use standardized, shared terminologies to precisely dictate the semantics of the triples in RDF programs.  In our example, the resource is defined by one URI, the kind of relationship is defined by another URI, and the value of that relationship is defined by yet another URI.

We will continue this in the next posting.


Jul 3 2009   4:28AM GMT

The Semantic Web: RDF and SPARQL, part 1



Posted by: Roger “Buzz” King
the Semantic Web, namespaces, RDF, SPARQL, XML, triples, automating Web searches

This blog is dedicated to advanced and emerging Web technology.  Each posting is meant to be understandable and informative on its own, but the blog as a whole tells a continuing story.

The Semantic Web.

In this posting, we will focus on the Semantic Web, which is a global effort at radically improving our ability to search the Web.

Currently, to search the web, we type keywords into a search engine like Google, which then searches its vast index of webpages for pages that have these keywords in them. Because this sort of search is very low-level, and not at all tied to the true meaning or purpose of the information stored in webpages, searching is painfully iterative and interactive.  A user must chase down countless URLs returned by a search engine to see if any of them are relevant.  Quite frequently, they are not.  And so, the user must refine the set of keywords and try again.  It might take many attempts before a satisfactory result is obtained.

One of the primary goals of the Semantic Web is to automate the process of searching the Web.  There are two stages to this.  First, people who post information on the Web must capture knowledge about the meaning of their information; this knowledge is commonly called “metadata”.  The metadata is then stored with the posted information.

The second stage happens when users search the Web.  Rather than using the low level keyword search approach, the search is at least partly automated.  The iterative process is sharply reduced by employing a smart search engine that knows how to find relevant information by searching for metadata that pertains precisely to whatever it is that the user is seeking.
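The difference between the two stages can be sketched in a few lines of Python. Everything here is invented for illustration: the two pages, their example.org URLs, and the single "topic" metadata field. A real smart search engine would be far more elaborate.

```python
# Hypothetical pages: each has free text plus metadata supplied by its author.
PAGES = [
    {
        "url": "http://example.org/jaguar-cars",
        "text": "The jaguar is fast and sleek.",
        "metadata": {"topic": "automobile"},
    },
    {
        "url": "http://example.org/jaguar-cats",
        "text": "The jaguar is fast and hunts at night.",
        "metadata": {"topic": "animal"},
    },
]


def keyword_search(pages, word):
    """Today's approach: return every page whose text contains the word."""
    return [p["url"] for p in pages if word.lower() in p["text"].lower()]


def metadata_search(pages, key, value):
    """Metadata approach: return pages whose author-supplied metadata matches."""
    return [p["url"] for p in pages if p["metadata"].get(key) == value]


print(keyword_search(PAGES, "jaguar"))            # both pages match
print(metadata_search(PAGES, "topic", "animal"))  # only the page about the animal
```

The keyword search leaves the user to sort cars from cats by hand, while the metadata search settles the question up front.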

The bottom line.

The goal?

The Semantic Web would be able to ease the burden of searching for information, as well as find vast stores of “hidden data” that reside in databases that are accessible via webpages, but whose contents right now are not seen by search engines.

Ultimately, we would want the Web to be entirely searchable by software, without any humans guiding the process.  This would be the true Semantic Web.

Namespaces and triples.

In past postings of this blog, we have discussed a handful of key approaches to implement the Semantic Web.  One idea is to tag information with standardized sets of terminology called “namespaces”.

We have also looked at the idea of embedding these tags in things called “triples”.  In this posting, we look at this concept more closely and consider an existing language that would allow people to specify these triples.

RDF and SPARQL.

The most well-known standard for specifying triples is RDF, which stands for the Resource Description Framework.  SPARQL is a query language, heavily influenced by SQL, that can be used to search data that has been structured using RDF.

This is the first of a series of blog postings in which we will first look at RDF, and then at SPARQL.  Then, we’ll consider the big issue: will RDF and SPARQL enable the development of the true Semantic Web?

RDF.

So, what is RDF?  At its highest level, RDF is used to describe anything that can be found on the Web.  RDF has an XML syntax; in other words, RDF can be written as an XML program, using a set of predefined “element” and “attribute” tags.   (XML and XML languages were discussed in an earlier posting of this blog, as was XML and declarative languages.)

We might remember that on its own, XML is impotent.  It is not in itself a programming language.  It is simply a language standard for taking a set of tags and using them as “elements” and “attributes” in declarative, data-intensive languages.  A good example is SMIL, which is used to define multimedia presentations.

Here is a fragment in RDF, using its XML syntax.  Note that XML languages are nested languages, in which elements are embedded inside one another: each element opens with a tag of the form <name> and closes with a matching tag of the form </name>.

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:zx="http://www.someurl.org/zx/">

  <rdf:Description rdf:about="http://www.awebsite.org/index.html">
    <zx:topic>funstuff</zx:topic>
  </rdf:Description>

</rdf:RDF>

This looks complicated, but it’s not.  This simple example illustrates the power of RDF.  It uses a set of standardized RDF-specific tags, and the second line of code tells us where these tags come from: the w3.org site, which contains a vast store of information about advanced web technology.  In other words, we can go to w3.org to find the precise definition of RDF specific tags.

RDF is engineered to also use other sets of tags, in particular, domain-specific tags.  In this example, these tags come from a (non-existing) URL called someurl.org.  The tags themselves are prefaced with “zx:” in the rest of the code, so we know which tags are native RDF and which come from a domain-specific set of tags  (called a namespace).

The XML “element” called Description is an RDF-specific tag that tells us we are giving the description of some resource on the Web, namely one at a (non-existing) website called awebsite.org.

The whole piece of code is one triple: It says that the topic of the resource at www.awebsite.org is funstuff.  Here it is as a triple, with all the XML syntax and the namespace information removed:

www.awebsite.org/index.html <topic> funstuff

Let’s overview this again.  RDF is an XML language, so it uses the syntax of XML.  One of the primary concepts in XML is that of an “element”, and Description is an XML element, one defined in the RDF standard.  The piece of code begins with two namespace statements, one telling us which RDF specification we are using, and the second telling us that we will also be using some tags from another, domain-specific specification, which includes the tag “topic”.  Then there is the guts of the triple, telling us that we are listing the topic of a Web-resident resource.
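One way to see what the two namespace statements buy us is to expand the prefixed tag names by hand. The Python sketch below does just that; the function name expand is my own, and the only rule it assumes is simple concatenation of the namespace URI and the local tag name.

```python
# Prefix-to-namespace bindings taken from the two namespace statements above.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "zx": "http://www.someurl.org/zx/",
}


def expand(qualified_name, namespaces=NAMESPACES):
    """Turn a prefixed name such as 'zx:topic' into a full URI."""
    prefix, local = qualified_name.split(":", 1)
    return namespaces[prefix] + local


print(expand("zx:topic"))         # http://www.someurl.org/zx/topic
print(expand("rdf:Description"))  # http://www.w3.org/1999/02/22-rdf-syntax-ns#Description
```

So the short tag “zx:topic” is really shorthand for a globally unique name, which is what lets different sites agree on what “topic” means.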

More on this in the next posting…


May 3 2009   3:00AM GMT

Email addresses, the new Web, and NASCAR.



Posted by: Roger “Buzz” King
NASCAR-like web ads, Web 2.0, Web 3.0, the Semantic Web, XML, namespaces, free email accounts, web services, web-based ads

The Semantic Web.

This blog concerns advanced Web technology, in particular, Web 2.0/3.0 and the Semantic Web. Each blog entry should be fully understandable on its own, but the blog as a whole tells a continuing story.

Very roughly, we’ve defined the Web 2.0/3.0 as the class of emerging web applications that are highly responsive, to the point of being competitive with desktop apps. Another characteristic is that they can manage large volumes of very complex media, like images, sound, and animation, as well as interconnected forms of media. We’ve looked at some specific advanced web applications.

Our concern here, in this blog entry, is the Semantic Web, which we have also roughly defined. The Semantic Web is something that does not yet exist, but would meet the very aggressive goal of supporting largely automatic web searches, freeing us from excruciatingly interactive, manual Google and Yahoo sessions. And we’ve seen that we would use such things as shared namespaces, intelligent full text searching, and XML-based markup languages to embed information in websites that could be used by smart browsers to perform far more accurate searches.

Web services would help a lot, too, by taking humans out of the loop when providing powerful web-based capabilities; one website can now provide a vast amount of information, for example, by silently using web services to collect information from many other web-based sources.

(By the way, we have also looked at precisely what we mean by “semantic” in the Semantic Web.)

The way we pay.

This all sounds very good. The Web would be far more useful, with automatically searchable Semantic Web-sites. But there’s a bad side to all of this, and it has to do with how we often pay for Web use.

The problem is that we often do not pay at all. At least not directly, with money. We pay by putting up with ads. Free email services, such as those hustled by Yahoo, Hotmail, AOL, and Mail.com, are generally accessed via web browsers, and we find the main pages of these email accounts stuffed with ads.

Some free email accounts even stick ads in your outgoing mail!

Often, the only way to get the ads stripped from a web mail interface is to pay a fee. We might also get more than just ad-free web mail pages; paying sometimes allows users to access their email with POP or IMAP protocols, via desktop clients (like Outlook and Apple Mail), thus avoiding ads in another way.

(As an aside, there are free email sites that either have no ads in them, or only very subtle ones. Try Gmail.com and Inbox.com. My favorite, with its clean interface and growing set of accompanying capabilities, is GMX.com.)

As it turns out, folks looking to buy ad space online find that they have a vast array of choices, and this drives down the cost of ad space. But these two things, an ever-growing list of free online services and cheap ad space, are related. This is because it is all too easy to build useful web applications. Like browsers, bulletin boards, calendar apps, blogging services, and stickies applications, email servers are cheap to build and maintain. Vendors can use canned, largely free software components.

And, transmission costs on the Internet are effectively free, and the bandwidth is huge. Free email accounts often offer a gigabyte or several gigabytes of storage, because disk space is dirt cheap, too.

There is a lot of rebranding going on, too, where someone seems to be offering free email (or some other service), but it is actually being provided by a large email provider.

So, the way things have shaken out, is that free web apps like email servers look like NASCAR racing cars, covered with colorful ads. Many of these ads consist of video, and so we have to battle distracting, flashing colors so we can focus on our mail.

The trick behind online ads.

There is something happening in the online ad world: folks who provide these free, pay-for-it-with-ads services are learning to carefully target ads. There is specialized software available for this, and by plugging in some smarts, folks can make the ads that appear on your screen far more likely to be of interest to you.

How is this done? By watching what you type into search engines, by taking advantage of personal information you supply when you sign up for free email accounts and other services, and by carefully examining the content of the messages you send and receive, that’s how it’s done.

It’s important to point out that this works. The “click through” rate on ads can be radically improved, just by using some simple heuristics in choosing your ads. Folks who pay for ads love this, and it has allowed individuals who don’t even provide free web applications to turn themselves into ad space sellers. Your blog, your specialized website, can now host ads carefully targeted toward the visitors to your blog or your website.

But just wait for the Semantic Web.

But it will really kick in when the Semantic Web is here. The same technology that would make browsers far, far smarter about finding good URLs for you will make the targeting of ads at you extremely precise.

This slowly-emerging technology is badly needed by the folks who sell ad space and by the people who buy that ad space. That’s because you and I are starting to get used to this world of NASCAR websites. We are looking through or past or around the ads. They need to be made a lot smarter, in order to get our attention back.

But by using Semantic Web technology to radically increase click-through rates, by getting us interested in ads again, impulse shopping on the Web might skyrocket. It’s very easy to go from seeing an ad for a product you have never heard of before to having bought it.

Like little kids watching commercials for sugar-heavy cereals on Saturday cartoon shows, we will be manipulated like we have never imagined before. That’s the bad side to the Semantic Web.



Apr 26 2009   8:09PM GMT

The world of advanced Web applications: what are they?



Posted by: Roger “Buzz” King
Web 2.0, Web 3.0, the Semantic Web, XML, mashups, wikis, social networking sites, tagging, distance education, zenbe.com, evernote, GlideOS, namespaces, web services

This blog is dedicated to an ongoing discussion of Web 2.0/3.0 and the Semantic Web. The slant is on the technology itself, how it works and what’s going on inside advanced Web applications. We’ve looked at a couple of different Web 2.0 applications, in particular, Evernote and GlideOS. We’ve tried to characterize the capabilities of Web apps.

The impact of the new Web.

This posting addresses a non-technical question: What has been the impact of this technology on our society?

Technological advancement can be very roughly broken into two groups: incremental and radical. Which of these is Web 2.0/3.0? Is it a radical advance?

Consider what highly responsive, multimedia web applications have done for us. They have enabled the development of:

* Wikis: These are web applications that allow us to collaboratively develop sophisticated, easily searchable information bases. These can range from dictionaries for specialized disciplines to vast databases containing DNA information. Data can be vetted by experts and/or challenged by random users.

Everybody knows about Wikipedia, but like blog and bulletin board software, wiki software can be easily installed and configured for deployment on almost any web server, whether it is publicly accessible, or used privately within a corporation or by a professional organization.

* Social networking sites: These are web applications that allow us to actively participate in a myriad of communities based on professional and personal interests. We find work, develop contacts, share music and photographs and video, and develop lifelong collaborations with people we would never have met otherwise.

They are also used by people who are in daily physical contact, but who find they can deepen their relationships by posting personal information on public sites like MySpace and Facebook. The interesting thing about these sites is that new and successful ones keep emerging.

* Tagged content vendor sites: Volunteers and paid individuals can contribute multimedia content and collaboratively tag it, using both freeform and highly sophisticated tagging protocols, such as the MPEG-7 standard. (We will look at MPEG-7 in a future posting of this blog.) These include images and sound and video, and many taggers are highly trained professionals who can carefully categorize content according to its detailed meaning. This technology makes a vast sea of otherwise-unknown assets available to us. It also makes these assets searchable, thus transforming a completely intractable task into something we easily perform.

In particular, this has radically enhanced the creative power of both professional and hobbyist animators by giving them complex scenery and character components to work with. Check out thoughtequity.com for an example of a content vendor. Take a look at daz3d.com for animation content.

* Mashups: These are portal or second tier web applications that take content from other web sources, such as Google Maps, investment information, medical advice, and scientific data. Often mashups take data from several or hundreds of other sites and create complex, highly valuable multimedia assets.

Take a look at woozor.com. It combines Google map and weather data.

* Distance learning: Universities, corporations, professional organizations, and lone instructors can develop and sell effective, multimedia educational packages that bring education to anyone who has Internet access. This allows us to retrain ourselves for new occupations, stay current in our professional skills, and find employment that is satisfying, steady, and high paying.

I teach on my university’s distance learning site, and we use video, sound, desktop video capture, slide presentations, and software demonstrations - and they can all be edited into a unified product. There are online universities now, where you can get a college degree. Take a look at jonesuniversity.com.

* Hybrid applications that support things like email, calendar, collaboration, RSS feeds, etc.

A good example of a hybrid application is zenbe.com, which provides a combined web-based email, list making, and calendar application, and in that sense is similar to many other email providers. But Zenbe also provides a collaborative tool called Zenbe Pages, which can be used by collaborators to organize their activities. A Zenbe page can have notes, calendars, lists, RSS feeds (not new ones, but existing RSS feeds) on them. Zenbe also provides quick access to Twitter, Google Talk, and Facebook.

By the way, it’s important to point out that the categories I list above are not as clear-cut as one might think. Many modern web apps contain elements from more than one of these categories.

The software building blocks.

From a programming perspective, what specific Web 2.0/3.0 software has allowed all of this to come about? We’ve discussed much of this already in previous postings of this blog. It includes XML and the exploding class of XML languages, namespaces, IDEs (Integrated Development Environments), large code bases (such as the vast library of ready-made Java components), web service software development tools, and AJAX web page optimization technology. It also includes web development frameworks like Ruby on Rails, and newer ones, engineered toward high responsiveness, like Flex and Silverlight.

Also included are powerful media formats, codecs, players, and editors, which allow web users to do more than upload and search media; we can edit it and reform video, images, and sound, without leaving the simple world of our browsers. And of course, modern mega media apps enable us to build media assets. The list of contributing software tools goes on, but we’ll stop here.

It scales!

And there is something subtle, but important that gives advanced web technology extraordinary power: it scales. We manage shared resources that are truly gigantic in size, and are spread across countless machines around the world. We leverage global user bases, cheap server technology, and wide open Internet bandwidth to give media stores belonging to Web apps astonishing growth rates.

The bottom line.

Yep. Web 2.0/3.0, as a whole, is a truly radical advancement. It has fundamentally and globally changed society in a big way.



Apr 19 2009   2:31AM GMT

There are Web apps and then there are Web apps.



Posted by: Roger “Buzz” King
Web 2.0, Web 3.0, the Semantic Web, web applications, Filemaker, evernote, SMIL, XML, Glide

In our continuing series on Web 2.0/3.0 and the Semantic Web, we have looked at one simple, yet impressive Web application, called Evernote. There are significant advantages of Web apps; in particular, the application is available wherever you can get onto the Web, you don’t have to run and maintain complex desktop software, and your data sits on a (hopefully) secure and backed-up data server.

Web Apps.

We noted that some Web apps, including Evernote, are both Web-based and desktop-based. Seemingly, this might be a disadvantage, because now, the user does have to install and maintain the desktop version of the app. But, in exchange, you have two copies of your data, at different physical locations. You also can use the app when you are not on the Internet. And, as far as Evernote goes, the desktop app is very far from difficult to manage.

Let’s look at this a little closer. Not all Web apps are the same. One problem is that too many vendors feel compelled to brag about the Web capabilities of their projects, and so we have to be suspicious - especially when it comes to older applications that have been retrofitted with Web capabilities.

Let’s look at a few applications. Please keep in mind that the first two applications are not advertised as “Web apps”. I am describing them only as a way of categorizing the Web capabilities of applications in general.

Minimal capabilities: exporting to the Web.

Our first example is an application that runs on Macs and is very impressive. It’s called Curio, and is made by a company called Zengobi. It gives you a workspace to which you can append text notes, lists, images, video, and sound clips. It also supports diagrammatic mind-maps. It’s great for a wide class of brainstorming techniques from simple note-taking to sophisticated workflow planning. Its all-in-one nature makes it a little imposing and chaotic at first, but it is actually quick to master - and then its freeform nature proves itself to be very powerful. It is also very elegant.

Curio’s Web capabilities are extremely limited, however. All you can do is output a Curio file as a fixed HTML page. It cannot be updated over the Web. For convenience, it can export a file directly to your “Mac” Web account, if you own one.

Modest, often tacked-on Web capabilities.

Another example application is Filemaker. (I am referring to their products called Filemaker Pro and Filemaker Pro Advanced, since they are what I have used in my classes at the University of Colorado.) I teach database management systems, and I can say lots of good things about Filemaker. It is a very quick and simple way to get a full-fledged, scalable, visually-pleasing desktop database up and running. I like it.

But its Web capabilities are typical of applications that have added Web capabilities long after the fact. What you can do with Filemaker is “publish” a database on the Web, and allow Web-based updating and searching. It in effect turns the machine hosting the database into a simple server. But most of Filemaker’s capabilities are not available via the Web interface. And, the database only exists on its original site. All data remains there.

Native, full Web capabilities.

So, what’s a true Web app? I’d say it is an application whose native interface is Web-based, and where all or virtually all of its capabilities are available via a browser. Evernote is a good example.

There is a fuzzy line between “websites” and “Web applications”, as we have previously discussed. And in fact, some people consider virtually all powerful websites to be Web apps. This includes Amazon, Blogger, and Wikipedia, as well as countless lesser-known websites.

And, with respect to the deliberately narrow criteria we’re using here, these applications are indeed Web apps.

So, what characteristics do we see in applications that are powerful, and have native, complete Web interfaces? They are likely to store data persistently in a serverized database management system like MySQL, and present the user with web forms to fill in, and return to the user dynamic Web pages populated from the database. A website that we might be willing to label “Web 2.0” would be one that is highly responsive and manages large amounts of data.
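The form-in, database, dynamic-page-out loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite (from Python’s standard library) stands in for a server database like MySQL, and the notes table, handle_form, and render_page are hypothetical names invented for this example.

```python
import sqlite3
from html import escape

# SQLite stands in for a server database such as MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (title TEXT, body TEXT)")

def handle_form(title, body):
    """Store one submitted web form in the database."""
    conn.execute("INSERT INTO notes VALUES (?, ?)", (title, body))

def render_page():
    """Build a page from whatever is currently in the database."""
    rows = conn.execute("SELECT title, body FROM notes ORDER BY rowid").fetchall()
    items = "".join(
        "<li><b>{}</b>: {}</li>".format(escape(t), escape(b)) for t, b in rows
    )
    return "<html><body><ul>{}</ul></body></html>".format(items)

handle_form("Groceries", "milk & eggs")
page = render_page()
```

The page is regenerated from the database on every request, which is what makes it “dynamic”; a real web app would add a web server, sessions, and so on around this core.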

We might call it Web 3.0 if it also manages large volumes of continuous data (like audio and video), and presents to the user a highly multimedia web interface. But these terms are vague, and drawing lines between them is to a certain degree misleading and a distraction.

Perhaps something that might be a truly Web 3.0 characteristic is that the application, rather than just delivering up video and audio, uses a combination of multiple forms of media, in concert, to interact with the user. We looked at SMIL, an XML language that allows the user to build presentations that coordinate multiple forms of media, such as images, sound, and video. The SMIL programmer can arrange media on the screen, and specify how the various pieces of media will be displayed over time.

Glide: the Web-based desktop.

But let’s look at one very, very aggressive attempt at a true Web 3.0 application. It’s called Glide, and you can get yourself a free account. This application does not support any sort of desktop-based version, and so you do have to be online to use it. It also needs a very fast Internet connection, because of the wide variety and high volume of data it allows you to manipulate.

What’s Glide? It is advertised as “the complete mobile desktop solution”, and it provides a complete, virtual, web-based computer. With it, you can edit photos, draw diagrams, store media files, send and receive email, manage a calendar, manage video, write documents, even build a website - in other words, do almost everything a non-programmer might want to do with a computer.

Its interface consists of three main windows. One is a virtual desktop, with various applications ready to use; another is a portal where the user can access the Web and develop websites; the third is a virtual hard drive, where media and files created by the various applications can be stored and accessed.

Is this the way of the future? It completely frees a user from having to buy, install, and maintain complex, expensive applications, although you still need a computer with a browser to run it. One drawback is that none of its apps, as near as I could tell, can compete with the dominant desktop applications. It is not Photoshop, it is not Dreamweaver, and it is not MS Office Outlook. But its apps are not trivial: they do the job just fine. And the entire interface is simple and visually pleasing.

There is also a way to sync your files on your desktop with the files on the Glide servers, and their documents and spreadsheets are apparently compatible (to some degree) with Microsoft’s Word and Excel. But they apparently are not planning on creating any sort of hybrid web/desktop based product. Glide’s goal is to move us all toward the Web and away from our desktops.

The Glide servers seemed fast enough to me, by the way. That’s the big question. Can it be as responsive as a desktop computer? Well, it’s as fast as my Vista machine… But slower than my iMac.

Give it a try.




Apr 2 2009   5:59AM GMT

Full Text searching: clever heuristics for managing large web-based document collections.



Posted by: Roger “Buzz” King
XML, the Semantic Web, SMIL, web applications, Web 2.0, documents, Web 3.0, databases, MySQL, SQL Server, Multimedia, full text, full text searching

There is an explosion of technology for supporting sophisticated forms of media on websites and in web applications. In our continuing series on advanced web applications (in particular, as they pertain to the Semantic Web and Web 2.0/3.0), we’ve looked at continuous media, in particular, video and multimedia presentations. But there is a very old form of continuous media, something that is perhaps the dominant media on the Web, and that’s text.

It’s becoming a very major issue in web development.

Text.

In this blog entry, we’ll be looking at a particular form of text, called “full text”.

But just what is text to begin with? It’s character-based data, anything we can read.

And what will we want to do with it in next-generation web applications? It’s important to note that more and more vast libraries of documents are being put online. Web applications need to provide far faster and more accurate searches of documents than what we can perform with Google.

Interestingly, a successful technology, called “full text retrieval”, is already in place in the relational database systems that underlie modern web applications. It’s there working for us, and we are likely to not be aware of how clever it is.

It’s also something that should be used much more heavily by web application developers.

Let’s step back and consider three different - and increasingly more sophisticated - ways of managing character data.

Atomic Character Attributes.

First, there is the traditional relational database approach, whereby data is stored as tables made of rows of atomic, fixed sized attributes. By atomic, we mean that each attribute has no internal structure. So, a table of insurance claims might have rows with the following attributes: Claims-ID (an integer), Amount (an integer), Medical_Problem (a fixed length character string), and Subscriber_Name (a fixed length character string). Using SQL, the universal database “query” language, we might look for all rows that contain the name “Fred Jones”. Or, we might search for all rows that have claim numbers that are between 110 and 115.

Essentially, this approach limits us to comparing small strings of data to each other or to fixed values. There are some common extensions that we find in relational databases, such as being able to ask for all rows where the Medical_Problem is something like “broken leg”. Then if a row actually has the value “broken legs”, we would most likely see this row in our results.
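Here is a small sketch of those three kinds of comparison: an exact match, a range, and a “something like” pattern match. It uses SQLite through Python’s standard library as a stand-in for whatever relational system is actually in use. The sample rows are invented, and the column is spelled Claim_ID because a hyphen is not legal in an unquoted SQL identifier.

```python
import sqlite3

# Hypothetical claims table; attribute names follow the post, rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Claims ("
    "Claim_ID INTEGER, Amount INTEGER, "
    "Medical_Problem TEXT, Subscriber_Name TEXT)"
)
conn.executemany(
    "INSERT INTO Claims VALUES (?, ?, ?, ?)",
    [
        (109, 500, "sprained wrist", "Ann Lee"),
        (112, 1200, "broken legs", "Fred Jones"),
        (114, 300, "flu", "Fred Jones"),
        (120, 900, "broken leg", "Sam Ortiz"),
    ],
)

# Exact match on an atomic string attribute.
by_name = conn.execute(
    "SELECT Claim_ID FROM Claims WHERE Subscriber_Name = ? ORDER BY Claim_ID",
    ("Fred Jones",),
).fetchall()

# Range comparison on an atomic integer attribute.
by_range = conn.execute(
    "SELECT Claim_ID FROM Claims "
    "WHERE Claim_ID BETWEEN 110 AND 115 ORDER BY Claim_ID"
).fetchall()

# Pattern match: the trailing % lets "broken leg" also match "broken legs".
by_like = conn.execute(
    "SELECT Claim_ID FROM Claims WHERE Medical_Problem LIKE ? ORDER BY Claim_ID",
    ("broken leg%",),
).fetchall()
```

Notice that the pattern match still only compares characters. It would not find a row that said “fractured femur”.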

Full Text.

Second, there is the ability to search pieces of text according to their natural language (in this case, English) meaning. In this case, we consider the character data to have internal structure, and the values are not considered atomic. Often, these pieces of text are long and of variable length from one row to the next.

It is actually an extension - but a very dramatic one - of the LIKE operator in SQL.

It is what we call “full text” management or retrieval, and modern relational database management systems like MySQL and Microsoft SQL Server support this. This was seen long ago as a critical extension to relational database technology. Thus, we might rename our Medical_Problem field to Doctor’s_Diagnosis, and allow free form English text in this attribute, as well as allowing the value to be quite long. Then we might search for all rows where the doctor describes “fractures of the lower limbs”. Notice that none of these words might actually appear in the attribute, which might simply refer to “broken legs”.

Natural Language Processing.

This capability would clearly be very powerful, if we could do it right. The problem is that to support it fully, we would need to use highly advanced natural language processing techniques, which are very time consuming to execute, especially on huge databases of large documents. The full text approach tries to simulate true natural language searching in a far less expensive way. The real thing, by the way, might not be all that accurate anyway. Natural language is naturally ambiguous and very subtle.

True natural language searching would be our third way of processing character-based data, by the way. It is not a fully developed technology. And importantly, we usually don’t need anything that fancy.

The Clever Compromise.

So, our middle option, full text searching, is what dominates today - and it is a surprisingly accurate, and efficient, technique that operates on a small set of heuristics. It can transform a dumb webpage where we can only search for small, fixed character strings, to a rich next-generation webpage that can effectively be searched according to its meaning. It allows us to manage very large text documents in web applications - and get us surprisingly close to the semantic power of true natural language searching.

We’re not going to go into a lot of detail here, but here are some of the heuristics that are used in full text search. First, “stemming” and related techniques are used; they reduce conjugated verbs and plural nouns to a common root by removing prefixes and suffixes. Another technique is to use a “stop list” that lists words that should be ignored, like “the”. The system might also let us specify the “proximity” of words; this refers to how closely specific words should appear in a document. It can also be powerful to include a synonym checker. And the ability to allow for “wild cards”, in particular, placeholders for letters that may vary in a word without changing its meaning, can be quite useful. Dictionaries of technical words that pertain to specific domains (like medicine or law) are very useful. We might also provide a feedback capability, whereby users can train full text search engines to be more accurate.
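To make a few of these heuristics concrete, here is a toy sketch of a stop list, a very crude stemmer, and a synonym table working together. It is not how MySQL or SQL Server implement full text search; the word lists and the functions stem, normalize, and score are all invented for illustration.

```python
import re

# Tiny hand-made word lists; real systems ship far larger ones.
STOP_WORDS = {"the", "of", "a", "an", "and", "in", "to"}
SYNONYMS = {"fracture": "break", "broken": "break", "limb": "leg"}

def stem(word):
    """Very crude suffix stripping; real stemmers are far more careful."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def normalize(text):
    """Reduce a piece of text to a set of canonical terms."""
    terms = set()
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in STOP_WORDS:
            continue
        word = SYNONYMS.get(word, word)   # e.g. "broken" -> "break"
        word = stem(word)                 # e.g. "limbs" -> "limb"
        word = SYNONYMS.get(word, word)   # e.g. "limb" -> "leg"
        terms.add(word)
    return terms

def score(query, document):
    """Fraction of the query's terms that also appear in the document."""
    q, d = normalize(query), normalize(document)
    return len(q & d) / len(q) if q else 0.0
```

With these toy lists, the query “fractures of the lower limbs” shares two of its three terms with the text “Patient has broken legs”, even though the two have no words in common as written.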

This clearly doesn’t come anywhere near true natural language processing - but it is fast. It will be a growing technology on the new web, with a lot of hidden development, making this heuristic-based technique more and more effective.

Indexing.

We should note that there is a significant up front cost in preparing a document for full text searching: we need to build an index with an entry for every (non-stop) word in the text. Then, when a query is executed, we can look for words in the document by searching the index, instead of searching the full text. If there were no index, the search would be extremely time-consuming.
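The index idea can be sketched as follows. This is a minimal illustration, not a production design: the stop list and the three sample documents are invented, and build_index and search are hypothetical helper names.

```python
import re
from collections import defaultdict

STOP_WORDS = {"the", "a", "of", "and", "in", "was", "with"}

def build_index(documents):
    """Map each non-stop word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every non-stop word in the query."""
    words = [w for w in re.findall(r"[a-z]+", query.lower())
             if w not in STOP_WORDS]
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

documents = {
    1: "The patient was treated for a broken leg.",
    2: "Claim filed for a broken window in the clinic.",
    3: "The leg was placed in a cast.",
}
index = build_index(documents)
```

Building the index touches every word of every document once. After that, a query such as search(index, "broken leg") only looks up two entries and intersects two small sets, no matter how long the documents are.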

The Future.

As more and more governmental, educational, medical, and other complex documents become available on the web, advanced full text searching will enable us to search vast databases in a tractable fashion. Even more clever full text retrieval engines will turn dumb, “gotta Google them” document portals into true Web 3.0 and Semantic Web applications.



Mar 26 2009   11:31PM GMT

SQL and XML: declarative is exciting



Posted by: Roger “Buzz” King
namespaces, SMIL, XML

In the continuing series of blogs on the Semantic Web and other advanced web technology, we’ve looked at XML as a cornerstone of the technology that allows us to markup data, and in combination with namespaces, create powerful tools for sharing the meaning - and not just the structure - of data. There’s something special about XML that is at the core of its truly amazing widespread adoption, that explains its versatility as the language of choice for tagging data, no matter what the purpose.

What is it?

It’s that XML is “declarative”.

A declarative language is one that allows us to write programs that tell us what needs to be computed, not the order in which primitive operations need to be carried out in order to get the result. Java, C++, JavaScript, C#, Objective C, ActionScript, PHP - none of these are declarative.

Some Non-Declarative Code.

Here’s some code:

for ( i = 0; i < 100; i++ )
    stuff[i] = stuff[i] + 1;

It says to start i at 0, then add 1 to i until you get to 99, and each time i is incremented, add 1 to that element in an array called stuff.

This manipulating-an-array program is the classic piece of non-declarative code. It doesn’t just say to add one to every element in an array, it also tells the order in which to do it. This extra information shouldn’t really be needed, but in non-declarative languages - known as “imperative” languages - it is frequently necessary.
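The same contrast can be seen inside a single language. The sketch below is in Python rather than the C-style code above, and the three-element list is just sample data. The first version manages the index and the order of the updates by hand; the second only describes the result.

```python
stuff = [10, 20, 30]

# Imperative style: spell out the index and the order of the updates.
imperative = list(stuff)
for i in range(len(imperative)):
    imperative[i] = imperative[i] + 1

# Closer to declarative: describe the result ("each element plus one")
# without managing an index variable.
declarative = [x + 1 for x in stuff]
```

This is only an approximation of the idea. Python still evaluates the second form from left to right; it just hides the bookkeeping from the programmer.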

Some Declarative Code.

Now, here is some declarative code. It’s SQL, the universal database language:

SELECT Firstname
FROM Clients
WHERE (Lastname = 'Smith') AND (City = 'Boulder') AND (Bday BETWEEN '2/10/1970' AND '2/10/1980')

Clients is a relational “table”, and Firstname, Lastname, City, and Bday are all “attributes” (or “columns”) of that table.

This piece of code gives us the first name of any client whose last name is Smith, and who is from Boulder, and was born between Feb 10 of 1970 and Feb 10 of 1980.

Notice that it tells the computer what data we want, and not the sequence of steps that must be carried out to return the value. We don’t know in what order the rows in the table will be examined. We don’t know if the three conditions will all be checked at once, or if we will filter the table first by picking out all clients who are from Boulder.
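To try the query out, here is a sketch that runs it against a small in-memory SQLite database from Python. The sample rows are invented. One assumption to flag loudly: the dates are stored as ISO strings (year-month-day) so that plain string comparison in BETWEEN agrees with calendar order; other database systems have real date types and their own literal formats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Clients (Firstname TEXT, Lastname TEXT, City TEXT, Bday TEXT)"
)
conn.executemany(
    "INSERT INTO Clients VALUES (?, ?, ?, ?)",
    [
        ("Alice", "Smith", "Boulder", "1975-06-01"),
        ("Bob", "Smith", "Denver", "1972-03-15"),
        ("Carol", "Smith", "Boulder", "1985-09-30"),
        ("Dave", "Jones", "Boulder", "1976-01-20"),
    ],
)

query = """
    SELECT Firstname
    FROM Clients
    WHERE (Lastname = 'Smith') AND (City = 'Boulder')
      AND (Bday BETWEEN '1970-02-10' AND '1980-02-10')
"""
rows = conn.execute(query).fetchall()

# Ask SQLite how it decided to carry out the query. The plan is the
# engine's choice; nothing in the query text dictates it.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
```

The query says nothing about scanning, filtering order, or indexes. Those decisions show up only in the plan, and a different engine (or the same engine with an index added) is free to choose differently.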

XML is Declarative.

XML is a declarative language, too. Let’s look at it.

This is the XML from the previous posting of this blog.

<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true">
  <head>
    <meta name="title" content="Buzz's Video"/>
    <layout>
      <root-layout background-color="white" width="320" height="290"/>
      <region id="videoregion" top="0" left="0" width="320" height="290"/>
    </layout>
  </head>
  <body>
    <seq>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
    </seq>
  </body>
</smil>

An XML program consists of “elements” and “attributes”. Notice <head> and </head> form the bounds for an element, as do <body> and </body>. Also note that there is an element nested within <head> and it’s marked by <layout> and </layout>. The tags <seq> and </seq> denote an element inside <body>.

The other major construct in XML is called an attribute, and name and content are two attributes with values “title” and “Buzz’s Video”, respectively. Attributes are always simple character values, and therefore cannot be nested.

When this program is saved with the name buzz.smil, and then run by Quicktime, it will download a video from my website (a very nice piece of animation by a student named Jochen Wendel), and then play it twice in succession. See the previous blog for more of an explanation of how SMIL works. It also discusses the difference between XML and its extensions, such as SMIL.

Note that these tags are not part of XML itself; rather they belong to the vocabulary that has been defined for the SMIL extension of XML. This illustrates the power of XML: it can be used to define other languages.

To understand the XML above, all that is needed is a program that knows how to interpret XML that contains these tags. (The URL at the beginning of the code names the namespace for Apple’s Quicktime-specific additions, which is where the qt:autoplay and qt:time-slider attributes come from.) In this case, the code defines a layout for the screen, and says that a video should be played twice, sequentially. Quicktime has been programmed to understand the elements and attributes of the SMIL XML language.
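To give a feel for what “a program that knows how to interpret these tags” means, here is a toy sketch that reads the SMIL file with Python’s standard XML parser and reports what it asks for. It does not play anything, and it is certainly not how Quicktime works internally; the describe function is a hypothetical name for this example.

```python
import xml.etree.ElementTree as ET

SMIL_SOURCE = """
<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions"
      qt:autoplay="true" qt:time-slider="true">
  <head>
    <meta name="title" content="Buzz's Video"/>
    <layout>
      <root-layout background-color="white" width="320" height="290"/>
      <region id="videoregion" top="0" left="0" width="320" height="290"/>
    </layout>
  </head>
  <body>
    <seq>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
    </seq>
  </body>
</smil>
"""

def describe(smil_text):
    """Summarize the layout regions and the playback mode of a SMIL file."""
    root = ET.fromstring(smil_text.strip())
    regions = [r.get("id") for r in root.iter("region")]
    container = root.find("body")[0]  # first child of <body>: <seq> or <par>
    mode = {"seq": "sequential", "par": "parallel"}.get(container.tag, "unknown")
    videos = [(v.get("src"), v.get("region")) for v in container.iter("video")]
    return {"regions": regions, "mode": mode, "videos": videos}

summary = describe(SMIL_SOURCE)
```

The interpreter decides what each tag means. The XML itself only declares that there is one region and a sequence of two videos.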

Going from “sequential” to “parallel”.

To make our point stronger, here’s a variation. Instead of playing the two videos sequentially, I am using the <par> and </par> tags that represent “parallel” in the SMIL namespace. I have also made the layout area twice as big, and broken it up into two regions. Now, the program plays the video twice, side-by-side, one in each region. At the bottom of this blog entry is what you should see if you save it as buzz.smil and run it with Quicktime. There is also a nice soundtrack.

<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true">
  <head>
    <meta name="title" content="Buzz's Video"/>
    <layout>
      <root-layout background-color="white" width="640" height="290"/>
      <region id="videoregion" top="0" left="0" width="320" height="290"/>
      <region id="videoregion2" top="0" left="320" width="320" height="290"/>
    </layout>
  </head>
  <body>
    <par>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion2"/>
    </par>
  </body>
</smil>

IMPORTANT.

Notice that this program, written in the SMIL extension of XML, is quite declarative: it says to create a layout, break it into two regions, and then place the animation (a .mov video) in both regions, in parallel. It does not say how to do this. The program doesn’t specify the sequence of steps that are needed to get the job done - rather, what the result should look like.

This makes XML programs far easier to read than programs in imperative languages, thus making the programs easier for a programmer to write, and easier for another programmer to read and perhaps change later on. This makes programs in XML far more likely to be written correctly and then used appropriately.

We’ll look at declarative languages again, in future entries of this blog.

Image: a SMIL XML program that plays a video twice, in parallel.


Mar 21 2009   3:17AM GMT

XML and its powerful children



Posted by: Roger “Buzz” King
XML, SMIL, namespaces, Quicktime, the Semantic Web

A key purpose of this blog is to provide a continuing examination of the Semantic Web - and certainly one of the most critical technologies to discuss is XML. Why is it so important?

First of all, just what is XML?

XML stands for eXtensible Markup Language; the extensible part is the key to its power.

Markup Languages.

Let’s step back though and look at the markup part first. “Markup” refers to the process of embedding commands in data. HTML is a markup language. When a browser fetches a web page from a web server, it processes the text-based HTML “markups” that appear in the page in order to present the page to us.

HTML: a Markup Language.

Importantly, HTML is focused on the visual appearance of information. It controls the layout of web pages, including “controls” such as menus and buttons. It also allows us to link pages together. One of the biggest jobs of HTML is to tell the browser how to layout pieces of text, such as the descriptions of books sold by Amazon.

HTML has a fixed set of legal tags. Here is a sample HTML file:

<html>
<body>
<h1> This is a heading </h1>
</body>
</html>

Notice that every tag comes in pairs, one with a “<>” and the other with a “</>”.

This HTML opens by telling us that it is an html file. Then it says there is a body to the file, and that there is a heading to be printed. This file will print the words “This is a heading”.

The important point, though is that these tags - html, body, and h1 - are HTML specific tags, and we cannot invent our own.

XML: a Far More Powerful Markup Language.

Now, let’s see what happens when we can invent our own tags.

XML is also a markup language. It was developed as a way to embed markups in data, so that the meaning of information can be communicated. In order to do this, XML allows us to do something we cannot do with HTML: we can specify our own “tags” so that we can add a lot more expressive power to our markups. There are two particularly critical aspects of XML tags.

The first is that there are two main sorts of tags: “elements” and “attributes”. Elements can have complex structure, and in fact, we can embed elements inside elements. Attributes are simple values and have no internal structure.

The second critical thing is that we can use words taken from shared namespaces as values in tags in XML. This gives XML the power of shared, detailed terminologies that are available globally via the web.

The real power of XML is this: we can produce our own extensions of XML by defining our own tags. Each of these extensions is itself a complete markup language.
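Here is a small sketch of what inventing our own tags looks like in practice, using Python’s standard XML library. The tags claim, subscriber, name, and diagnosis, and the attributes id and amount, are made up for this example; they are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical tags invented for this example.
# Attributes (id, amount) are simple values; elements can nest.
claim = ET.Element("claim", {"id": "112", "amount": "1200"})
subscriber = ET.SubElement(claim, "subscriber")
ET.SubElement(subscriber, "name").text = "Fred Jones"
ET.SubElement(claim, "diagnosis").text = "broken legs"

# Serialize to text, as it would be stored or sent over the web.
xml_text = ET.tostring(claim, encoding="unicode")

# Any program that knows these tags can read the data back by name.
parsed = ET.fromstring(xml_text)
name = parsed.find("subscriber/name").text
amount = int(parsed.get("amount"))
```

The name element sits inside the subscriber element, which sits inside claim, while id and amount are flat attribute values. A program can pull out the subscriber’s name without guessing at the page layout, which is exactly what HTML alone does not allow.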

This is why it is such a critical part of Semantic Web technology: we can use it to capture the meaning (or “semantics“) of data so that it can be processed automatically. HTML controls the way a page is displayed only, and we have to use our eyes and minds to interactively interpret this information. But XML can be interpreted by a program, thus allowing powerful, automatic searching of the web.

An Example of an XML Extension.

Let’s look at an XML-based language, in particular, at its use of elements, attributes, and values from a shared namespace.

Below is a piece of code written in an XML extended language called SMIL. SMIL allows us to create multimedia presentations, with various pieces of media laid out on the display, as well as being sequenced in time.  (SMIL stands for Synchronized Multimedia Integration Language.)

First, let’s start with the core of a SMIL program:

<smil>
<head>
<layout>

… here is where we put commands that control the visual layout of the page we are constructing with SMIL …

</layout>
</head>
<body>

… this is where we put the core of our SMIL program, the part that specifies the multimedia presentation that is to appear in the page …

</body>

</smil>

Here is the entire program, fleshed out:

<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true">
  <head>
    <meta name="title" content="Buzz's Video"/>
    <layout>
      <root-layout background-color="white" width="320" height="290"/>
      <region id="videoregion" top="0" left="0" width="320" height="290"/>
    </layout>
  </head>
  <body>
    <seq>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
    </seq>
  </body>
</smil>

We don’t need to worry about the specifics of this code. The values of attributes are in quotes, and the contents of an element sit between its opening tag, such as <seq>, and its closing tag, such as </seq>. So, background-color is an attribute, and video is an element.

Let’s look at the beginning of this program:

<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true">

This code refers to a namespace. That’s what xmlns stands for, by the way: XML namespace - i.e., a named set of attribute and element tags. Here, the prefix qt is tied to the namespace for Apple’s Quicktime extensions to SMIL, which is where the qt:autoplay and qt:time-slider attributes come from. The <smil> tag identifies our program as a SMIL file, and together with the namespace this tells Quicktime, which can play SMIL files, how to interpret it.
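To see how a program sees a namespace, here is a small sketch using Python’s standard XML parser on just the opening tag (closed immediately so that it is a complete document). The parser treats the URL purely as a name: it attaches it to each qt: attribute and never downloads anything.

```python
import xml.etree.ElementTree as ET

QT_NS = "http://www.apple.com/quicktime/resources/smilextensions"

# The opening tag from the program, closed right away for this example.
opening = (
    '<smil xmlns:qt="' + QT_NS + '" '
    'qt:autoplay="true" qt:time-slider="true"></smil>'
)
root = ET.fromstring(opening)

# ElementTree expands each prefixed name to "{namespace-uri}local-name".
autoplay = root.get("{" + QT_NS + "}autoplay")
attribute_names = sorted(root.attrib)
```

The prefix qt is only shorthand inside the file. What a program actually matches on is the full URL plus the local name, so two vocabularies can both define an attribute called autoplay without colliding.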

To see this, do this: Download Quicktime from Apple, if you don’t have it. Then put the above program in a file called buzz.smil. Then open buzz.smil with Quicktime.

Quicktime will read the file, then read the tags inside the SMIL program and use them to interpret the rest of the code. This will direct it to download a video from my site - an excellent piece of animation built by one of my Intro to 3D Animation students, named Jochen Wendel. And in fact, Quicktime will play it twice - that’s what the <seq> </seq> tags mean: play the enclosed videos one after the other, and the same video is listed twice.

The Exciting Part.

Do you see what happened? We used a predefined namespace belonging to the SMIL extension of XML to write a program that can find a video, download it, and play it twice!

Why do we care?  It’s not that building a language to play various pieces of media, like Jochen’s animation, is a big deal in itself.  It’s that XML is extremely versatile.  By defining a set of tags and then sharing them, we can embed within information the means for interpreting it - and thereby create an endless array of powerful languages.

This is very important. XML and its powerful children (such as SMIL) are changing the web in a big way.

There’s something  else, too, something that is equally important.  XML is declarative. We’ll look at this in another blog soon, but essentially it means that an XML language like SMIL is easier to read than imperative code, like Java or C.  Look at my SMIL program, and then look at a Java program.  Which is easier to understand?  We’ll get to this.