Buzz’s Blog: On Web 3.0 and the Semantic Web


April 26, 2009  8:09 PM

The world of advanced Web applications: what are they?



Posted by: Roger King
distance education, evernote, GlideOS, mashups, namespaces, social networking sites, tagging, the Semantic Web, Web 2.0, Web 3.0, web services, wikis, XML, zenbe.com

This blog is dedicated to an ongoing discussion of Web 2.0/3.0 and the Semantic Web. The slant is on the technology itself, how it works and what’s going on inside advanced Web applications. We’ve looked at a couple of Web 2.0 applications, in particular Evernote and GlideOS. We’ve tried to characterize the capabilities of Web apps.

The impact of the new Web.

This posting addresses a non-technical question: What has been the impact of this technology on our society?

Technological advancement can be very roughly broken into two groups: incremental and radical. Which of these is Web 2.0/3.0? Is it a radical advance?

Consider what highly responsive, multimedia web applications have done for us. They have enabled the development of:

* Wikis: These are web applications that allow us to collaboratively develop sophisticated, easily searchable information bases. These can range from dictionaries for specialized disciplines to vast databases containing DNA information. Data can be vetted by experts and/or challenged by random users.

Everybody knows about Wikipedia, but like blog and bulletin board software, wiki software can be easily installed and configured for deployment on almost any web server, whether it is publicly accessible, or used privately within a corporation or by a professional organization.

* Social networking sites: These are web applications that allow us to actively participate in a myriad of communities based on professional and personal interests. We find work, develop contacts, share music and photographs and video, and develop lifelong collaborations with people we would never have met otherwise.

They are also used by people who are in daily physical contact, but who find they can deepen their relationships by posting personal information on public sites like MySpace and Facebook. The interesting thing about these sites is that new and successful ones keep emerging.

* Tagged content vendor sites: Volunteers and paid individuals can contribute multimedia content and collaboratively tag it, using both freeform tags and highly sophisticated tagging protocols, such as the MPEG-7 standard. (We will look at MPEG-7 in a future posting of this blog.) The content includes images, sound, and video, and many taggers are highly trained professionals who can carefully categorize content according to its detailed meaning. This technology makes a vast sea of otherwise-unknown assets available to us. It also makes these assets searchable, thus transforming a completely intractable task into something we easily perform.

In particular, this has radically enhanced the creative power of both professional and hobbyist animators by giving them complex scenery and character components to work with. Check out thoughtequity.com for an example of a content vendor. Take a look at daz3d.com for animation content.

* Mashups: These are portal or second tier web applications that take content from other web sources, such as Google Maps, investment information, medical advice, and scientific data. Often mashups take data from several or hundreds of other sites and create complex, highly valuable multimedia assets.

Take a look at woozor.com. It combines Google map and weather data.

* Distance learning: Universities, corporations, professional organizations, and lone instructors can develop and sell effective, multimedia educational packages that bring education to anyone who has Internet access. This allows us to retrain ourselves for new occupations, stay current in our professional skills, and find employment that is satisfying, steady, and high paying.

I teach on my university’s distance learning site, and we use video, sound, desktop video capture, slide presentations, and software demonstrations – and they can all be edited into a unified product. There are online universities now, where you can get a college degree. Take a look at jonesuniversity.com.

* Hybrid applications that support things like email, calendar, collaboration, RSS feeds, etc.

A good example of a hybrid application is zenbe.com, which provides a combined web-based email, list making, and calendar application, and in that sense is similar to many other email providers. But Zenbe also provides a collaborative tool called Zenbe Pages, which can be used by collaborators to organize their activities. A Zenbe page can have notes, calendars, lists, and RSS feeds (not new ones, but existing RSS feeds) on it. Zenbe also provides quick access to Twitter, Google Talk, and Facebook.

By the way, it’s important to point out that the categories I list above are not as clear-cut as one might think. Many modern web apps contain elements from more than one of these categories.

The software building blocks.

From a programming perspective, what specific Web 2.0/3.0 software has allowed all of this to come about? We’ve discussed much of this already in previous postings of this blog. It includes XML and the exploding class of XML languages, namespaces, IDEs (Integrated Development Environments), large code bases (such as the vast library of ready-made Java components), web service software development tools, and AJAX web page optimization technology. It also includes web development frameworks like Ruby on Rails, and newer ones, engineered toward high responsiveness, like Flex and Silverlight.

Also included are powerful media formats, codecs, players, and editors, which allow web users to do more than upload and search media; we can edit it and reform video, images, and sound, without leaving the simple world of our browsers. And of course, modern mega media apps enable us to build media assets. The list of contributing software tools goes on, but we’ll stop here.

It scales!

And there is something subtle, but important that gives advanced web technology extraordinary power: it scales. We manage shared resources that are truly gigantic in size, and are spread across countless machines around the world. We leverage global user bases, cheap server technology, and wide open Internet bandwidth to give media stores belonging to Web apps astonishing growth rates.

The bottom line.

Yep. Web 2.0/3.0, as a whole, is a truly radical advancement. It has fundamentally and globally changed society in a big way.


April 19, 2009  2:31 AM

There are Web apps and then there are Web apps.



Posted by: Roger King
evernote, Filemaker, Glide, SMIL, the Semantic Web, Web 2.0, Web 3.0, web applications, XML

In our continuing series on Web 2.0/3.0 and the Semantic Web, we have looked at one simple, yet impressive Web application, called Evernote. There are significant advantages of Web apps; in particular, the application is available wherever you can get onto the Web, you don’t have to run and maintain complex desktop software, and your data sits on a (hopefully) secure and backed-up data server.

Web Apps.

We noted that some Web apps, including Evernote, are both Web-based and desktop-based. Seemingly, this might be a disadvantage, because now, the user does have to install and maintain the desktop version of the app. But, in exchange, you have two copies of your data, at different physical locations. You also can use the app when you are not on the Internet. And, as far as Evernote goes, the desktop app is very far from difficult to manage.

Let’s look at this a little closer. Not all Web apps are the same. One problem is that too many vendors feel compelled to brag about the Web capabilities of their projects, and so we have to be suspicious – especially when it comes to older applications that have been retrofitted with Web capabilities.

Let’s look at a few applications. Please keep in mind that the first two applications are not advertised as “Web apps”. I am describing them only as a way of categorizing the Web capabilities of applications in general.

Minimal capabilities: exporting to the Web.

Our first example is an application that runs on Macs and is very impressive. It’s called Curio, and is made by a company called Zengobi. It gives you a workspace to which you can append text notes, lists, images, video, and sound clips. It also supports diagrammatic mind-maps. It’s great for a wide class of brainstorming techniques from simple note-taking to sophisticated workflow planning. Its all-in-one nature makes it a little imposing and chaotic at first, but it is actually quick to master – and then its freeform nature proves itself to be very powerful. It is also very elegant.

Curio’s Web capabilities are extremely limited, however. All you can do is output a Curio file as a fixed HTML page. It cannot be updated over the Web. For convenience, it can export a file directly to your “Mac” Web account, if you own one.

Modest, often tacked-on Web capabilities.

Another example application is Filemaker. (I am referring to their products called Filemaker Pro and Filemaker Pro Advanced, since they are what I have used in my classes at the University of Colorado.) I teach database management systems, and I can say lots of good things about Filemaker. It is a very quick and simple way to get a full-fledged, scalable, visually-pleasing desktop database up and running. I like it.

But its Web capabilities are typical of applications that have added Web capabilities long after the fact. What you can do with Filemaker is “publish” a database on the Web, and allow Web-based updating and searching. It in effect turns the machine hosting the database into a simple server. But most of Filemaker’s capabilities are not available via the Web interface. And, the database only exists on its original site. All data remains there.

Native, full Web capabilities.

So, what’s a true Web app? I’d say it is an application whose native interface is Web-based, and where all or virtually all of its capabilities are available via a browser. Evernote is a good example.

There is a fuzzy line between “websites” and “Web applications”, as we have previously discussed. And in fact, some people consider virtually all powerful websites to be Web apps. This includes Amazon, Blogger, and Wikipedia, as well as countless lesser-known websites.

And, with respect to the deliberately narrow criteria we’re using here, these applications are indeed Web apps.

So, what characteristics do we see in applications that are powerful, and have native, complete Web interfaces? They are likely to store data persistently in a serverized database management system like MySQL, and present the user with web forms to fill in, and return to the user dynamic Web pages populated from the database. A website that we might be willing to label “Web 2.0” would be one that is highly responsive and manages large amounts of data.
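To make that pattern a little more concrete, here is a minimal sketch in Python. It is only an illustration under some assumptions: the standard library's sqlite3 module stands in for a server database like MySQL, and the notes table, save_note, and render_page are names I made up for the example, not part of any real application.

```python
import sqlite3
from html import escape

# Assumption: an in-memory SQLite database stands in for a server DBMS like MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")

def save_note(title, body):
    """Handle a submitted web form by storing its fields persistently."""
    conn.execute("INSERT INTO notes (title, body) VALUES (?, ?)", (title, body))
    conn.commit()

def render_page():
    """Build a dynamic HTML page populated from the database."""
    rows = conn.execute("SELECT title, body FROM notes ORDER BY id").fetchall()
    items = "\n".join(
        f"<li><b>{escape(title)}</b>: {escape(body)}</li>" for title, body in rows
    )
    return f"<html><body><ul>\n{items}\n</ul></body></html>"

# Simulate one form submission, then one page request.
save_note("Groceries", "milk & eggs")
page = render_page()
print(page)
```

A real Web app would put a web server and a form in front of these two functions, but the round trip is the same: form data goes into the database, and pages are generated from whatever the database holds at the moment of the request.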

We might call it Web 3.0 if it also manages large volumes of continuous data (like audio and video), and presents to the user a highly multimedia web interface. But these terms are vague, and drawing lines between them is to a certain degree misleading and a distraction.

Perhaps something that might be a truly Web 3.0 characteristic is that the application, rather than just delivering up video and audio, uses a combination of multiple forms of media, in concert, to interact with the user. We looked at SMIL, an XML language that allows the user to build presentations that coordinate multiple forms of media, such as images, sound, and video. The SMIL programmer can arrange media on the screen, and specify how the various pieces of media will be displayed over time.

Glide: the Web-based desktop.

But let’s look at one very, very aggressive attempt at a true Web 3.0 application. It’s called Glide, and you can get yourself a free account. This application does not support any sort of desktop-based version, and so you do have to be online to use it. It also needs a very fast Internet connection, because of the wide variety and high volume of data it allows you to manipulate.

What’s Glide? It is advertised as “the complete mobile desktop solution”, and it provides a complete, virtual, web-based computer. With it, you can edit photos, draw diagrams, store media files, send and receive email, manage a calendar, manage video, write documents, even build a website – in other words, do almost everything a non-programmer might want to do with a computer.

Its interface consists of three main windows. One is a virtual desktop, with various applications ready to use; another is a portal where the user can access the Web and develop websites; the third is a virtual hard drive, where media and files created by the various applications can be stored and accessed.

Is this the way of the future? It completely frees a user from having to buy, install, and maintain complex, expensive applications, although you still need a computer with a browser to run it. One drawback is that none of its apps, as near as I could tell, can compete with the dominant desktop applications. It is not Photoshop, it is not Dreamweaver, and it is not MS Office Outlook. But its apps are not trivial: they do the job just fine. And the entire interface is simple and visually pleasing.

There is also a way to sync your files on your desktop with the files on the Glide servers, and their documents and spreadsheets are apparently compatible (to some degree) with Microsoft’s Word and Excel. But they apparently are not planning on creating any sort of hybrid web/desktop based product. Glide’s goal is to move us all toward the Web and away from our desktops.

The Glide servers seemed fast enough to me, by the way. That’s the big question. Can it be as responsive as a desktop computer? Well, it’s as fast as my Vista machine… But slower than my iMac.

Give it a try.




April 13, 2009  3:27 AM

Mega Media Apps: A Huge Challenge for Web 3.0



Posted by: Roger King
3D animation, 3D modeling, blob data, codecs, continuous data, Maya, media applications, Video, video containers, Web 2.0, Web 3.0, web applications

What Are Web 2.0 and Web 3.0 Apps?

In our continuing series on Web 2.0/3.0 and Semantic Web technology, we’ve discussed one particularly impressive Web 2.0 app: Evernote. The challenge is to get the best of both worlds: the interactive performance of a desktop application, and the use-it-from-anywhere convenience of the Web. Many Web applications – such as Evernote – also ensure offline usability by providing both a desktop and webpage interface, and maintaining a local version of the database, which is periodically synched with the web-resident database.

But, as cleverly engineered as it is, and as useful as it is, Evernote is still a very simple application. What about big applications? What challenges face the developers of Web 3.0 applications, ones that will manipulate large databases of continuous data, and extra-large instances of blob data? (Video and sound are continuous; an image is blob data.)

Let’s consider one of the biggest media apps out there: Maya, the high-end 3D application that is widely used to make full length animated movies. (See http://autodesk.com for Maya.)

What’s the big problem? If an application like Maya was reengineered as a Web app along the lines of Evernote, would it be usable? Might it be intractable to be continuously moving complex animation data between the server and your client machine?

3D Geometry: Just How Big Is It?

Well, the problem is not the complex geometric models that an application like Maya must store and manipulate. 3D animation applications like Maya tend to support multiple ways of creating 3D shapes, and they do indeed tend to be very data-intensive. The first image at the bottom of this page shows a Maya screen with two spheres, one built with straight line geometry and one built with curved line geometry.

As it turns out, to make the straight line model smooth, you would need to use many more lines and vertices than I have in the model in the image. But if you think about it, the straight line model uses the geodesic dome approach; it builds a 3D sphere out of many 2D polygons – which are flat. The more polygons, the smoother the model. In the other model, we use curved lines, and so the model looks much smoother, even with not that much detail. But the mathematics are complex.

You can imagine that a dense scene, with a very large number of detailed, 3D models of these sorts would contain a lot of data. But no, that’s not the problem. These models can be uploaded and downloaded very quickly. They aren’t as big as you might imagine – because they are not continuous data. They are blobs, either binary or of code text, and are reasonably manageable.
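A bit of back-of-the-envelope arithmetic shows why. The sketch below is in Python and is not how Maya stores anything; it assumes a simple latitude/longitude sphere mesh, 4-byte floats for vertex positions, and 4-byte integers for face indices, and the function names are made up for the example.

```python
def uv_sphere_counts(segments, rings):
    """Vertex and face counts for a simple latitude/longitude sphere mesh."""
    vertices = segments * (rings - 1) + 2   # interior rings plus the two poles
    faces = segments * rings                # one face per segment in each ring band
    return vertices, faces

def mesh_bytes(vertices, faces, floats_per_vertex=3, indices_per_face=4):
    """Rough uncompressed size, assuming 4-byte floats and 4-byte indices."""
    return vertices * floats_per_vertex * 4 + faces * indices_per_face * 4

# Coarse, medium, and very smooth spheres.
for segments, rings in [(8, 4), (32, 16), (256, 128)]:
    v, f = uv_sphere_counts(segments, rings)
    size_kib = mesh_bytes(v, f) / 1024
    print(f"{segments}x{rings}: {v} vertices, {f} faces, about {size_kib:.1f} KiB")
```

Under these assumptions, even the smoothest of the three spheres comes in under a megabyte. More polygons do mean more data, but the growth is nothing like what happens with video.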

The Killer Problem: Video.

The problem? It’s what Maya creates at the end of the design process, when Maya renders a scene so we can watch it. It renders video. And video, whether you are looking at video shot with your home camera, or at video rendered by Maya, or video I create when I capture desktop videos on how to use Maya and post it for my animation students, well, it’s big. Really big.

Video is the killer. Video makes a lot of mega apps, and even very simple apps that happen to create video, not scale. We could manage a modest number of modest-sized video segments via a web interface, but not big chunks of video. To make videos even worse, we usually have to add a sound track.

So, the lesson is that many or most applications that create and/or edit video in any form face this challenge.

This is why we use video compression. First, you need a container, which is a way of bundling the huge series of still images that make up the video, with the sound, so that we can move it around as a single object. (Keep in mind that video often consists of at least 25 frames, or still images, per second – and that makes for big pieces of continuous data.) Popular containers for small scale projects (such as animations that will be marketed via CDs) are .mov and .avi. The first is the Apple Quicktime standard, and the second is due to Microsoft.

Once you have a container, you need a codec, which is a way of compressing and decompressing video, so that it isn’t so big when you move it over the Internet or store it on a small storage device. Codec is short for “coder-decoder” (or “compressor-decompressor”). It cannot be overstated how powerful a codec can be; I routinely turn gigabyte videos submitted by my students into less-than-100 megabyte videos. They can be uploaded to a website and then played, and at least in a small box on a web page, they look great.
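To see where the gigabytes come from, here is a small worked calculation in Python. The numbers are my own assumptions for illustration: a 640x480 frame, 3 bytes per pixel, 25 frames per second, no sound track, and a made-up 10:1 compression ratio. Real codecs and real footage vary a great deal.

```python
def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel=3):
    """Size of uncompressed video: one full still image for every frame."""
    return width * height * bytes_per_pixel * fps * seconds

# Assumed example: one minute of 640x480 video at 25 frames per second.
one_minute = raw_video_bytes(640, 480, 25, 60)
print(f"Uncompressed: {one_minute / 1e9:.2f} GB")

# Hypothetical compression ratio, chosen only to illustrate the effect.
ratio = 10
compressed = one_minute / ratio
print(f"After {ratio}:1 compression: {compressed / 1e6:.0f} MB")
```

A single minute at this modest resolution is already more than a gigabyte before compression, which is why the container and codec matter so much.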

But if you want quality, if you don’t want to lose detail, and in particular, if you are going to display a video on a large display (or at the movie theatre), you often cannot compress it enough.

That’s it. That’s the problem, and it’s one of the biggest challenges facing the makers of Web 3.0 apps, which are supposed to fluidly manipulate video segments.

A Far Bigger, Far More Universal Problem.

But perhaps the old video challenge, the one that is constantly shoved in the face of next-generation web app developers, is a distraction, something that draws us away from the real problem, the one that kills many media apps, even when they are totally desktop-based. What is it? Take a look at the animation designer’s interface to Maya, in the second image at the bottom of this page.

The problem is the size and complexity of these apps. They are made up of multiple complex windows. They have menus, palettes, and lots of little boxes that contain detailed information. Keep in mind that you only see one of the Maya windows in the image below, at the bottom of the page, and it is already too dense for a single screen, even a large one. Looking more closely at the window in this image, note that there are several places on it that contain drop down menus. Many of these menu items lead to other drop down menus. Even the main menu at the top is changed frequently during the process of creating an animation project. The designer’s GUI as a whole changes during the process of using the app.

It is very hard to fathom the incredible complexity of an interface like Maya’s until you use it. Professional video editing applications are typically simpler, but are still very complex, especially if the application supports special effects and the insertion of text. Even applications intended for the average Joe, like Photoshop Elements, are often horrifically complex.

The Bottom Line.

The problem that faces developers of all sorts of next-generation apps that must manipulate animation or sound or video or images, or that format complex documents for publication (like Adobe InDesign), or support the development of complex web pages (like Adobe Dreamweaver), is this: it is near-intractable or perhaps completely impossible to build an interface that explains to the user the process of using the application. Little wizards or chunks of documentation that contain “recipe” steps, don’t come within a thousand light-years of conveying how to use that app as a whole.

That’s it. True Web 3.0 applications would convey not just a vast, deeply embedded toolset, but the way the tools should be used. That’s the big challenge.

By the way, if you want to see a handful of videos made by my introductory animation students, go to my website at http://buzzking.squarespace.com and look at the right column, near the bottom of the page.


April 9, 2009  3:28 AM

The Dublin Core and the Metadata Object Description Schema: a look at namespaces



Posted by: Roger King
Dublin Core, MODS, namespaces, Semantic Web, the Metadata Object Description Schema

Namespaces.

As we have seen, namespaces are a core element of the emerging Semantic Web. By posting namespaces on the Web, we can share precise vocabularies that will hopefully enable us to automate the process of searching the Web.

Searching with today’s search engines, like Google, is an inaccurate and highly iterative process. Searches are based on matching our search words with words in the documents that have been found and indexed in advance by the search engine. It can be a very painstaking process: we have to click on the URLs that are returned, and for each one, make a decision as to whether or not the page is relevant. We typically end up changing our search words gradually, as we hone our search criteria.

Namespaces are intended as a key element of a long term goal to make search engines of the future smarter. If the terms we used to formulate our searches came from widely-adopted, standardized namespaces, there would be far less painstaking iteration involved in finding the right webpages. We would accompany our search requests with links to the namespaces that define terms we are using. And in fact, searching would become at least partly automatic, with the browser able to narrow the set of returned URLs by making use of its knowledge of namespaces.

The Dublin Core.

Let’s take a look at one of the most widely known namespaces. It’s called the Dublin Core. But, as it turns out, it proved too simple and has since been eclipsed, at least in part, by a somewhat more sophisticated namespace called the Metadata Object Description Schema.

To get started, here’s another way to look at a namespace: it is used to create metadata that describes some data source. In particular, the Dublin Core was engineered to provide metadata for resources that can be found on the Web, including text-based documents, images, and video, and in particular, web pages. Want to know what a web page is all about? Look at its metadata, specified with the Dublin Core standard.

By the way, the namespace is named after Dublin, Ohio, not the other Dublin. The namespace was the result of a workshop held in Dublin in 1995. It is not an XML extension, like SMIL, the language used for building multimedia presentations. However, the Dublin Core can be used to create metadata for documents that are specified with XML or one of its many extensions.

So, what is in the Dublin Core? Basically it is a set of terms such as Contributor, Publisher, and Language. Some of the terms generally refer to very simple values, like Contributor, which is a person or organization that contributed to a document.

To look at one of the potentially more complex Dublin Core terms, Coverage can describe the 3D (x,y,z) coordinates, or the time period, or the nation referenced by the document being described. It could refer to all of these. Note that this is not the time the document was written, or where it was written. Coverage refers specifically to the content of the document itself.

So, if we tell a smart browser of the future to find all documents that pertain to the year 1865, it will not return documents that were written in 1865, but are about the year 1012.
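Here is a tiny sketch in Python of the distinction. The records are invented, and the dictionary keys only mimic the Dublin Core terms Date and Coverage; this is not a real Dublin Core encoding, just an illustration of which field a smart search would consult.

```python
# Hypothetical records: keys mimic Dublin Core terms, values are invented.
documents = [
    {"title": "A history of the year 1012", "date": "1865", "coverage": "1012"},
    {"title": "Notes on the end of the Civil War", "date": "1995", "coverage": "1865"},
]

def about_year(docs, year):
    """Select documents whose content concerns the year (coverage),
    ignoring when they were written (date)."""
    return [d["title"] for d in docs if d["coverage"] == year]

matches = about_year(documents, "1865")
print(matches)
```

The first document was written in 1865 but is about 1012, so it is skipped; only the second one is returned.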

One drawback of the Dublin Core is that it is very loosely defined. So, it often fails in its true purpose: to provide precisely-defined terms that all of us can use, and where we can be confident they will be uniformly interpreted.

A More Sophisticated Standard: MODS.

A newer proposed standard, called the Metadata Object Description Schema, or MODS, is an XML language that has been very actively promoted as a successor to the Dublin Core. MODS has more terms, and more precisely-defined terms. Since it leverages the ability of XML to express nested or embedded structures, it can convey much more information than a list of Dublin Core terms can convey.

Here’s a little piece of MODS:

<name type="personal">
<namePart type="family">King</namePart>
<namePart type="given">Bugs</namePart>
</name>

This only gives a hint of the rich metadata that can be specified by using MODS. (The MODS website provides some far more detailed examples.)

Still, compare this to the Dublin Core Contributor term, which might have the value “Bugs King”. Is this a human name? Is it a pest control company?

But – even though it seems like an odd name, in the MODS example, we know that this is a person who goes by the name Bugs King.
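To show how a program could take advantage of that structure, here is a short sketch in Python using the standard library's xml.etree.ElementTree module. It parses the same fragment shown above. Note one assumption: a full MODS record would normally declare the MODS XML namespace, which this little fragment leaves out, so the lookups here use bare tag names.

```python
import xml.etree.ElementTree as ET

# The MODS fragment from this post (no namespace declared, as in the fragment).
mods_fragment = """
<name type="personal">
  <namePart type="family">King</namePart>
  <namePart type="given">Bugs</namePart>
</name>
"""

name = ET.fromstring(mods_fragment)

# Collect each name part under its declared type, e.g. "family" or "given".
parts = {part.get("type"): part.text for part in name.findall("namePart")}

# The type attribute tells us this is a person, not an organization.
if name.get("type") == "personal":
    print(f"Person: {parts['given']} {parts['family']}")
```

With the flat string “Bugs King” there is nothing for the program to ask; with the nested version, it can check the type and pull out the family name directly.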

Dublin Core might die and blow away – but it will always be recognized as a pivotal point in the development of the Semantic Web.



April 2, 2009  5:59 AM

Full Text searching: clever heuristics for managing large web-based document collections.



Posted by: Roger King
databases, documents, full text, full text searching, Multimedia, MySQL, SMIL, SQL Server, the Semantic Web, Web 2.0, Web 3.0, web applications, XML

There is an explosion of technology for supporting sophisticated forms of media on websites and in web applications. In our continuing series on advanced web applications (in particular, as they pertain to the Semantic Web and Web 2.0/3.0), we’ve looked at continuous media, in particular, video and multimedia presentations. But there is a very old form of continuous media, something that is perhaps the dominant media on the Web, and that’s text.

It’s becoming a very major issue in web development.

Text.

In this blog entry, we’ll be looking at a particular form of text, called “full text”.

But just what is text to begin with? It’s character-based data, anything we can read.

And what will we want to do with it in next-generation web applications? It’s important to note that more and more vast libraries of documents are being put online. Web applications need to provide far faster and more accurate searches of documents than what we can perform with Google.

Interestingly, a successful technology, called “full text retrieval”, is already in place in the relational database systems that underlie modern web applications. It’s there working for us, and we are likely to not be aware of how clever it is.

It’s also something that should be used much more heavily by web application developers.

Let’s step back and consider three different – and increasingly more sophisticated – ways of managing character data.

Atomic Character Attributes.

First, there is the traditional relational database approach, whereby data is stored as tables made of rows of atomic, fixed sized attributes. By atomic, we mean that each attribute has no internal structure. So, a table of insurance claims might have rows with the following attributes: Claims-ID (an integer), Amount (an integer), Medical_Problem (a fixed length character string), and Subscriber_Name (a fixed length character string). Using SQL, the universal database “query” language, we might look for all rows that contain the name “Fred Jones”. Or, we might search for all rows that have claim numbers that are between 110 and 115.

Essentially, this approach limits us to comparing small strings of data to each other or to fixed values. There are some common extensions that we find in relational databases, such as being able to ask the question to find all rows where the Medical_Problem is something like “broken leg”. Then if a row actually has the value “broken legs”, we would most likely see this row in our results.
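Here is what those three kinds of searches look like in practice, as a sketch in Python. It uses the standard library's sqlite3 module with an in-memory database as a stand-in for a server DBMS; the column names are lowercased versions of the attributes above, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (claims_id INTEGER, amount INTEGER, "
    "medical_problem TEXT, subscriber_name TEXT)"
)
# Invented sample rows.
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?)",
    [
        (109, 500, "sprained wrist", "Fred Jones"),
        (112, 1200, "broken legs", "Mary Smith"),
        (114, 800, "broken leg", "Fred Jones"),
    ],
)

# Exact comparison against a fixed value.
by_name = conn.execute(
    "SELECT claims_id FROM claims WHERE subscriber_name = ? ORDER BY claims_id",
    ("Fred Jones",),
).fetchall()

# Range comparison on an integer attribute.
in_range = conn.execute(
    "SELECT claims_id FROM claims WHERE claims_id BETWEEN 110 AND 115 ORDER BY claims_id"
).fetchall()

# LIKE with a trailing wildcard also matches "broken legs".
like_match = conn.execute(
    "SELECT claims_id FROM claims WHERE medical_problem LIKE ? ORDER BY claims_id",
    ("broken leg%",),
).fetchall()

print(by_name, in_range, like_match)
```

Notice that LIKE is still just pattern matching on characters. It catches “broken legs” only because the pattern happens to be a prefix of it.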

Full Text.

Second, there is the ability to search pieces of text according to their natural language (in this case, English) meaning. In this case, we consider the character data to have internal structure, and the values are not considered atomic. Often, these pieces of text are long and of variable length from one row to the next.

It is actually an extension – but a very dramatic one – of the like operator in SQL.

It is what we call “full text” management or retrieval, and modern relational database management systems like MySQL and Microsoft SQL Server support this. This was seen long ago as a critical extension to relational database technology. Thus, we might rename our Medical_Problem field to Doctor’s_Diagnosis, and allow free form English text in this attribute, as well as allowing the value to be quite long. Then we might search for all rows where the doctor describes “fractures of the lower limbs”. Notice that none of these words might actually appear in the attribute, which might simply refer to “broken legs”.

Natural Language Processing.

This capability would clearly be very powerful, if we could do it right. The problem is that to support it fully, we would need to use highly advanced natural language processing techniques, which are very time consuming to execute, especially on huge databases of large documents. The full text approach tries to simulate true natural language searching in a far less expensive way. The real thing, by the way, might not be all that accurate anyway. Natural language is naturally ambiguous and very subtle.

True natural language searching would be our third way of processing character-based data, by the way. It is not a fully developed technology. And importantly, we usually don’t need anything that fancy.

The Clever Compromise.

So, our middle option, full text searching, is what dominates today – and it is a surprisingly accurate, and efficient, technique that operates on a small set of heuristics. It can transform a dumb webpage where we can only search for small, fixed character strings, to a rich next-generation webpage that can effectively be searched according to its meaning. It allows us to manage very large text documents in web applications – and gets us surprisingly close to the semantic power of true natural language searching.

We’re not going to go into a lot of detail here, but here are some of the heuristics that are used in full text search. First, “stemming” and related techniques are used; they conjugate verbs, detect plurals of nouns, and remove prefixes and suffixes. Another technique is to use a “stop list” that lists words that should be ignored, like “the”. The system might also let us specify the “proximity” of words; this refers to how closely specific words should appear in a document. It can also be powerful to include a synonym checker. And the ability to allow for “wild cards”, in particular, letters that may vary in a passage without changing its meaning, can be quite useful. Dictionaries of technical words that pertain to specific domains (like medicine or law) are very useful. We might also provide a feedback capability, whereby users can train full text search engines to be more accurate.
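To give a feel for how a few of these heuristics combine, here is a toy sketch in Python. It is not how MySQL or SQL Server implement full text search. The stop list, the synonym table, and the plural-stripping rule are all tiny, made-up stand-ins for the real thing.

```python
import re

# Toy stand-ins: real systems use much larger lists and real stemmers.
STOP_WORDS = {"the", "of", "a", "an", "and", "in"}
SYNONYMS = {"fracture": "broken", "limb": "leg"}

def strip_plural(word):
    """Crude stand-in for stemming: drop a trailing 's' from longer words."""
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word

def normalize(text):
    """Reduce text to a set of comparable terms."""
    terms = set()
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in STOP_WORDS:
            continue                       # stop list: ignore filler words
        word = strip_plural(word)          # stemming (very simplified)
        terms.add(SYNONYMS.get(word, word))  # map synonyms to one term
    return terms

query_terms = normalize("fractures of the lower limbs")
stored_terms = normalize("broken legs")
print(query_terms, stored_terms)
print("match:", stored_terms <= query_terms)
```

With these toy rules, the doctor's “broken legs” and the query “fractures of the lower limbs” end up sharing the same two terms, even though they have no words in common.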

This clearly doesn’t come anywhere near true natural language processing – but it is fast. It will be a growing technology on the new web, with a lot of hidden development, making this heuristic-based technique more and more effective.

Indexing.

We should note that there is a significant up front cost in preparing a document for full text searching: we need to build an index with an entry for every (non-stop) word in the text. Then, when a query is executed, we can look for words in the document by searching the index, instead of searching the full text. If there were no index, the search would be extremely time-consuming.
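As a rough sketch of the idea, the index can be a dictionary that maps each non-stop word to the set of documents containing it. The three sample documents, the stop list, and the function names below are made up for illustration, and the sketch skips stemming to stay short.

```python
import re
from collections import defaultdict

STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}

def build_index(documents):
    """Map each non-stop word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every non-stop word in the query."""
    words = [w for w in re.findall(r"[a-z]+", query.lower()) if w not in STOP_WORDS]
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

documents = {
    1: "The history of the Semantic Web",
    2: "Searching the web with full text indexes",
    3: "A history of relational databases",
}
index = build_index(documents)
print(sorted(search(index, "history of the web")))
```

The up-front cost is the single pass over every document in build_index; after that, a query only touches the index entries for its own words and never rereads the text.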

The Future.

As more and more governmental, educational, medical, and other complex documents become available on the web, advanced full text searching will enable us to search vast databases in a tractable fashion. Even more clever full text retrieval engines will turn dumb, “gotta Google them” document portals into true Web 3.0 and Semantic Web applications.



March 26, 2009  11:31 PM

SQL and XML: declarative is exciting



Posted by: Roger King
namespaces, SMIL, XML

In the continuing series of blogs on the Semantic Web and other advanced web technology, we’ve looked at XML as a cornerstone of the technology that allows us to mark up data and, in combination with namespaces, create powerful tools for sharing the meaning – and not just the structure – of data. There’s something special about XML that is at the core of its truly amazing widespread adoption, and that explains its versatility as the language of choice for tagging data, no matter what the purpose.

What is it?

It’s that XML is “declarative”.

A declarative language is one that allows us to write programs that state what needs to be computed, not the order in which primitive operations need to be carried out in order to get the result. Java, C++, JavaScript, C#, Objective-C, ActionScript, PHP – none of these are declarative.

Some Non-Declarative Code.

Here’s some code:

for ( i = 0; i < 100; i++ )
    stuff[i] = stuff[i] + 1;

It says to start i at 0 and keep adding 1 to i until it reaches 100; for each value of i from 0 through 99, it adds 1 to the corresponding element of an array called stuff.

This manipulating-an-array program is the classic piece of non-declarative code. It doesn’t just say to add one to every element in an array, it also tells the order in which to do it. This extra information shouldn’t really be needed, but in non-declarative languages – known as “imperative” languages – it is frequently necessary.

Some Declarative Code.

Now, here is some declarative code. It’s SQL, the universal database language:

SELECT Firstname
FROM Clients
WHERE (Lastname = 'Smith') AND (City = 'Boulder') AND (Bday BETWEEN '2/10/1970' AND '2/10/1980')

Clients is a relational “table”, and Firstname, Lastname, City, and Bday are all “attributes” (or “columns”) of that table.

This piece of code gives us the first name of any client whose last name is Smith, and who is from Boulder, and was born between Feb 10 of 1970 and Feb 10 of 1980.

Notice that it tells the computer what data we want, and not the sequence of steps that must be carried out to return the value. We don’t know what order the rows in the table will be examined. We don’t know if the three conditions will all be checked at once, or if we will filter the table first by picking out all clients who are from Boulder.
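For readers who want to try this, here is a small Python sketch that runs essentially the same query against an in-memory SQLite database. The four sample rows are invented, and the sketch assumes birthdays are stored as ISO-style text (YYYY-MM-DD) so that BETWEEN compares them correctly as strings; other databases have true date types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Clients (Firstname TEXT, Lastname TEXT, City TEXT, Bday TEXT)")
conn.executemany(
    "INSERT INTO Clients VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Smith", "Boulder", "1975-06-01"),  # invented sample rows
        ("Bob", "Smith", "Denver", "1975-06-01"),
        ("Cal", "Smith", "Boulder", "1985-01-15"),
        ("Dee", "Jones", "Boulder", "1972-03-09"),
    ],
)

# We state which rows we want; the database decides how to find them.
rows = conn.execute(
    "SELECT Firstname FROM Clients "
    "WHERE Lastname = ? AND City = ? AND Bday BETWEEN ? AND ?",
    ("Smith", "Boulder", "1970-02-10", "1980-02-10"),
).fetchall()
first_names = [row[0] for row in rows]
print(first_names)
conn.close()
```

Nothing in the Python code loops over the table or orders the three tests; that work is left entirely to the database engine.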

XML is Declarative.

XML is a declarative language, too. Let’s look at it.

This is the XML from the previous posting of this blog.

<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true">
  <head>
    <meta name="title" content="Buzz's Video"/>
    <layout>
      <root-layout background-color="white" width="320" height="290"/>
      <region id="videoregion" top="0" left="0" width="320" height="290"/>
    </layout>
  </head>
  <body>
    <seq>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
    </seq>
  </body>
</smil>

An XML program consists of “elements” and “attributes”. Notice <head> and </head> form the bounds for an element, as do <body> and </body>. Also note that there is an element nested within <head> and it’s marked by <layout> and </layout>. The tags <seq> and </seq> denote an element inside <body>.

The other major construct in XML is called an attribute, and name and content are two attributes with values “title” and “Buzz’s Video”, respectively. Attributes are always simple character values, and therefore cannot be nested.
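One way to see the element and attribute structure is to let a program walk it. The sketch below uses Python's standard xml.etree.ElementTree module on a copy of the SMIL above; to keep it short, the copy leaves out the qt namespace declaration on the root element, and the outline function is just an illustrative helper.

```python
import xml.etree.ElementTree as ET

SMIL_TEXT = """
<smil>
  <head>
    <meta name="title" content="Buzz's Video"/>
    <layout>
      <root-layout background-color="white" width="320" height="290"/>
      <region id="videoregion" top="0" left="0" width="320" height="290"/>
    </layout>
  </head>
  <body>
    <seq>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
    </seq>
  </body>
</smil>
""".strip()

root = ET.fromstring(SMIL_TEXT)

def outline(element, depth=0):
    """Return one line per element: nesting as indentation, attributes as a dict."""
    lines = ["  " * depth + element.tag + " " + str(element.attrib)]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(root)))

# Elements nest; attributes are flat name/value pairs on a single element.
meta = root.find("head/meta")
print(meta.get("name"), "=", meta.get("content"))
```

The printed outline shows layout nested inside head and seq nested inside body, while each attribute appears as a plain string value.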

When this program is saved with the name buzz.smil, and then run by Quicktime, it will download a video from my website (a very nice piece of animation by a student named Jochen Wendel), and then play it twice in succession. See the previous blog for more of an explanation of how SMIL works. It also discusses the difference between XML and its extensions, such as SMIL.

Note that these tags are not part of XML itself; rather they are part of the namespace that has been defined for the SMIL extension of XML. This illustrates the power of XML: it can be used to define other languages.

To understand the XML above, all that is needed is a definition of the SMIL tags (the URL at the beginning of the code identifies the namespace for Quicktime’s extensions to them), and a program that knows how to interpret XML that contains these tags. In this case, the XML defines a layout for the screen and says that a video should be played twice, sequentially. Quicktime has been programmed to understand the elements and attributes of the SMIL XML language.

Going from “sequential” to “parallel”.

To make our point stronger, here’s a variation. Instead of playing the two videos sequentially, I am using the <par> and </par> tags that represent “parallel” in the SMIL namespace. I have also made the layout area twice as big, and broken it up into two regions. Now, the program plays the video twice, side-by-side, one in each region. At the bottom of this blog entry is what you should see if you save it as buzz.smil and run it with Quicktime. There is also a nice soundtrack.

<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true">
  <head>
    <meta name="title" content="Buzz's Video"/>
    <layout>
      <root-layout background-color="white" width="640" height="290"/>
      <region id="videoregion" top="0" left="0" width="320" height="290"/>
      <region id="videoregion2" top="0" left="320" width="320" height="290"/>
    </layout>
  </head>
  <body>
    <par>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion2"/>
    </par>
  </body>
</smil>

IMPORTANT.

Notice that this program, written in the SMIL extension of XML, is quite declarative: it says to create a layout, break it into two regions, and then place the animation (a .mov video) in both regions, in parallel. It does not say how to do this. The program doesn’t specify the sequence of steps that are needed to get the job done – rather, what the result should look like.
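To illustrate where the “how” ends up, here is a toy Python sketch of the kind of decision a player has to make when it meets seq or par. This is not how Quicktime works internally; the schedule function and the 10-second clip length are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

def schedule(container, clip_seconds):
    """Return (region, start_time) pairs for the <video> children of a <seq> or <par>."""
    starts = []
    clock = 0.0
    for video in container.findall("video"):
        starts.append((video.get("region"), clock))
        if container.tag == "seq":  # sequential: the next clip waits for this one
            clock += clip_seconds
        # for <par> the clock never advances, so every clip starts together
    return starts

seq = ET.fromstring('<seq><video region="videoregion"/><video region="videoregion"/></seq>')
par = ET.fromstring('<par><video region="videoregion"/><video region="videoregion2"/></par>')

print(schedule(seq, 10.0))
print(schedule(par, 10.0))
```

The SMIL author changes one tag; all of the step-by-step timing logic lives in the interpreter, not in the document.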

This makes XML programs far easier to read than programs in imperative languages, thus making the programs easier for a programmer to write, and easier for another programmer to read and perhaps change later on. This makes programs in XML far more likely to be written correctly and then used appropriately.

We’ll look at declarative languages again, in future entries of this blog.

An SMIL XML program that plays a video twice, in parallel.


March 21, 2009  3:17 AM

XML and its powerful children



Posted by: Roger King
namespaces, Quicktime, SMIL, the Semantic Web, XML

A key purpose of this blog is to provide a continuing examination of the Semantic Web – and certainly one of the most critical technologies to discuss is XML. Why is it so important?

First of all, just what is XML?

XML stands for eXtensible Markup Language, and the extensible part is the key to its power.

Markup Languages.

Let’s step back though and look at the markup part first. “Markup” refers to the process of embedding commands in data. HTML is a markup language. When a browser fetches a web page from a web server, it processes the text-based HTML “markups” that appear in the page in order to present the page to us.

HTML: a Markup Language.

Importantly, HTML is focused on the visual appearance of information. It controls the layout of web pages, including “controls” such as menus and buttons. It also allows us to link pages together. One of the biggest jobs of HTML is to tell the browser how to lay out pieces of text, such as the descriptions of books sold by Amazon.

HTML has a fixed set of legal tags. Here is a sample HTML file:

<html>
<body>
<h1> This is a heading </h1>
</body>
</html>

Notice that the tags come in pairs: an opening tag written with “<>” and a closing tag written with “</>”.

This HTML opens by telling us that it is an html file. Then it says there is a body to the file, and that there is a heading to be printed. This file will print the words “This is a heading”.

The important point, though, is that these tags – html, body, and h1 – are HTML-specific tags, and we cannot invent our own.

XML: a Far More Powerful Markup Language.

Now, let’s see what happens when we can invent our own tags.

XML is also a markup language. It was developed as a way to embed markups in data, so that the meaning of information can be communicated. In order to do this, XML allows us to do something we cannot do with HTML: we can specify our own “tags” so that we can add a lot more expressive power to our markups. There are two particularly critical aspects of XML tags.

The first is that there are two main sorts of tags: “elements” and “attributes”. Elements can have complex structure, and in fact, we can embed elements inside elements. Attributes are simple values and have no internal structure.

The second critical thing is that we can use words taken from shared namespaces as values in tags in XML. This gives XML the power of shared, detailed terminologies that are available globally via the web.

The real power of XML is this: we can produce our own extensions of XML by defining our own tags. Each of these extensions is itself a complete markup language.

This is why it is such a critical part of Semantic Web technology: we can use it to capture the meaning (or “semantics“) of data so that it can be processed automatically. HTML controls the way a page is displayed only, and we have to use our eyes and minds to interactively interpret this information. But XML can be interpreted by a program, thus allowing powerful, automatic searching of the web.
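Here is a small Python sketch of what “interpreted by a program” can look like. The catalog, book, title, author, and price tags, along with the ISBNs and prices, are invented for this example; they are not part of any standard.

```python
import xml.etree.ElementTree as ET

# A made-up XML vocabulary for a book catalog; every tag name here is our own invention.
CATALOG = """
<catalog>
  <book isbn="0000000001">
    <title>Ulysses</title>
    <author>James Joyce</author>
    <price currency="USD">12.50</price>
  </book>
  <book isbn="0000000002">
    <title>Dubliners</title>
    <author>James Joyce</author>
    <price currency="USD">8.00</price>
  </book>
</catalog>
""".strip()

catalog = ET.fromstring(CATALOG)

# Because the tags say what each value means, a program can answer a question
# like "which books cost less than 10 dollars?" without a human reading the page.
cheap_titles = [
    book.findtext("title")
    for book in catalog.findall("book")
    if float(book.findtext("price")) < 10.00
]
print(cheap_titles)
```

The same facts laid out in HTML would tell a browser how to display them, but would give a program no reliable way to know which number is the price.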

An Example of an XML Extension.

Let’s look at an XML-based language, in particular, at its use of elements, attributes, and values from a shared namespace.

Below is a piece of code written in an XML extended language called SMIL. SMIL allows us to create multimedia presentations, with various pieces of media laid out on the display, as well as being sequenced in time.  (SMIL stands for Synchronized Multimedia Integration Language.)

First, let’s start with the core of a SMIL program:

<smil>
<head>
<layout>

… here is where we put commands that control the visual layout of the page we are constructing with SMIL …

</layout>
</head>
<body>

… this is where we put the core of our SMIL program, the part that specifies the multimedia presentation that is to appear in the page …

</body>

</smil>

Here is the entire program, fleshed out:

<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true">
  <head>
    <meta name="title" content="Buzz's Video"/>
    <layout>
      <root-layout background-color="white" width="320" height="290"/>
      <region id="videoregion" top="0" left="0" width="320" height="290"/>
    </layout>
  </head>
  <body>
    <seq>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
      <video src="http://files.me.com/kingbuzz/radljq.mov" region="videoregion"/>
    </seq>
  </body>
</smil>

We don’t need to worry about the specifics of this code. The values of attributes are in quotes, and elements are marked by <> and </> tags. So, background-color is an attribute, and video is an element.

Let’s look at the beginning of this program:

<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true">

This code declares a namespace. That’s what xmlns stands for, by the way: XML namespace – i.e., a named set of attribute and element tags, identified by a URL. Strictly speaking, the namespace declared here, under the prefix qt, is the one Apple defined for its Quicktime extensions to SMIL, which is where the autoplay and time-slider attributes come from; tags such as layout, region, seq, and video belong to SMIL itself. Together, the smil root element and this declaration tell Quicktime, which can play SMIL files, how to interpret the file.
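One way to see what the xmlns:qt declaration does is to hand the opening tag to a parser. In the Python sketch below, the standard xml.etree.ElementTree module replaces the qt: prefix with the full namespace URL, so each attribute name is qualified by the namespace it belongs to. The empty smil element is a cut-down stand-in for the full program.

```python
import xml.etree.ElementTree as ET

QT_NS = "http://www.apple.com/quicktime/resources/smilextensions"

root = ET.fromstring(
    '<smil xmlns:qt="' + QT_NS + '" qt:autoplay="true" qt:time-slider="true"></smil>'
)

# ElementTree writes a namespaced name as {namespace-url}local-name.
print(root.attrib)

autoplay = root.get("{" + QT_NS + "}autoplay")
time_slider = root.get("{" + QT_NS + "}time-slider")
print(autoplay, time_slider)
```

The prefix qt is only shorthand inside this one file; it is the URL that identifies the namespace, so two documents can pick different prefixes and still mean the same attributes.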

To see this, do this: Download Quicktime from Apple, if you don’t have it. Then put the above program in a file called buzz.smil. Then open buzz.smil with Quicktime.

Quicktime will read the file, recognize the SMIL tags and the namespace declared at the top, and use them to interpret the rest of the code. This will direct it to download a video from my site – an excellent piece of animation built by one of my Intro to 3D Animation students, named Jochen Wendel. And in fact, Quicktime will play it twice – the <seq> </seq> tags mean that the elements inside them are played one after another, and there are two video elements that point at the same file.

The Exciting Part.

Do you see what happened? We used a predefined namespace belonging to the SMIL extension of XML to write a program that can find a video, download it, and play it twice!

Why do we care?  It’s not that building a language to play various pieces of media, like Jochen’s animation, is a big deal in itself.  It’s that XML is extremely versatile.  By defining a set of tags and then sharing them, we can embed within information the means for interpreting it – and thereby create an endless array of powerful languages.

This is very important. XML and its powerful children (such as SMIL) are changing the web in a big way.

There’s something  else, too, something that is equally important.  XML is declarative. We’ll look at this in another blog soon, but essentially it means that an XML language like SMIL is easier to read than imperative code, like Java or C.  Look at my SMIL program, and then look at a Java program.  Which is easier to understand?  We’ll get to this.


March 15, 2009  11:37 AM

A look at a Web 2.0 App



Posted by: Roger King
evernote, note-taking, notebooks, Rich Web Apps, the Semantic Web, Web 2.0, Web3.0

It’s called Evernote and I am a very heavy user. It guides my every workday.

In earlier blog entries in this series on the Semantic Web and Web 2.0/3.0, we’ve said that the primary goal of Web 2.0 developers is to build web apps that perform like desktop apps. Let’s check this definition against Evernote.

It’s a note taking program, but not your traditional desktop note program. Two paradigms of note-taking have traditionally been very popular: build-a-notebook and file-it-away. The first gives you a virtual notebook, with a cover, a title, and a table of contents. Typically the contents are broken into sections, and the sections into pages. Each page might be straight text or indented outlines. The second approach gives the user a set of conceptual folders (and perhaps subfolders), with each one stuffed with notes that might be text or indented outlines. Both approaches might support “sticky” notes, audio notes, video notes, and/or image notes.

The web application concept is making heavy inroads, though. Two of the build-a-notebook programs I use allow the user to export them as webpages so they can be manipulated remotely. This of course means that the machine hosting the notes must be exposed to the Internet as a web server. More practically, it means that the notes can be shared only within a local area network or a closed intranet of some sort.

But Evernote goes a big step further. It is a true web app, with your notes stored on their server. The monthly fee is very modest, and is based on an upload allowance – a very generous one, unless your notes are packed with big chunks of media. My notes consist almost entirely of text and web pages that I find of interest. I have their cheapest paid subscription (there is also a free one), and I only use a small fraction of my allowance. As of this moment (and yes, I am writing this on Evernote) I have 1,106 notes.

The model that Evernote uses is a variation of the file-it-away paradigm. There is a desktop client (available for Macs and Microsoft Windows machines) that presents the user with a column of conceptual folders, a listing of all notes in a given folder, and a viewing space for some specific note in the folder currently of interest. There is a sync protocol that keeps all notes up to date on the Evernote server. The web interface to the server isn’t as elegant as the desktop application, and I don’t use it much. You can also keep your notes locally on multiple machines. There is even an iPhone/iPod Touch version of the desktop app; I use it on my iPod Touch. There is also a mobile phone app, but I have not tried it. All of these applications are available to you for the one monthly fee.

I’ve done some experiments, and the syncing protocol works quite well. It creates a special folder in your Evernote desktop app if it finds a conflict it cannot resolve. As a result, I have never lost a single note or been forced to use an older version of any note. I have had, however, to dig things out of conflict folders.

So, why is it a Web 2.0 app? This is where I have to admit that the definition of Web 2.0 is, let’s say, very flexible. Yes, Evernote is fast, and the syncing never slows me down; I can create a note, click the sync button on Evernote on my iMac, and by the time I’ve rolled my chair over to my Vista machine, the new note is there. Web pages can upload far more slowly than straight text, admittedly. One more thing: Evernote allows you to put tags on your notes. And of course, you can search by those tags. Oh, and there is a very convenient web page clipper that I have used on Safari, Internet Explorer, and Firefox; it will tuck a web page away with a couple of clicks.

But in truth, it’s a Web 2.0 app, not because it is a web app whose performance approaches that of a desktop app, but largely because it is such a great blend of a desktop and web application. Rather than building a web-only app as an alternative to a desktop app, and then engineering the thing to be as fast as possible for uploading, downloading, and searching notes, they’ve given us the rapid access rate of a desktop, along with the mobility of a web app, and all the pieces seem to work together just right. It’s a very smart, very modern app.

I keep a heavy fraction of my notes on it, including my to-do lists, and whether I am in my home office, my university office, a university computer lab, Barnes and Noble, or wherever, my notes are always available.

Try it.



March 11, 2009  8:24 PM

Web services: part of the Web 2.0 & Semantic Web picture



Posted by: Roger King
Multimedia, rich internet apps, Rich Web Apps, the Semantic Web, Web 2.0, Web 3.0, Web development, web services

This is the fifth in a series of blogs about the Semantic Web and Web 2.0/3.0. While the sequence of blog posts tells a continuous story, each blog should be fully informative if read out of sequence.

So far, we’ve discussed the Semantic Web, which is an attempt at automating the process of searching the web and integrating the results, and Web 2.0/3.0, which is largely oriented toward making media-intensive web applications highly responsive. (We noted in an earlier blog that Web 3.0 is an extension of Web 2.0, and we will look at this transition in a future blog.)

But there’s a third term that is often thrown into the mix: web services. What are they?

A web service is a web application that is not accessed interactively by a human, but rather by a program.

To make use of a website, we load a URL into a browser and visit the site. Once we’re there, we might – even if we don’t realize it – be operating a very sophisticated application, like Amazon. We can search their inventory – which sits inside a very large database – via Amazon’s search form.

But there’s something else we can do with Amazon. We can access their inventory via a web service. Or more precisely, we can build programs that can access their inventory by communicating with programs (called web services) that they have provided. Our programs and their programs talk to each other directly. These services can be used to do things that would usually be very time-consuming, and in fact often intractable, if performed with a browser. Their web services also allow third party vendors to post their stock on Amazon, and in return for a fee, let Amazon sell and ship their products. Thus, independent vendors can easily make themselves an extension of Amazon, something that works so smoothly that if we don’t look carefully, we might not realize we are buying something that is not directly marketed by Amazon.

The way a web service works is by the provider of the service (such as Amazon) making the interface to the software that implements the service publicly accessible over the Internet. Such an interface is called an Application Programming Interface, or API. This way, anyone who wants to write a program that will access the service over the web knows exactly how to write their program to talk to the web service. These programs that access web services are called “client” applications.

There is a wide class of web services available on the Internet, and many of them provide APIs that allow programmers to write software that can access vast databases of such things as news and real estate information. Many web services also are available via a website, for users who want to use the service interactively. And many client applications are really just doing the same thing a browser might do when accessing a website, except that the client is likely to be a far more specialized application and it runs as a desktop application on your machine. More importantly, that client program might be able to do things that your browser cannot do.

For example, there is a web service called MusicBrainz; it provides information about music, not the music itself. It can be accessed via an API. There is also a website, MusicBrainz.org, where you can search the database interactively. The API might be accessed by a CD player application; it can communicate with MusicBrainz (without you knowing it) to download information about whatever CD you happen to be listening to on your desktop. It might enable your CD player to tell you the artist’s name and variations of that name, the release date, the catalog number, etc.
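To give a feel for what a client application does, here is a minimal Python sketch. The endpoint URL, the query parameters, and the shape of the JSON response are all hypothetical; a real service such as MusicBrainz documents its own API, which should be consulted instead. The sketch builds the request and parses a canned response rather than touching the network.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.org/artist"

def build_artist_query(name):
    """Build the request URL a client might send to look up an artist by name."""
    return BASE_URL + "?" + urlencode({"query": name, "fmt": "json"})

def artist_names(response_text):
    """Pull the artist names out of a JSON response body."""
    return [artist["name"] for artist in json.loads(response_text).get("artists", [])]

url = build_artist_query("Miles Davis")
print(url)

# A made-up response standing in for what such a service might send back.
sample_response = '{"artists": [{"name": "Miles Davis"}, {"name": "Miles Davis Quintet"}]}'
print(artist_names(sample_response))
```

In a real client, the URL would be fetched over HTTP and the response fed to artist_names; no browser and no human are involved at any step.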

Since part of the idea is that we don’t have to directly interact with web services by using a browser, their explosive growth has been very quiet. Many websites are powered by input they get by using web service APIs. These second-hand websites are often called “portals”, and many portals integrate information from a number of sources and give you access to information that would otherwise be very tedious to find on your own. Web services are thus a critical building block for many of the multimedia, highly interactive websites that constitute much of the Web 2.0 effort.

In fact, they underscore the difficulty in making a sharp distinction between the Semantic Web and Web 2.0/3.0. This is because both of them depend highly on automating the movement of information around the Internet. A Web 2.0 website (often called a web application because it provides fast access to complex information, in particular, sound, images, and/or video) cannot answer your search request quickly unless it has ongoing, rapid access to underlying streams of information on the web. But this capability, of providing us with information integrated from multiple websites, is actually a cornerstone of the emerging Semantic Web.

The difference is that the Semantic Web will (hopefully) someday put a tremendous amount of smarts into web services, and allow us to locate, transform, and integrate information in extremely complex ways. The Semantic Web, in this sense, can be viewed as an extremely aggressive extension of the Web 2.0/3.0 effort.

So there we have it. Web services, in their hidden way, are rapidly evolving the web into something incredibly powerful.


March 5, 2009  5:05 AM

Multimedia, what is it? Why do we care?



Posted by: Roger King
Multimedia, SMIL, Text, the Semantic Web, Video, Web 2.0, Web 3.0, Web development, XML

This is the fourth in a series of blogs on the Semantic Web and Web 2.0/3.0.

To get us going here, just what is “multimedia”? At one level, it simply refers to applications that manipulate, store, and/or present multiple kinds of media, such as text, video, relational data, sound, animation, etc. More pragmatically, it refers to the introduction of blob and continuous forms of data into applications that traditionally manage simple data, like character strings and numbers. In its most aggressive form, multimedia refers to the sophisticated integration of traditional, blob, and continuous data into integrated data forms that convey their own semantics.

A quick note: Blob data is data that is stored in a semantics-less fashion, usually as simple binary or character data. This could be almost any sort of data, such as images, video, sound, or natural language – but the key element is that the language or system being used to manipulate it doesn’t have an appropriate, specific data type. Blob data is often large, of variable size, and usually requires a sophisticated, outside application to interpret and present it. It is the default, catch-all way to store advanced forms of data in relational database management systems.
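As a small illustration, here is a Python sketch that stores some bytes in a BLOB column of an in-memory SQLite database. The table, the file name, and the fake image bytes are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Media (name TEXT, content BLOB)")

# Stand-in bytes; a real application would read these from an image file.
fake_image = bytes([0x89, 0x50, 0x4E, 0x47]) + b"...pixel data would go here..."
conn.execute("INSERT INTO Media VALUES (?, ?)", ("logo.png", fake_image))

stored = conn.execute(
    "SELECT content FROM Media WHERE name = ?", ("logo.png",)
).fetchone()[0]

# The database hands back the same bytes and knows nothing about what they depict.
print(type(stored).__name__, stored == fake_image)
conn.close()
```

The database can store and return the blob, but it cannot answer a question about what the image shows; that takes an outside application that understands the format.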

Another quick note: Continuous data is data that has a temporal aspect or can be broken down into segments that have their own identity. The visual part of video can be broken into clips; in fact, it can be broken all the way down to individual pixel-based images. Sound can be cut into pieces. James Joyce’s Ulysses is a big piece of continuous textual data. As with blob data, we typically need an outside application to interpret it. (Even the most complex application, a human, generally has trouble doing this with Ulysses.)

Back to the Semantic Web, Web 2.0/3.0, and multimedia.

In a previous blog, we tried to define these two terms and explain why they are very different concepts. The Semantic Web is an attempt to automate the searching of the web and the integration of data collected on the web; the idea is to greatly ease the painful interactive nature of using a search engine like Google. Web 2.0/3.0 (and no, there is no sharp distinction between the two) are largely about performance, of making web applications as responsive as possible, potentially as responsive as desktop applications.

But they share one common goal: effectively managing advanced forms of media. From the Semantic Web perspective, how can we search things like sound, images, video, and natural language in a semantically-meaningful way? We use sophisticated tags and image/sound processing to do this, but it is only a small step toward a solution.

From a Web 2.0/3.0 perspective, how can we deliver up such forms of media in a way that is highly responsive? Video streaming on the web is a huge challenge, for example. Or, how can we interact with video in a responsive way, in such things as games and digital libraries?

The web, in fact, is innately multimedia: we take images, icons, links, text, video, sound, and various user controls like buttons and menus, and put them together in highly sophisticated ways. And behind these web pages, databases often sit, populating dynamic pages with information in response to user requests – this data is virtually invisible to search engines. This is what makes the Semantic Web in particular such an incredible challenge. How can we ever hope to search the web automatically?

There are modest advancements that have been made. One example is something called the Synchronized Multimedia Integration Language (SMIL, pronounced like the facial phenomenon). It is an XML extension that supports basic constructs to glue multiple forms of media together in two dimensions and in temporal sequences. Using XML elements and attributes (the basic constructs of XML), we can create multimedia presentations in a precise, unambiguous way.

SMIL presentations can be processed automatically. This is very significant.

So, multimedia: it’s at the core of both the Semantic Web and Web 2.0/3.0. It is one of the basic motivations for their existence.

