Multimedia archives - Buzz’s Blog: On Web 3.0 and the Semantic Web


Oct 26 2009   8:58PM GMT

The imposing heterogeneity of media applications



Posted by: Roger “Buzz” King
3D animation, 3D modeling, advanced Web apps, automating Web searches, continuous data, media applications, Multimedia, the Metadata Object Description Schema, Video, video containers, Web 3.0, Web development, Web development frameworks, XML Schema

This blog is dedicated to the discussion of emerging web technologies. Today, we look at the rapidly growing world of media applications, and their impact on the Semantic Web.

The problem of searching for media assets.

We’ve already looked at advanced media (video, audio, and animation data) in previous blog postings, and in particular at the subtle and complex nature of media asset semantics. We’ve seen that interpreting a piece of video, for example, is far, far more difficult than interpreting an integer or character field. Since the goal of the Semantic Web effort is to make the searching of the web highly automated, advanced media is becoming a huge and critical research and development focus for the builders of next-generation web development applications.

Just how do we provide an environment where media assets can be searched in a mostly automatic fashion, so that a human does not have to painfully paw through hundreds or thousands (or millions) of video chunks to find the right one? We’ve looked at emerging technologies for marking up advanced media information, and for making it usable in a variety of web applications. We’ve also looked at the dramatic challenge presented by mega apps to would-be users; the interfaces to these applications are truly massive and cannot present to the user the way in which they are meant to be used.

The problem of proprietary formats.

One specific, and very difficult, problem is the massive heterogeneity, not just of media formats, compression technologies, and container technologies, but of the applications themselves. If we are going to automate the searching of complex modeling, video, audio, and other media assets, we’re going to have to address a key question: since many media apps make use of their own proprietary data formats, how are we going to provide automated ways of searching media assets that are stored in these formats?

The problem of highly imperfect generic formats.

There are indeed many existing, as well as soon-to-emerge, standards for importing and exporting data between powerful media applications, but transformations in and out of these formats are often “lossy”, in that information is lost or changed. In fact, locating and downloading assets that are in supposedly generic form is often very frustrating, because these assets end up not performing well. They can be difficult to edit and reuse. 3D animation models regularly blow up when animators try to import them into animation applications and then manipulate them. A hawk may look like a hawk until you try to render it with its wings flapping, and suddenly it’s a blob of geometric garbage.

One possible direction.

So, what do we do about the fact that many media assets must be manipulated by the original applications that created them? How can we facilitate reuse? It’s extremely unrealistic to expect users to master perhaps dozens of video or audio or animation applications. Filtering assets according to their file extensions is a good idea, and it is a well established practice.

But what we really need is a globally-known site that either literally or conceptually centralizes the massive network of import/export relationships, along with information about the relative success of these mappings. Are they ever lossy? If so, can they be fixed? What series of applications might we want an asset to be imported/exported through so that in the end it is in a usable format, given the applications that the user owns and has mastered?
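
To make the idea of a centralized import/export network a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the format names, the loss estimates, and the function name best_conversion_path are all invented, and no such registry exists today. It treats each import/export mapping as an edge in a graph, labels the edge with an estimated fraction of information lost, and searches for the chain of conversions that retains the most information.

```python
import heapq

# Hypothetical registry: (source format, target format) -> estimated fraction
# of information lost by that conversion. Names and numbers are invented.
CONVERSIONS = {
    ("maya_ma", "fbx"): 0.10,
    ("fbx", "blend"): 0.05,
    ("maya_ma", "obj"): 0.40,
    ("obj", "blend"): 0.02,
    ("fbx", "obj"): 0.30,
}


def best_conversion_path(source, target, conversions=CONVERSIONS):
    """Return (path, retained_fraction) for the least lossy conversion chain.

    A Dijkstra-style best-first search that maximizes the product of
    (1 - loss) along the chain. Returns (None, 0.0) if no chain exists.
    """
    # Adjacency list: format -> [(next format, fraction retained), ...]
    graph = {}
    for (src, dst), loss in conversions.items():
        graph.setdefault(src, []).append((dst, 1.0 - loss))

    # Heap entries are (-retained, format, path); negating turns the
    # min-heap into a "most retained first" queue.
    heap = [(-1.0, source, [source])]
    done = set()
    while heap:
        neg_retained, fmt, path = heapq.heappop(heap)
        if fmt == target:
            return path, -neg_retained
        if fmt in done:
            continue
        done.add(fmt)
        for nxt, keep in graph.get(fmt, []):
            if nxt not in done:
                heapq.heappush(heap, (neg_retained * keep, nxt, path + [nxt]))
    return None, 0.0


path, kept = best_conversion_path("maya_ma", "blend")
print(path, round(kept, 3))
```

A real registry would also have to restrict the search to the applications that the user actually owns and has mastered, which here would simply mean dropping the edges that require other applications before searching.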

There is much to be done. Right now, searching for and reusing media assets is a painstaking, trial-and-error-prone process.

Oct 18 2009   10:37PM GMT

Personal Information Management Applications and Web 3.0



Posted by: Roger “Buzz” King
advanced Web apps, databases, information, media applications, Multimedia, note-taking, notebooks, rich internet apps, tagging, Web 2.0, Web 3.0, web applications

This blog is devoted to the discussion of Semantic Web and Web 2.0/3.0 technology.

Managing personal and small group information.

When it comes to so-called Web 2.0 and 3.0 technology, one of the fastest-growing marketplaces is the one for applications that manage information for individuals and small groups. Looking only at applications developed for Macs, we see an array of information management technologies.

Notebooks.

One of the most popular formats for managing information uses the paradigm of a notebook. The user can create a notebook, often selecting from multiple canned formats, such as a diary, class notes, or a novel, complete perhaps with a notebook cover and a spiral wire down the left side. The application creates a table of contents, and users can create sections and pages - and stuff virtually any kind of information on each page. Two very good examples of this approach are NoteShare and Notebook.

Interestingly, and perhaps because many of the applications in this category have been around for a number of years, these tend to not be true web applications. Often you can share notebooks, including full read/write access, via a URL and a simple browser interface, and you can publish a notebook at a URL. But the products are primarily for single-user, desktop use.

A good example of a notebook application that is a true web application is Zoho Notebook. (Zoho actually provides a large set of web based applications, of which the note program is just one.)

Buckets.

The other very popular note format uses the bucket or folder approach. The application may or may not support the nesting of these buckets and/or the creation of conceptual buckets, so that a given note can exist in more than one bucket. Two very good applications that use this approach are SOHO Notes and Yojimbo. These two applications are desktop-based, although most applications in this category support the syncing of notes over multiple machines, using the Apple web-syncing technology.

A hybrid desktop/web application is Evernote, which has elegant desktop applications for Windows machines, Macs, and a variety of handhelds and cell phones. It also has a very effective web interface. The user can sync multiple Evernote desktop instances via Evernote’s web server. Users can thus avoid ever using the web interface.

Outlines.

One specialized sort of information management application involves the creation of embedded outlines and bulleted lists. These applications, such as OmniOutliner, actually provide a full notebook functionality as well. OmniOutliner notebooks can be published on the web, but it is very definitely a desktop application.

Task lists.

An even more specialized class of information management applications supports To-Do lists. Great examples are Zenbe Lists (they also provide integrated email and collaborative software) and rememberthemilk.com. These are web applications.

Photos and video.

There is a rapidly growing number of applications that allow users to collect, sort, tag, edit, and share photographs and video. Apple’s iPhoto is a great example. It is very much a desktop app, although applications in this class typically support the publication of images and video on the web, and sometimes even read/write access via the web.

Stories, scripts, novels, and storyboards.

There are a number of highly specialized applications that support the development of fiction, including Final Draft and Montage (scripts), Scrivener and StoryMill (fiction prose), and Toon Boom storyboard (which is actually an impressive drawing program). Again, users can often publish to the web. Interestingly, many of these applications can easily be used as full blown, generic note applications, and can manage many forms of media.

Diary Applications.

Perhaps the most popular diary application on Macs is MacJournal (by the Montage and StoryMill folks). An interesting twist is that it is also an excellent blogging program. I use it to write this blog. This is, of course, one of the most widely used vehicles for sharing information on the web, and you can expect other sorts of personal information management systems to have blogging capabilities added to them.

Small, forms-based database management systems.

These applications are desktop apps. Apple’s Bento is a very good example. It actually is a sort of hybrid database/spreadsheet application. The most recent release allows multiple instances of Bento to share databases running on computers on a shared network.

Mind-Mapping.

The “circles and lines” applications have become highly specialized. The most well known one is MindManager, and there are versions for Windows machines and Macs. These are desktop apps. The vendor, MindJet, recently introduced both web interfaces for sharing and updating desktop mind maps and a web-based application that has a fresh, smooth interface and provides team collaboration tools. Many forms of media can be placed in MindManager, including data from a wide variety of relational database management systems.

Screen and audio capture.

There are a number of applications that allow users to capture desktop video, along with audio voice-overs. Camtasia (which has Windows and Mac products) and Screenium are popular products.

These applications are, in a way, successors to slide applications like Microsoft PowerPoint and Apple Keynote. More and more presentations are being engineered with screen capture and audio applications, and these applications often support text and image data, as well as the insertion of video capture of the speaker. Sometimes, PowerPoint slides can be imported.

Conferencing apps.

There are several applications that provide hybrid desktop/browser live communication, including video, sound, and collaborative white-boarding. The best known one is probably Cisco WebEx, which comes in varieties for Macs and Windows machines. Skype supports a similar, limited product - which is free. One of the nice things about these products is that they come with their own voice lines. Other products, like Adobe ConnectNow, require the use of a cell phone to carry voice. With most of these products, a conference can be recorded for later use.

Finally…

Importantly, we note that in this rapidly-exploding marketplace, the borders between these various categories are being broken down, and applications often support a number of these capabilities at once. A good example is Curio, a desktop application that supports notes, lists, video, audio, white-boarding, mind-mapping, and limited web publishing.


Oct 11 2009   11:07PM GMT

Making information management scale: leveraging metadata on the new Web



Posted by: Roger “Buzz” King
3D modeling, automating Web searches, databases, DB2, information, Multimedia, MySQL, Oracle, PostgreSQL, RDF, Semantic Web, Video, Web 3.0, Web development frameworks, Web3.0

Previous postings of this blog.

This blog is dedicated to advanced Web development tools and concepts. Previous blog postings have focused on the emerging Semantic Web, which promises to make the Web radically easier to search and to greatly enhance the value of the vast sea of currently-disconnected information spread across the Web. We have also looked at Web 3.0 efforts, which promise to make multimedia websites highly usable and capable of conveying far more information than the current generation of websites. Previous postings describe the breadth and depth of cutting edge Web technology.

Metadata: making that ratio small.

Here’s something that’s very important: Much of the ongoing research and development that is loosely categorized as Semantic Web and Web 3.0 efforts is focused on a specific technical goal, one that has been at the core of information management technology since the mainframe era that was epitomized by the IBM 360 series. That goal is to leverage metadata as much as possible.

It’s our best weapon against the truly staggering amount of information on the Web. This includes traditional text-based and numeric data, as well as books, medical advice, photographs, entertainment and training videos, music and recorded books, investment information, educational materials, scientific materials, e-government information, etc., etc. How can we possibly organize information and then search it in a way that scales? The Web is far from a closed world. In traditional data processing environments like banking, insurance, and credit card processing, we could get our arms around all of the data, as vast as it may have seemed. But the world of information today is an open world, effectively infinite in size.

Very informally, if you look at the size of the metadata divided by the size of the data itself, the smaller that fraction the better. In traditional relational databases (built with database management systems, such as Oracle, MS SQL Server, MySQL, PostgreSQL, or DB2), the extreme focus on minimizing this ratio has enabled the fast processing of extremely large volumes of data. The tradeoff is that the table definitions (or the “schema”), which form the heart of the metadata, are very, very simplistic.

The old days: relational database schemas.

An insurance claim may be defined as a table with such columns as Subscriber_Name, Medical_Provider, etc., and thus, may consist of little or no more than a series of simple character and numeric fields. But if we need to process fifty thousand of them tonight, we must be able to bring many such table rows into memory at once, and quickly move through them. The database world was an extension of the paper world: a row in an insurance claim table was effectively an electronic successor to the traditional claim form.

Today: a far more challenging problem.

But on the new Web, information can be far more complex in nature, making the metadata to data ratio far larger. We’ve looked at some of the emerging technology and technical trends for embedding metadata in advanced forms of data (and for processing that metadata); this data includes books, images, video, modeling and animation, and sound. This new generation of information formats makes up our personal health records and medical record images, industrial training materials, university “distance” courses, and the like. Each instance of these tends to be far more distinctive than an individual insurance claim form. And it takes a lot of metadata to properly convey their “meaning”.
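
A back-of-the-envelope sketch in Python can make the ratio concrete. All of the numbers below are assumptions invented for illustration (the 80-byte row, the 5 MB video clip, the 50 KB of descriptive markup); only the Subscriber_Name and Medical_Provider columns and the fifty thousand claims come from the discussion above.

```python
def metadata_ratio(metadata_bytes, data_bytes):
    """Size of the metadata divided by the size of the data it describes."""
    return metadata_bytes / data_bytes


# Relational case: one small schema describes every row in the table.
# The column widths and the resulting 80-byte row are assumed values.
CLAIM_SCHEMA = (
    "CREATE TABLE claims (Subscriber_Name CHAR(40), Medical_Provider CHAR(40))"
)
CLAIM_ROW_BYTES = 80
CLAIM_ROWS = 50_000
claims_ratio = metadata_ratio(len(CLAIM_SCHEMA), CLAIM_ROW_BYTES * CLAIM_ROWS)

# Media case: each asset needs its own descriptive metadata.
# Both sizes are invented for the sake of the comparison.
VIDEO_BYTES = 5_000_000
VIDEO_METADATA_BYTES = 50_000
video_ratio = metadata_ratio(VIDEO_METADATA_BYTES, VIDEO_BYTES)

print(f"claims table: {claims_ratio:.6f}")
print(f"video clip:   {video_ratio:.6f}")
```

Under these assumed sizes the ratio for the video clip comes out hundreds of times larger than the ratio for the claims table, and unlike the schema, the video's metadata cannot be shared across thousands of other assets.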

The challenge.

What we’re struggling with right now is to succinctly specify the meaning of modern media assets and to automate searching based on this metadata. This is our only hope for leveraging that ratio of metadata size divided by data size.


Oct 3 2009   9:12PM GMT

Multimedia: The Problem of Subtle Semantics



Posted by: Roger “Buzz” King
3D animation, 3D modeling, advanced Web apps, automating Web searches, blob data, continuous data, databases, information, Multimedia, rich internet apps, Semantic Web, smart search engines, tagging, Text, Web 2.0, Web 3.0, web applications, Web development, Web development frameworks, XML

The challenge of the Semantic Web.

We’ve looked at the emerging Semantic Web technology in the previous postings of this blog. The idea is to have a far, far smarter Web, one where the process of finding and interpreting and making use of far flung information can be largely automated. This is in sharp contrast with today’s Web, where these things have to be done in a painful, extremely time-consuming fashion.

So that is the key challenge. It has to do with searching the kinds of information that are important to us in our daily lives. This information, as it turns out, is very difficult to process automatically. Why is this?

The complexity of modern multimedia.

I teach a very basic 3D animation class to mostly computer science students. We use Maya, arguably the most popular 3D animation application, one that is used in the making of many animated features. The interesting thing about animation is that it is truly multimedia. It can give us a lot of insight into what we need the new Web to do for us.

That’s because the number and diversity of applications that are used for drawing, documenting, modeling, animating, motion capture, texturing, video rendering, video editing, video conversion and compression, sound editing, in even small projects, can be very impressive. Correspondingly, the wide variety and complexity of media formats involved in an animation project can be overwhelming.

What happens in an animation project? The workflow might begin with vector storyboard drawings to break the story down into scenes. In a typical animation project, 3D models in a variety of proprietary formats are used. Models must be transformed as they are exported from one application and imported into the next. Multiple video renders of animated models are made, and they must be edited together, along with multiple sound files. Multiple video and audio formats might be used. 2D images are used for textures; photographs of butterfly wings can be used to make an animated butterfly very realistic, and a checkerboard image made with Photoshop can be used to make a linoleum floor. And along the way, a variety of note taking, screen capture, and conferencing software might be used to facilitate group communication.

There is also a heavy focus on reuse in an animation project. Building every model, editing every texture, creating every environment and background, and recording every sound from scratch is frequently intractable. If existing assets could not be tailored and reused, the project would be far too expensive and time consuming, and would demand that too wide a variety of professionals always be available. This raises the multimedia stakes, as assets of widely differing forms must be constantly reconfigured and used in concert in new ways.

But what’s the real problem? We aren’t all trying to produce complex animated videos. But very interestingly, in our everyday lives we essentially face the animator’s challenge when we try to find and use information on the Web. That’s because we’re often looking for things whose meaning, whose interpretation, demands focused human thought. We are looking not for business data, but for pieces of media, and the problem is that today, most of our searching has to be based on tags or brief textual descriptions that are associated with pieces of media, and not on the true meaning of the media itself.

The needs of the business world are not our needs.

It’s the subjective nature of media assets - this is what is at the heart of the problem facing us. Existing technology for searching the web is based on keywords and very short pieces of text.

There is other technology, though, under active development, stuff that serves as the information storage backbone of most commercial websites. It’s the technology that has for decades been used in-house (not on the Web) by businesses when they process large databases. But this stuff was designed to handle traditional business data forms, like integers, character strings, real numbers, dates, timestamps, and full text.

There is more, though. All of the major database management systems, along with tools for building and searching advanced websites are being retrofitted (or in some cases, built from the ground up) to manage more than keywords and text, more than standard business data.

But up to now, the focus has not been on supporting the kinds of information you and I are most interested in. The focus has been on extending database and Web technology to support XML documents, as well as more complex data objects, like those inside a Java program, and other forms of data found inside programs. This includes arrays and lists and short pieces of textual data, like the names of diseases.

In other words, we’ve been busy extending our support of the business world, so they can store complex business data in databases and make that information processable over the Web. You and I have largely been left out.

Finally, we are attacking our needs.

But there are now many ongoing efforts to extend database and Web technology to make it useful to us. The new focus is on supporting blob and continuous media like images, video, and audio. This is extremely hard to do.

Why? Because the strongest means by which we deduce the meaning of business data is by looking at its internal structure and the terms that are used to describe that structure. A relational table named Prescriptions, with character attributes Patient Name, Doctor’s Name, and Medication, and with a numeric attribute Dosage, is pretty easy to interpret.
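
To see how much the structure alone tells us, here is a small sketch using Python's built-in sqlite3 module. The column names are adapted from the Prescriptions example above (with underscores, since SQL identifiers cannot contain spaces unquoted), and the helper name describe_table is made up for this illustration. A program can ask the database for the table's structure and get back names and types that a person, or another program, can interpret directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Prescriptions ("
    " Patient_Name TEXT, Doctor_Name TEXT, Medication TEXT, Dosage REAL)"
)


def describe_table(connection, table):
    """Return [(column name, declared type), ...] for a table."""
    # PRAGMA table_info yields one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return [(name, col_type) for _cid, name, col_type, *_rest in rows]


columns = describe_table(conn, "Prescriptions")
for name, col_type in columns:
    print(name, col_type)
```

There is no comparable call for a photograph: a program can ask for its width and height, but the pixels themselves carry no column names.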

But what do we do with a photograph, which is just a grid of pixels with no internal structure? Or a long series of images, along with a sound track, put together to form a piece of video?

The U.S. military has been pumping money into image processing for several decades, and so all is not lost. There is a vast body of mathematical research and software development that allows us to write programs that can find a particular face in a crowd and search satellite photos for airplane runways. But in general, we cannot at this time write a program that can process an arbitrary photo or video clip and tell us what it means. That means we can’t quickly search vast media databases for useful pieces of information.

The goal behind the Semantic Web effort is to build a new generation of websites whose information can be searched automatically, and where information from multiple sites can be automatically integrated. To do this with numeric and character based data is quite doable. But when it comes to multimedia, like images and sound and video and 3D models and engineering designs, well, we have a long way to go. The meaning - in other words, the semantics - of these forms of data is complex and subtle, and highly dependent upon an individual’s interpretation of that media.

So, we see that we have only just begun our journey to create the new Web.


Jun 11 2009   11:44AM GMT

The two duct tapes of computing: Excel and Firefox, and the New Web



Posted by: Roger “Buzz” King
Web 3.0, Web 2.0, the Semantic Web, Multimedia, Excel, browsers, models of computing, smart browsers

This blog concerns advanced Web technologies. Each posting should be readable on its own, but the series of blogs as a whole tells a continuous story.

In this posting, we look at the Duct Tape Phenomenon.

Excel.

As a researcher, I have worked with biologists in the past. Big biologists, not microbiologists, the folks who tinker with DNA. The folks I worked with study macroscopic things mostly, species, in particular. They search for as-yet undocumented species. They tend to have appointments at major universities around the world, and then take extended field trips to study life. Most of them go to rain forests because that’s where biodiversity is at its greatest.

Each scientist has a chunk of the world and a kind of animal they specialize in. I know the butterfly man of Costa Rica, a fellow who has documented several thousand varieties of butterflies, some of which have wing spans of several inches. I know the bug man of the Amazon, who builds long tunnel-like things from the floor of the forest up to the canopy, fills the tunnels with bug killer, and then looks among the dead for bugs that are yet unheard-of.

Here’s the interesting part, at least from a computing perspective: a lot of the scientists I came into contact with store their data in Excel. This is a phenomenon that crosscuts the entire spectrum of computer users. They had to learn Excel at some point, maybe in school or at some workplace, and the next time they needed an application to do something, they found a way to make Excel do the job. For most people, learning the “right” application to use is far too much work, even if it’s hard to query Excel the way we would a database, even if Excel spreadsheets get way out of control size-wise, given the large amount of data many of us collect.

Excel, in many ways, is the duct tape of desktop and notebook computing.

Firefox (or your favorite browser).

But what about developers of desktop apps? What do they use as a design paradigm when building the interface to an app, even if it’s not meant for the Web?

Browsers.

Indeed, there is a merging of desktop GUI and web app interface technologies, and now you could sit down in front of a running app and not be sure which of the two you are seeing. In fact, the design impact is not the end of it. We actually use browsers now to interface with some desktop apps, but not often, not yet. However, at least as a user interface paradigm, the browser is becoming the duct tape of GUI design.

For developers of interfaces, Firefox has become a sort of duct tape.

The new Web.

These are the two things that underlie much of computing: the need to store and compute (as with Excel) and the need to interface (as with Firefox). But when the new Web (in the form of the Semantic Web and truly advanced Web 3.0 apps) begins to arrive, will a new paradigm emerge?

Perhaps they will be extra smart browsers that can process code written with XML, namespaces, and other semantic technology, so they can do more than just look for pages according to the English keywords on them.

In other words, we could imagine them as extensions of what our browsers do for us now. They’re very stupid now, really. They’re not at all smart like Excel.

How does it work now? Crawlers commissioned by search engines like Google constantly search the Web and “invert” every static page they find by building an index on every word in them. And then later, we can search this gigantic index store according to the words that appear on the pages that the crawler has found. Once we find URLs of interest, we click on them and go visit the actual pages. These searches are far, far less than “semantic” in nature.
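
The “inverting” step described above can be sketched in a few lines of Python. This is a toy illustration, not how any real search engine is built: the three pages and their example.com URLs are invented, and the function names are our own.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages: URL -> page text (all invented).
PAGES = {
    "http://example.com/hawks": "The hawk spreads its wings",
    "http://example.com/maya": "Animating a hawk model in Maya",
    "http://example.com/tape": "Duct tape fixes everything",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(pages):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_inverted_index(PAGES)
print(sorted(search(index, "hawk")))
print(sorted(search(index, "bird")))
```

A search for “hawk” finds both hawk pages, but a search for “bird” finds nothing, because the index knows only which words appear on a page and nothing about what they mean.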

Our smart browsers will also have to let us build up organized libraries of specialized web content we have found, including documents, images, video, sound, animation, and such specialized data as medical treatment advice. We might maintain these in virtual space, or we might download frozen copies of pages to store on our machines. Our smart browsers could constantly look for updated versions of pages we have copied and downloaded.

These smart browsers will also have to interrelate data of a wide variety of sorts, so that a description of certain symptoms can be accurately hooked up with the specifics of a diagnosis and a medical treatment plan. Our browsers will have to isolate conflicting information, as well.

So, in the future, we’ll need browsers with smarts. We’ll look at this much more carefully in a future posting of this blog, but for now, here’s the lesson: there are two things that applications do for us. They let us store and search things, and they let us compute things.

And what about viewing all this information? How will so much complex, multimedia information be presented? Not as simple webpages with images, text, and things you can click on. Perhaps the new browsers will lay out multimedia presentations of complex, integrated information that has been synthesized from many, many different sources.

The point.

So, what does this imply? That these two things underlie computing apps of almost all sorts: 1, storing and searching, and 2, viewing and manipulating.

And they will underlie the most complex and sophisticated end-user applications of the future.

In a vague, somewhat analogous fashion, most apps are a blend of Excel and Firefox.

Things change radically over time. And things never really change at all.



Apr 2 2009   5:59AM GMT

Full Text searching: clever heuristics for managing large web-based document collections.



Posted by: Roger “Buzz” King
XML, the Semantic Web, SMIL, web applications, Web 2.0, documents, Web 3.0, databases, MySQL, SQL Server, Multimedia, full text, full text searching

There is an explosion of technology for supporting sophisticated forms of media on websites and in web applications. In our continuing series on advanced web applications (in particular, as they pertain to the Semantic Web and Web 2.0/3.0), we’ve looked at continuous media, in particular, video and multimedia presentations. But there is a very old form of continuous media, something that is perhaps the dominant media on the Web, and that’s text.

It’s becoming a very major issue in web development.

Text.

In this blog entry, we’ll be looking at a particular form of text, called “full text”.

But just what is text to begin with? It’s character-based data, anything we can read.

And what will we want to do with it in next-generation web applications? It’s important to note that more and more vast libraries of documents are being put online. Web applications need to provide far faster and more accurate searches of documents than what we can perform with Google.

Interestingly, a successful technology, called “full text retrieval”, is already in place in the relational database systems that underlie modern web applications. It’s there working for us, and we are likely to not be aware of how clever it is.

It’s also something that should be used much more heavily by web application developers.

Let’s step back and consider three different - and increasingly more sophisticated - ways of managing character data.

Atomic Character Attributes.

First, there is the traditional relational database approach, whereby data is stored as tables made of rows of atomic, fixed sized attributes. By atomic, we mean that each attribute has no internal structure. So, a table of insurance claims might have rows with the following attributes: Claims-ID (an integer), Amount (an integer), Medical_Problem (a fixed length character string), and Subscriber_Name (a fixed length character string). Using SQL, the universal database “query” language, we might look for all rows that contain the name “Fred Jones”. Or, we might search for all rows that have claim numbers that are between 110 and 115.

Essentially, this approach limits us to comparing small strings of data to each other or to fixed values. There are some common extensions that we find in relational databases, such as being able to ask the question to find all rows where the Medical_Problem is something like “broken leg”. Then if a row actually has the value “broken legs”, we would most likely see this row in our results.
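
The three kinds of queries just described can be tried out with Python's built-in sqlite3 module. This is only a sketch under a few assumptions: the sample rows are invented, the column names are adapted from the attributes above (claim_id stands in for Claims-ID, since a hyphen is not legal in an unquoted SQL identifier), and SQLite stores strings as variable-length TEXT rather than the fixed-length strings of the traditional approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims ("
    " claim_id INTEGER, amount INTEGER,"
    " medical_problem TEXT, subscriber_name TEXT)"
)
# Invented sample rows.
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?)",
    [
        (109, 250, "sprained wrist", "Ann Smith"),
        (112, 900, "broken leg", "Fred Jones"),
        (114, 1800, "broken legs", "Mary Brown"),
        (120, 75, "flu", "Fred Jones"),
    ],
)

# 1. Compare an attribute to a fixed value.
by_name = conn.execute(
    "SELECT claim_id FROM claims WHERE subscriber_name = ? ORDER BY claim_id",
    ("Fred Jones",),
).fetchall()

# 2. Compare an attribute to a range of values.
in_range = conn.execute(
    "SELECT claim_id FROM claims"
    " WHERE claim_id BETWEEN 110 AND 115 ORDER BY claim_id"
).fetchall()

# 3. Pattern match: % stands for any run of characters,
#    so both "broken leg" and "broken legs" qualify.
like_leg = conn.execute(
    "SELECT claim_id FROM claims"
    " WHERE medical_problem LIKE 'broken leg%' ORDER BY claim_id"
).fetchall()

print(by_name, in_range, like_leg)
```

Note that the pattern match is still purely character-based: a row that said “fractured leg” would not be found.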

Full Text.

Second, there is the ability to search pieces of text according to their natural language (in this case, English) meaning. In this case, we consider the character data to have internal structure, and the values are not considered atomic. Often, these pieces of text are long and of variable length from one row to the next.

It is actually an extension - but a very dramatic one - of the LIKE operator in SQL.

It is what we call “full text” management or retrieval, and modern relational database management systems like MySQL and Microsoft SQL Server support this. This was seen long ago as a critical extension to relational database technology. Thus, we might rename our Medical_Problem field to Doctor’s_Diagnosis, and allow free form English text in this attribute, as well as allowing the value to be quite long. Then we might search for all rows where the doctor describes “fractures of the lower limbs”. Notice that none of these words might actually appear in the attribute, which might simply refer to “broken legs”.

Natural Language Processing.

This capability would clearly be very powerful, if we could do it right. The problem is that to support it fully, we would need to use highly advanced natural language processing techniques, which are very time consuming to execute, especially on huge databases of large documents. The full text approach tries to simulate true natural language searching in a far less expensive way. The real thing, by the way, might not be all that accurate anyway. Natural language is naturally ambiguous and very subtle.

True natural language searching would be our third way of processing character-based data, by the way. It is not a fully developed technology. And importantly, we usually don’t need anything that fancy.

The Clever Compromise.

So, our middle option, full text searching, is what dominates today - and it is a surprisingly accurate, and efficient, technique that operates on a small set of heuristics. It can transform a dumb webpage where we can only search for small, fixed character strings, to a rich next-generation webpage that can effectively be searched according to its meaning. It allows us to manage very large text documents in web applications - and gets us surprisingly close to the semantic power of true natural language searching.

We’re not going to go into a lot of detail here, but here are some of the heuristics that are used in full text search. First, “stemming” and related techniques are used; they reduce conjugated verbs and plural nouns to a common root, typically by removing prefixes and suffixes. Another technique is to use a “stop list” that lists words that should be ignored, like “the”. The system might also let us specify the “proximity” of words; this refers to how closely specific words should appear in a document. It can also be powerful to include a synonym checker. And the ability to allow for “wild cards”, in particular, letters that may vary in a passage without changing its meaning, can be quite useful. Dictionaries of technical words that pertain to specific domains (like medicine or law) are very useful. We might also provide a feedback capability, whereby users can train full text search engines to be more accurate.
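Here is a toy sketch of three of these heuristics working together: a stop list, a very crude suffix-stripping stemmer, and a synonym table. Everything in it (the word lists, the suffix rules, the function names) is an assumption made for illustration; real full text engines use far more careful stemmers and dictionaries.

```python
import re

# Hypothetical stop list and synonym table; real engines ship much larger ones.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "is", "was", "to"}
SYNONYMS = {"fractur": "broken", "limb": "leg"}  # keyed by stemmed form


def stem(word):
    """Strip one common suffix, keeping at least three letters (very crude)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Lowercase, drop stop words, stem, then map synonyms to one term."""
    terms = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in STOP_WORDS:
            continue
        root = stem(word)
        terms.append(SYNONYMS.get(root, root))
    return terms


def shared_terms(query, document):
    """Return the normalized terms that the query and document have in common."""
    return sorted(set(normalize(query)) & set(normalize(document)))


query_terms = normalize("fractures of the lower limbs")
overlap = shared_terms("fractures of the lower limbs", "broken legs")
print(query_terms, overlap)
```

With these rules, the query about “fractures of the lower limbs” shares two normalized terms with a diagnosis that only says “broken legs”, even though the two passages have no words in common.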

This clearly doesn’t come anywhere near true natural language processing - but it is fast. It will be a growing technology on the new web, with a lot of hidden development, making this heuristic-based technique more and more effective.

Indexing.

We should note that there is a significant up front cost in preparing a document for full text searching: we need to build an index with an entry for every (non-stop) word in the text. Then, when a query is executed, we can look for words in the document by searching the index, instead of searching the full text. If there were no index, the search would be extremely time-consuming.
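A minimal sketch of such an index, often called an inverted index, might look like the following. The stop list, the sample documents, and the function names are made up, and this version only answers "which documents contain all of these words"; it does no stemming or ranking.

```python
import re
from collections import defaultdict

STOP_WORDS = {"the", "a", "an", "of", "in", "and", "is", "was", "to"}


def words_of(text):
    """Split text into lowercase words, skipping stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def build_index(documents):
    """The up front cost: map every non-stop word to the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in words_of(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Answer a query from the index alone, without rereading any document."""
    words = words_of(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


documents = {
    1: "The patient has a broken leg",
    2: "A broken arm and a sprained wrist",
    3: "The leg was bruised",
}
index = build_index(documents)
print(search(index, "broken leg"))
```

Each lookup touches only the index entries for the query words, which is why the search stays fast even when the documents themselves are very long.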

The Future.

As more and more governmental, educational, medical, and other complex documents become available on the web, advanced full text searching will enable us to search vast databases in a tractable fashion. Even more clever full text retrieval engines will turn dumb, “gotta Google them” document portals into true Web 3.0 and Semantic Web applications.



Mar 11 2009   8:24PM GMT

Web services: part of the Web 2.0 & Semantic Web picture



Posted by: Roger “Buzz” King
Multimedia, Rich Web Apps, the Semantic Web, Web 2.0, Web 3.0, Web development, rich internet apps, web services

This is the fifth in a series of blogs about the Semantic Web and Web 2.0/3.0. While the sequence of blog posts tells a continuous story, each blog should be fully informative if read out of sequence.

So far, we’ve discussed the Semantic Web, which is an attempt at automating the process of searching the web and integrating the results, and Web 2.0/3.0, which is largely oriented toward making media-intensive web applications highly responsive. (We noted in an earlier blog that Web 3.0 is an extension of Web 2.0, and we will look at this transition in a future blog.)

But there’s a third term that is often thrown into the mix: web services. What are they?

A web service is a web application that is not accessed interactively by a human, but rather by a program.

To make use of a website, we load a URL into a browser and visit the site. Once we’re there, we might - even if we don’t realize it - be operating a very sophisticated application, like Amazon. We can search their inventory - which sits inside a very large database - via Amazon’s search form.

But there’s something else we can do with Amazon. We can access their inventory via a web service. Or more precisely, we can build programs that can access their inventory by communicating with programs (called web services) that they have provided. Our programs and their programs talk to each other directly. These services can be used to do things that would usually be very time-consuming, and in fact, often intractable, if performed with a browser. Their web services also allow third party vendors to post their stock on Amazon, and in return for a fee, let Amazon sell and ship their products. Thus, independent vendors can easily make themselves an extension of Amazon, something that works so smoothly that if we don’t look carefully, we might not realize we are buying something that is not directly marketed by Amazon.

The way a web service works is by the provider of the service (such as Amazon) making the interface to the software that implements the service publicly accessible over the Internet. Such an interface is called an Application Programming Interface, or API. This way, anyone who wants to write a program that will access the service over the web knows exactly how to write their program to talk to the web service. These programs that access web services are called “client” applications.
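To show the shape of this arrangement, here is a self-contained sketch in Python that runs a tiny web service and a client in the same program. The /item endpoint, the inventory data, and the function names are all invented for illustration; this is not Amazon's API or anyone else's, just the general pattern of a program talking to a program over HTTP.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Made-up inventory standing in for the provider's database.
INVENTORY = {"1001": {"title": "Ulysses", "in_stock": 3}}


class InventoryHandler(BaseHTTPRequestHandler):
    """The service side: answers GET /item?id=... with JSON."""

    def do_GET(self):
        parsed = urlparse(self.path)
        item_id = parse_qs(parsed.query).get("id", [""])[0]
        item = INVENTORY.get(item_id)
        if parsed.path != "/item" or item is None:
            status, body = 404, {"error": "not found"}
        else:
            status, body = 200, item
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_item(host, port, item_id):
    """The client side: a program, not a browser, calls the published interface."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", f"/item?id={item_id}")
        response = conn.getresponse()
        return response.status, json.loads(response.read())
    finally:
        conn.close()


# Start the service on a free local port, call it, then shut it down.
server = HTTPServer(("127.0.0.1", 0), InventoryHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status, item = fetch_item("127.0.0.1", server.server_port, "1001")
missing_status, _ = fetch_item("127.0.0.1", server.server_port, "9999")
server.shutdown()
server.server_close()
print(status, item)
```

The client only needs to know the interface: which path to request, which parameter to send, and what format comes back. That published contract is the API.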

There is a wide class of web services available on the Internet, and many of them provide APIs that allow programmers to write software that can access vast databases of such things as news and real estate information. Many web services also are available via a website, for users who want to use the service interactively. And many client applications are really just doing the same thing a browser might do when accessing a website, except that the client is likely to be a far more specialized application and it runs as a desktop application on your machine. More importantly, that client program might be able to do things that your browser cannot do.

For example, there is a web service called MusicBrainz; it provides information about music, not the music itself. It can be accessed via an API. There is also a website, MusicBrainz.org, where you can search the database interactively. The API might be accessed by a CD player application; it can communicate with MusicBrainz (without you knowing it) to download information about whatever CD you happen to be listening to on your desktop. It might enable your CD player to tell you the artist’s name and variations of that name, the release date, the catalog number, etc.

Since part of the idea is that we don’t have to directly interact with web services by using a browser, their explosive growth has been very quiet. Many websites are powered by input they get by using web service APIs. These second-hand websites are often called “portals”, and many portals integrate information from a number of sources and give you access to information that would otherwise be very tedious to find on your own. Web services are thus a critical building block for many of the multimedia, highly interactive websites that constitute much of the Web 2.0 effort.

In fact, they underscore the difficulty in making a sharp distinction between the Semantic Web and Web 2.0/3.0. This is because both of them depend highly on automating the movement of information around the Internet. A Web 2.0 website (often called a web application because it provides fast access to complex information, in particular, sound, images, and/or video) cannot answer your search request quickly unless it has ongoing, rapid access to underlying streams of information on the web. But this capability, of providing us with information integrated from multiple websites, is actually a cornerstone of the emerging Semantic Web.

The difference is that the Semantic Web will (hopefully) someday put a tremendous amount of smarts into web services, and allow us to locate, transform, and integrate information in extremely complex ways. The Semantic Web, in this sense, can be viewed as an extremely aggressive extension of the Web 2.0/3.0 effort.

So there we have it. Web services, in their hidden way, are rapidly evolving the web into something incredibly powerful.


Mar 5 2009   5:05AM GMT

Multimedia, what is it? Why do we care?



Posted by: Roger “Buzz” King
Web 2.0, Web 3.0, Web development, Multimedia, SMIL, Text, the Semantic Web, Video, XML

This is the fourth in a series of blogs on the Semantic Web and Web 2.0/3.0.

To get us going here, just what is “multimedia”? At one level, it simply refers to applications that manipulate, store, and/or present multiple kinds of media, such as text, video, relational data, sound, animation, etc. More pragmatically, it refers to the introduction of blob and continuous forms of data into applications that traditionally manage simple data, like character strings and numbers. In its most aggressive form, multimedia refers to the sophisticated integration of traditional, blob, and continuous data into integrated data forms that convey their own semantics.

A quick note: Blob data is data that is stored in a semantics-less fashion, usually as simple binary or character data. This could be almost any sort of data, such as images, video, sound, or natural language - but the key element is that the language or system being used to manipulate it doesn’t have an appropriate, specific data type. Blob data is often large, of variable size, and usually requires a sophisticated, outside application to interpret and present it. It is the default, catch-all way to store advanced forms of data in relational database management systems.
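As a small illustration, here is a sketch using Python's built-in sqlite3 module. The table name, the file name, and the fake image bytes are made up. The database stores and returns the bytes faithfully, but it is outside code that has to decide what they mean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assets (name TEXT, data BLOB)")

# The eight-byte PNG file signature followed by filler.
# This is NOT a valid image, just bytes that start like one.
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
fake_image = PNG_SIGNATURE + b"\x00" * 16

conn.execute("INSERT INTO assets VALUES (?, ?)", ("logo.png", fake_image))

# The database can tell us the value is a blob and how long it is...
stored, kind, size = conn.execute(
    "SELECT data, typeof(data), length(data) FROM assets WHERE name = ?",
    ("logo.png",),
).fetchone()

# ...but interpreting the contents is left to an outside application.
looks_like_png = stored.startswith(PNG_SIGNATURE)
print(kind, size, looks_like_png)
```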

Another quick note: Continuous data is data that has a temporal aspect or can be broken down into segments that have their own identity. The visual part of video can be broken into clips; in fact, it can be broken all the way down to individual pixel-based images. Sound can be cut into pieces. James Joyce’s Ulysses is a big piece of continuous textual data. Like blob data, we typically need an outside application to interpret it. (Even the most complex application, a human, generally has trouble doing this with Ulysses.)

Back to the Semantic Web, Web 2.0/3.0, and multimedia.

In a previous blog, we tried to define these two terms and explain why they are very different concepts. The Semantic Web is an attempt to automate the searching of the web and the integration of data collected on the web; the idea is to greatly ease the painful interactive nature of using a search engine like Google. Web 2.0/3.0 (and no, there is no sharp distinction between the two) is largely about performance: making web applications as responsive as possible, potentially as responsive as desktop applications.

But they share one common goal: effectively managing advanced forms of media. From the Semantic Web perspective, how can we search things like sound, images, video, and natural language in a semantically-meaningful way? We use sophisticated tags and image/sound processing to do this, but it is only a small step toward a solution.

From a Web 2.0/3.0 perspective, how can we deliver up such forms of media in a way that is highly responsive? Video streaming on the web is a huge challenge, for example. Or, how can we interact with video in a responsive way, in such things as games and digital libraries?

The web, in fact, is innately multimedia: we take images, icons, links, text, video, sound, and various user controls like buttons and menus, and put them together in highly sophisticated ways. And behind these web pages, databases often sit, populating dynamic pages with information in response to user requests - this data is virtually invisible to search engines. This is what makes the Semantic Web in particular such an incredible challenge. How can we ever hope to search the web automatically?

There are modest advancements that have been made. One example is something called the Synchronized Multimedia Integration Language (SMIL, pronounced like the facial phenomenon). It is an XML-based language that supports basic constructs to glue multiple forms of media together in two dimensions and in temporal sequences. Using XML elements and attributes (the basic constructs of XML), we can create multimedia presentations in a precise, unambiguous way.

SMIL presentations can be processed automatically. This is very significant.
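To give a feel for what this means, here is a small sketch that uses Python's standard XML parser to pull the media items out of a SMIL presentation. The presentation itself is made up (the file names are hypothetical), and it is simplified: it omits the XML namespace declaration that real SMIL documents normally carry. The seq element plays its children one after another, and par plays them at the same time.

```python
import xml.etree.ElementTree as ET

# A simplified, hypothetical SMIL presentation: a title image for three
# seconds, followed by a video clip played in parallel with narration.
SMIL_DOC = """
<smil>
  <head>
    <layout>
      <root-layout width="320" height="240"/>
      <region id="main" left="0" top="0" width="320" height="240"/>
    </layout>
  </head>
  <body>
    <seq>
      <img src="title.png" region="main" dur="3s"/>
      <par>
        <video src="clip.mp4" region="main"/>
        <audio src="narration.mp3"/>
      </par>
    </seq>
  </body>
</smil>
"""

MEDIA_TAGS = {"img", "video", "audio", "text"}


def list_media(smil_text):
    """Return (kind, source) for every media element, in document order."""
    root = ET.fromstring(smil_text.strip())
    return [(el.tag, el.get("src")) for el in root.iter() if el.tag in MEDIA_TAGS]


media = list_media(SMIL_DOC)
print(media)
```

A program can learn which media a presentation uses, and how they are sequenced, without ever playing it. It still cannot tell what is in the video, of course.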

So, multimedia: it’s at the core of both the Semantic Web and Web 2.0/3.0. It is one of the basic motivations for their existence.