Web Development archives - Buzz’s Blog: On Web 3.0 and the Semantic Web

Web development

Nov 19 2009   12:09AM GMT

Computer Science departments, listen up: media management is a core software skill



Posted by: Roger “Buzz” King
Web development

Multimedia in computer science departments.

I teach in a computer science department, and in the previous posting of this blog, I argued that universities and colleges have been very slow to introduce basic animation skills into their curricula. In this posting, I argue that the same is true for basic media management skills, and that this is also a critical area of study for computer science students.

It’s part of a broader expansion of the discipline.

Why?

Well, for starters, the bounds of computer science have shifted and expanded greatly. It’s not about the development of techniques for building operating systems and compilers, and formal specifications of algorithms and their running costs, and the like. Not any more, it’s not. Many of the old problems now have fairly settled and widely used solutions. We are increasingly focused on the development of web-resident information systems, the automation of web searches, the development of web services, network and database security, medical information systems, the construction of complex 3D models in engineering and entertainment, and the like - things that have been discussed in previous postings of this blog.

Another key area is the management of video, images, audio, animation, documents, and other advanced forms of media. These topics have also been discussed in this blog in the past.

Academics and the ignorance of real world tools.

It’s not that we academics don’t know that this is a critical area. The problem is that computer science faculty typically know little or nothing about large, commercial applications for creating, manipulating, and storing media, or about the emerging standards for formatting and tagging media.

But more significantly, there is a stuffy, longstanding belief on the part of computer science academics that teaching such practical things would turn our departments into trade schools, and that we teach “principles” and “formalisms”, and that we prepare students for the next fifty years, not the next five years.

That’s BS.

Universities just have a lot of trouble evolving. We are big machines with tremendous inertia.

The necessary skills.

So what do students need to know? I admit that there is a broader question here. What is the right compromise between abstract, longstanding concepts and hands-on experience with real world tools? But surely, nobody thinks we should essentially ignore the enormous software technology base that is out there?

We cannot continue to turn out students who are only mildly aware of the vast sea of desktop and web applications for managing media; database management tools for storing and searching media; tools for processing full text and natural language, compressing and cleaning audio and video, and editing sound and video; and standards for formatting images, sound, video, and 2D/3D models.

You have to have some idea of what technology is out there if you are going to build the next version of that technology!

But maybe we’re doing too little too late.

Besides claiming that this knowledge is not “academic”, computer science departments claim that students pick this kind of stuff up on their own. This used to be a ridiculous claim. But in truth, energetic computing students do indeed pick this stuff up now, at least to some degree. But this is really an indictment of academic computing: the Web has become a vast, formal and informal learning ground, and it is eclipsing computer science departments to a large degree.

Where to go from here.

So, what’s the real point?

We need to train a new generation of faculty members, radically evolve our curriculums, build computing labs that are equipped with advanced media applications and storage managers that professors actually understand, and above all else, reevaluate our position in the learning world. Students are turning away from us and toward a vast array of video, textual, and audio learning tools that have exploded onto the Web.

Nov 7 2009   10:32PM GMT

Computer Science departments, listen up: animation is a core software skill



Posted by: Roger “Buzz” King
Web development

What’s missing in computer science curriculums?

I teach in a computer science department, and one thing is painfully true: universities and colleges have been very slow to introduce basic animation skills into their curriculums. This is a big problem.

Why?

2D and 3D graphics and animation are popping up everywhere, and more and more, programmers discover they have to be part time artists. This is particularly true for developers of Web 2.0/3.0 apps. Web app developers find themselves using graphics tools both to build user interface controls and to create animated models. And, as applications that can convert 3D models and animations into lightweight renderings become more efficient and more powerful, Web app developers are having to add 3D tools to their quiver of arrows.

2D animation engines include Adobe Flash, Microsoft Silverlight, and HTML 5. The drag-and-drop application that generates Flash animation, Adobe Flash Builder (recently renamed from Flex Builder), along with Microsoft Blend, which generates Silverlight code, give the programmer a way to develop animation without having to work with an artist’s interface. But in truth, drag-and-drop tools only get the programmer so far. In order to refine, extend, and debug interfaces, the programmer has to master the two XML languages used by Flash Builder and Blend (MXML and XAML, respectively) to define interface components and 3D models. The programmer also needs to be comfortable with the two languages that these XML specifications compile down to (ActionScript and C#), and that means learning their extensive animation capabilities.

And it’s not just small scale modeling and animation.

Sophisticated 2D and 3D animation is also confronting the young programmer. Game, feature film, animation short, training video, and TV show development are rapid growth areas. Interestingly, while powerful GUI-based applications like Autodesk Maya, Autodesk 3DS Max, Toon Boom Animate, Vue, and Poser are used largely by non-programmers, there is a critical niche for the programmer-animator. Scripting languages are used to perform many basic modeling and refinement tasks. Not to mention the fact that someone has to build these huge animation apps, and these folks, well, they’re programmers. The point is that it’s hard to build an application that creates things you do not understand.

The emergence of canned content and cheap animation apps.

There is also an explosion of applications that provide canned animation capabilities, and there are a growing number of websites that sell animation content. This makes it feasible for programmers to create basic animations for websites and desktop applications, without the need for full-blown animation artists.  DAZ3d.com and contentparadise.com are two highly popular content sites. And sophisticated animation projects can be developed with applications that are cheap (or free). These include DAZ, Blender, and Carrara.

The bigger picture: the boundaries between disciplines are breaking down.

Perhaps the most compelling reason for universities and colleges to start treating animation as a first class academic citizen is that the nature of computing itself is rapidly undergoing an expansion. Computer science graduates are finding jobs in the financial, communication, genetic engineering, mechanical and electrical engineering, alternative fuels, architecture, advertising, business, and medical industries - and all of these professional disciplines have substantive animation components. It’s the age of merging fields, with borders collapsing, and computing skills becoming necessary in almost all walks of life. As non-technical types must be able to do basic programming and software configuration tasks, programmers are learning that they need a non-programming area of expertise in order to stay competitive - and tossing animation skills into the pot is a sure plus.


Nov 2 2009   4:47AM GMT

The need for declarative technology in multimedia asset management



Posted by: Roger “Buzz” King
Web development

What “declarative”  really means

In programming languages, we use the word “declarative” to refer to a language that does not force a programmer to specify more sequencing information than is strictly necessary.  The idea is for the program to tell the computer what needs to be done, and not precisely how to do it. Instead of an algorithm, we provide a static specification of what the result will look like. In an imperative (or non-declarative) language, the programmer might specify that an array is to be read from position 0 to position 99, and that at each position in the array, the value at that position is to be increased by 1.  In a declarative language, the programmer might be able to simply state that every entry in the array is to be incremented by 1.
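To make the array example concrete, here is a minimal sketch in Python. Python is my choice for illustration, not something this post depends on, and a list comprehension is only declarative in spirit: it states what each result element is without spelling out the index bookkeeping. The function names are made up for the example.

```python
def increment_imperative(values):
    """Spell out the sequencing: visit positions 0..len-1 and add 1 at each."""
    result = list(values)
    for position in range(0, len(result)):
        result[position] = result[position] + 1
    return result


def increment_declarative(values):
    """State the result: every entry, incremented by 1."""
    return [value + 1 for value in values]


array = list(range(100))  # positions 0 through 99
assert increment_imperative(array) == increment_declarative(array)
```

Both functions produce the same list; the difference is how much of the "how" the programmer had to write down.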

But what does the word “declarative” really mean, English-wise?  Well, it refers to the process of making a declaration, of making a formal statement about something.

Searching web-based media assets: today’s tools

What does this have to do with the Semantic Web and/or Web 3.0? That’s what this blog is dedicated to: next generation web technology.

A major growth area for the web will be applications that manage complex forms of media, and the automatic searching of blob and continuous media, such as images, video, sound, animation, 3D models, and of mixed-mode media. These will present a major challenge. Simply put, our best technology for making advanced forms of media searchable is tagging. And this low-level tool doesn’t come close to allowing us to search according to the true meaning of media assets. Searching for blob and continuous media is still painstaking and manual.

So, how could we make things like video and 3D models more searchable? How could we improve the search process? Two important technologies offer significant help. The first is more sophisticated, high level, and content-ful tagging protocols, such as MPEG-7. Another is image processing, which is actually a highly developed area, since the U.S. government has poured many millions of dollars into it over the past several decades. It’s also true that language processing tools have been used to parse and interpret textual descriptions of media, but this sort of freeform analysis is difficult to make accurate and predictable, given the extreme complexity and ambiguity of natural language. People write “stories” with language, and a long piece of text has to be read from beginning to end, in order to understand it.

What about using notes?

But perhaps the future lies in a sort of compromise technology, one where tagging information made with tools like MPEG-7, combined with image processing, and/or natural language processing, is used to cut the search space from many thousands of media artifacts to something that could be processed interactively by humans. This is in contrast to downloading potentially huge files and viewing them in real time. Even downloading small video and audio clips and low-pixel count preview images can overwhelm the average interactive web user. These often don’t give an accurate vision of what the full pieces of media contain.
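As a rough illustration of cutting the search space with tags, here is a small Python sketch. The catalog, the asset ids, and the tags are all invented for the example, and real MPEG-7 descriptions are far richer than a flat set of keywords; the point is only that a cheap filter can hand a human a short list instead of thousands of files.

```python
# Hypothetical catalog: each media asset carries a set of descriptive tags.
CATALOG = [
    {"id": "clip-001", "tags": {"hawk", "flight", "outdoor"}},
    {"id": "clip-002", "tags": {"hawk", "perched"}},
    {"id": "clip-003", "tags": {"butterfly", "flight"}},
]


def shortlist(required_tags, catalog=CATALOG):
    """Return the ids of assets whose tags include every required tag."""
    required = set(required_tags)
    return [asset["id"] for asset in catalog if required <= asset["tags"]]
```

Calling shortlist(["hawk", "flight"]) leaves a single candidate for a person to inspect by hand.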

The answer might lie in highly organized “notes”, written with note-taking applications. These applications provide quick, compact, and highly visual ways for people to document their thoughts. They range from lists to outlines to hierarchically structured blocks of text to diagrammatic “mind-maps”. Note-taking applications often support video and images and sound; in a way that might seem ironic, an individual could create mini-multimedia artifacts to facilitate the searching of large multimedia assets. But in truth, this could be a very powerful technique - because one of the primary attributes of most note-taking applications is that they provide “at-a-glance” semantics. In other words, if used right, a note or a list or a mind-map is captured in a single screen image. And, when users build more complex notes, these applications typically facilitate very top-down structures. Notebooks have tables of contents; hierarchical notes have root nodes; mind-maps are expandable.

And above all else, there is something about the note-taking philosophy that encourages compactness. In other words, they are in a sense, declarative. A note makes a quick, firm statement.

More on this, soon.


Oct 26 2009   8:58PM GMT

The imposing heterogeneity of media applications



Posted by: Roger “Buzz” King
3D animation, 3D modeling, advanced Web apps, automating Web searches, continuous data, media applications, Multimedia, the Metadata Object Description Schema, Video, video containers, Web 3.0, Web development, Web development frameworks, XML Schema

This blog is dedicated to the discussion of emerging web technologies. Today, we look at the rapidly growing world of media applications, and their impact on the Semantic Web.

The problem of searching for media assets.

We’ve already looked at advanced media, in particular video, audio, and animation data, in previous blog postings. In particular, we’ve looked at the subtle and complex nature of media asset semantics. We’ve seen that interpreting a piece of video, for example, is far, far more difficult than interpreting an integer or character field. Since the goal of the Semantic Web effort is to make the searching of the web highly automated, advanced media is becoming a huge and critical research and development focus for the builders of next-generation web development applications.

Just how do we provide an environment where media assets can be searched in a mostly automatic fashion, so that a human does not have to painfully paw through hundreds or thousands (or millions) of video chunks to find the right one? We’ve looked at emerging technologies for marking up advanced media information, and for making it usable in a variety of web applications. We’ve also looked at the dramatic challenge presented by mega apps to would-be users; the interfaces to these applications are truly massive and cannot present to the user the way in which they are meant to be used.

The problem of proprietary formats.

One specific, and very difficult problem, is the massive heterogeneity, not just of media formats, compression technologies, and container technologies, but of the applications themselves. If we are going to automate the searching of complex modeling, video, audio, and other media assets, we’re going to have to address a key question: since many media apps make use of their own proprietary data formats, how are we going to provide automated ways of searching media assets that are stored in these formats?

The problem of highly imperfect generic formats.

There are indeed many existing, as well as soon-to-emerge, standards for importing and exporting data between powerful media applications, but transformations in and out of these formats are often “lossy”, in that information is lost or changed. In fact, locating and downloading assets that are in supposedly-generic form is often very frustrating, because these assets end up not performing well. They can be difficult to edit and reuse. 3D animation models regularly blow up when animators try to import them into animation applications and then manipulate them. A hawk may look like a hawk until you try to render it with its wings flapping, and suddenly it’s a blob of geometric garbage.

One possible direction.

So, what do we do about the fact that many media assets must be manipulated by the original applications that created them? How can we facilitate reuse? It’s extremely unrealistic to expect users to master perhaps dozens of video or audio or animation applications. Filtering assets according to their file extensions is a good idea, and it is a well established practice.

But what we really need is a globally-known site that either literally or conceptually centralizes the massive network of import/export relationships, along with information about the relative success of these mappings. Are they ever lossy? If so, can they be fixed? What series of applications might we want an asset to be imported/exported through so that in the end it is in a usable format, given the applications that the user owns and has mastered?
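One way to picture such a centralized registry is as a directed graph: formats are nodes, import/export mappings are edges, and each edge records whether the mapping is known to be lossy. The Python sketch below is only an illustration under that assumption. The format names and the lossy flags are placeholders I made up, not measurements of any real application, and the search simply prefers the chain with the fewest lossy steps, then the fewest hops.

```python
import heapq

# Hypothetical registry entries: (source format, target format, is_lossy).
CONVERSIONS = [
    ("app_a_native", "generic_x", True),
    ("app_a_native", "generic_y", False),
    ("generic_x", "app_b_native", False),
    ("generic_y", "app_b_native", False),
    ("generic_y", "app_c_native", True),
]


def best_conversion_path(start, usable_formats, conversions=CONVERSIONS):
    """Find a chain of conversions from `start` to any format the user can
    work with, minimizing lossy steps first and number of hops second.
    Returns (path, lossy_step_count), or None if no chain exists."""
    graph = {}
    for source, target, lossy in conversions:
        graph.setdefault(source, []).append((target, lossy))

    queue = [(0, 0, start, [start])]  # (lossy steps, hops, format, path so far)
    visited = set()
    while queue:
        lossy_steps, hops, fmt, path = heapq.heappop(queue)
        if fmt in usable_formats:
            return path, lossy_steps
        if fmt in visited:
            continue
        visited.add(fmt)
        for target, lossy in graph.get(fmt, []):
            if target not in visited:
                heapq.heappush(
                    queue,
                    (lossy_steps + int(lossy), hops + 1, target, path + [target]),
                )
    return None
```

Given the applications a user owns, a query such as best_conversion_path("app_a_native", {"app_b_native"}) answers the question posed above: which series of imports and exports gets the asset into a usable format with the least damage.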

There is much to be done. Right now, searching for and reusing media assets is a painstaking, trial-and-error-prone process.


Oct 3 2009   9:12PM GMT

Multimedia: The Problem of Subtle Semantics



Posted by: Roger “Buzz” King
3D animation, 3D modeling, advanced Web apps, automating Web searches, blob data, continuous data, databases, information, Multimedia, rich internet apps, Semantic Web, smart search engines, tagging, Text, Web 2.0, Web 3.0, web applications, Web development, Web development frameworks, XML

The challenge of the Semantic Web.

We’ve looked at the emerging Semantic Web technology in the previous postings of this blog. The idea is to have a far, far smarter Web, one where the process of finding and interpreting and making use of far flung information can be largely automated. This is in sharp contrast with today’s Web, where these things have to be done in a painful, extremely time-consuming fashion.

So that is the key challenge. It has to do with searching the kinds of information that are important to us in our daily lives. This information, as it turns out, is very difficult to process automatically. Why is this?

The complexity of modern multimedia.

I teach a very basic 3D animation class to mostly computer science students. We use Maya, arguably the most popular 3D animation application, one that is used in the making of many animated features. The interesting thing about animation is that it is truly multimedia. It can give us a lot of insight into what we need the new Web to do for us.

That’s because the number and diversity of applications that are used for drawing, documenting, modeling, animating, motion capture, texturing, video rendering, video editing, video conversion and compression, sound editing, in even small projects, can be very impressive. Correspondingly, the wide variety and complexity of media formats involved in an animation project can be overwhelming.

What happens in an animation project? The workflow might begin with vector storyboard drawings to break the story down into scenes. In a typical animation project, 3D models in a variety of proprietary formats are used. Models must be transformed as they are exported from one application and imported into the next. Multiple video renders of animated models are made, and they must be edited together, along with multiple sound files. Multiple video and audio formats might be used. 2D images are used for textures; photographs of butterfly wings can be used to make an animated butterfly very realistic, and a checkerboard image made with Photoshop can be used to make a Linoleum floor. And along the way, a variety of note taking, screen capture, and conferencing software might be used to facilitate group communication.

There is also a heavy focus on reuse in an animation project. Building every model, editing every texture, creating every environment and background, recording every sound from scratch is frequently intractable. If existing assets cannot be tailored and reused, the project would be far too expensive and time consuming, and would demand too wide a variety of professionals to always be available. This raises the multimedia stakes, as assets of widely differing forms must be constantly reconfigured and used in concert in new ways.

But what’s the real problem? We aren’t all trying to produce complex animated videos. But very interestingly, in our everyday lives we essentially face the animator’s challenge when we try to find and use information on the Web. That’s because we’re often looking for things whose meaning, whose interpretation, demands focused human thought. We are looking not for business data, but for pieces of media, and the problem is that today, most of our searching has to be based on tags or brief textual descriptions that are associated with pieces of media, and not on the true meaning of the media itself.

The needs of the business world are not our needs.

It’s the subjective nature of media assets - this is what is at the heart of the problem facing us. Existing technology for searching the web is based on keywords and very short pieces of text.

There is other technology, though, under active development, stuff that serves as the information storage backbone of most commercial websites. It’s the technology that has for decades been used in-house (not on the Web) by businesses when they process large databases. But this stuff was designed to handle traditional business data forms, like integers, character strings, real numbers, dates, timestamps, and full text.

There is more, though. All of the major database management systems, along with tools for building and searching advanced websites are being retrofitted (or in some cases, built from the ground up) to manage more than keywords and text, more than standard business data.

But up to now, the focus has not been on supporting the kinds of information you and I are most interested in. The focus has been on extending database and Web technology to support XML documents, as well as more complex data objects, like those inside a Java program, as well as other forms of data found inside programs. This includes arrays and lists and short pieces of textual data, like the names of diseases.

In other words, we’ve been busy extending our support of the business world, so they can store complex business data in databases and make that information processable over the Web. You and I have largely been left out.

Finally, we are attacking our needs.

But there are now many ongoing efforts to extend database and Web technology to make it useful to us. The new focus is on supporting blob and continuous media like images, video, and audio. This is extremely hard to do.

Why? Because the strongest means by which we deduce the meaning of business data is by looking at its internal structure and the terms that are used to describe that structure. A relational table named Prescriptions, with character attributes Patient Name, Doctor’s Name, and Medication, and with a numeric attribute Dosage, is pretty easy to interpret.
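A small sketch shows how much a program can learn from structure alone. It uses Python's built-in sqlite3 module with an in-memory database; the exact column spellings and types are my own assumptions about how the Prescriptions table might be declared.

```python
import sqlite3


def describe_prescriptions():
    """Create the example table, then read its structure back from the
    database itself, the way a search tool could."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE Prescriptions ("
        " patient_name TEXT,"
        " doctors_name TEXT,"
        " medication TEXT,"
        " dosage REAL)"
    )
    # PRAGMA table_info yields one row per column: (cid, name, type, ...).
    columns = [
        (row[1], row[2])
        for row in connection.execute("PRAGMA table_info(Prescriptions)")
    ]
    connection.close()
    return columns
```

The table and column names come back as data, and they already say most of what the contents mean. A grid of pixels offers nothing comparable.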

But what do we do with a photograph, which is just a grid of pixels with no internal structure? Or a long series of images, along with a sound track, put together to form a piece of video?

The U.S. military has been pumping money into image processing for several decades, and so all is not lost. There is a vast body of mathematical research and software development that allows us to write programs that can find a particular face in a crowd and search satellite photos for airplane runways. But in general, we cannot at this time write a program that can process an arbitrary photo or video clip and tell us what it means. That means we can’t quickly search vast media databases for useful pieces of information.

The goal behind the Semantic Web effort is to build a new generation of websites whose information can be searched automatically, and where information from multiple sites can be automatically integrated. To do this with numeric and character based data is quite doable. But when it comes to multimedia, like images and sound and video and 3D models and engineering designs, well, we have a long way to go. The meaning - in other words, the semantics - of these forms of data is complex and subtle, and highly dependent upon an individual’s interpretation of that media.

So, we see that we have only just begun our journey to create the new Web.


Sep 25 2009   11:31PM GMT

Semantics and the new Web: Built out of very old ideas.



Posted by: Roger “Buzz” King
automating Web searches, inferences, information, knowledge, Semantic Web, Web development

Describing the real world in computers.

The word “semantic” has been a buzzword in computer science for decades. The youthful Artificial Intelligence world invented these things called Semantic Networks or Semantic Nets a half century ago. The idea was to come up with a crisp, formal language for representing real world things inside a computer. This took the form of a small set of constructs that would be general purpose, in that they could be applied to almost any sort of information. Further, these constructs would somehow be intuitive and natural, in that they would get to the heart of what it means to describe everything from horses to insurance claims to marriages to the contents of the Bill of Rights.

Basic, long-standing, core concepts.

What emerged has certainly stood the test of time. Big time. Opinions differ widely on just what constitutes the core constructs. Different people have used different names for these terms, and, although the idea was to specify something formal, the definitions of these constructs were generally sloppy. But here is a reasonable specification, in its most rudimentary form:

There are objects (which might also be called entities, things, or concepts). Objects have unique names.

Objects are interrelated by attributes (which might also be called relationships or properties). Attributes are directional, and they have names.

In other words, things in the world can be represented as a simple directed graph. We could say that there are objects called Chickens that have an attribute called Are. The value of this might be an object called Birds. Birds might have an attribute called Lives-In, which links Birds to the object Barnyard. There might be an object called Mr. Fried, which has an attribute called IS, which connects Mr. Fried to the object Chickens.
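The directed graph just described can be written down as a list of (object, attribute, value) triples. This is only a sketch of one possible encoding, in Python; the helper name values_of is made up for the example.

```python
# Each triple is one directed, named edge: (object, attribute, value).
TRIPLES = [
    ("Chickens", "Are", "Birds"),
    ("Birds", "Lives-In", "Barnyard"),
    ("Mr. Fried", "IS", "Chickens"),
]


def values_of(obj, attribute, triples=TRIPLES):
    """Follow the edges named `attribute` that leave `obj`."""
    return [
        value
        for source, name, value in triples
        if source == obj and name == attribute
    ]
```

Asking values_of("Chickens", "Are") follows one edge and lands on Birds; asking values_of("Birds", "Lives-In") lands on Barnyard.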

There are many popular variations of this basic idea that have emerged, and they tend to be of the following nature:

One idea is to make a sharp distinction between the notion of a subtype (or sub-kind or subset) and other attributes. So, our attribute Are might become a core concept itself, and we might name it Is-A. Chickens Is-A Birds, People Is-A Biped, etc. Other attributes like Lives-In would be considered inherently different from Is-A.

We could introduce another generalization. A general term for attributes Lives-In and other similar attributes might be Has-A. In fact, we could stop using special words for attributes in general, and just use the terms Is-A and Has-A. We would then say that Marriages Has-A Wife, as well as a Husband, as well as a Date.
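Here is a rough Python sketch of why it pays to single out Is-A: once subtype links are kept apart from Has-A links, a program can walk up the chain of kinds mechanically. The Birds Is-A Animals link is a hypothetical addition of mine so that the chain has more than one step, and the function name is invented.

```python
# Subtype links, kept separate from all other attributes.
IS_A = {
    "Chickens": "Birds",
    "People": "Biped",
    "Birds": "Animals",  # hypothetical extra link, added for illustration
}

# Everything else collapses into Has-A.
HAS_A = {
    "Marriages": ["Wife", "Husband", "Date"],
}


def kinds_of(obj, is_a=IS_A):
    """Walk the Is-A chain upward from `obj`, nearest kind first."""
    chain = []
    seen = {obj}
    while obj in is_a and is_a[obj] not in seen:
        obj = is_a[obj]
        seen.add(obj)
        chain.append(obj)
    return chain
```

With these tables, kinds_of("Chickens") climbs from Birds to Animals, while HAS_A["Marriages"] simply lists the parts of a marriage.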

These general ideas are actually old, and significantly predate computing. We have been struggling with the problem of describing real world objects (like Cows), real world concepts (like Marriages and Respects), and their interrelationships and categories since the emergence of the earliest philosophers. Aristotle distinguished between objects and their attributes, and carefully studied and described many animals and plants.

What does it all mean for the new Web?

So, what does all this mean to us, today, and what does it have to do with modern Web technology? Well, first of all, these concepts of objects and attributes have spread throughout all of computer science.

There have been some significant extensions, like distinguishing between an attribute that we might call a relationship, which interconnects complex objects or notions (like a driver owning a car) and attributes that interconnect complex objects and notions with atomic or simple things (like a car having a color or a driver having a name). Generally, these latter, simple kinds of attributes are now what we call attributes, and are considered inherently different from (and simpler than) relationships.

Another extension that has become a core concept in programming languages is something we might call an object identifier, which is a unique number or other identifier for individual objects; this allows us to carefully distinguish between two people who have the same mother, and two people who have mothers who just happen to have the same name.
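In Python, which I am using here purely for illustration, object identity is what the `is` operator tests, while `==` on the names compares values. The people below are invented for the example.

```python
class Person:
    def __init__(self, name, mother=None):
        self.name = name
        self.mother = mother


# Two different women who just happen to share a name.
first_alice = Person("Alice")
second_alice = Person("Alice")

ann = Person("Ann", mother=first_alice)
bob = Person("Bob", mother=first_alice)
cy = Person("Cy", mother=second_alice)

# Same object identity: Ann and Bob really have the same mother.
same_mother = ann.mother is bob.mother

# Same name, different identity: Ann's and Cy's mothers are only namesakes.
namesakes_only = (
    ann.mother.name == cy.mother.name and ann.mother is not cy.mother
)
```

Both same_mother and namesakes_only come out True, which is exactly the distinction an object identifier exists to make.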

Programming languages also introduced the concept of methods, or little programs that can give life to objects. You might be able to ask a marriage object to tell us the names of the husband and wife.
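A minimal sketch of such a method, again in Python; the class layout, the method name, and the sample values are assumptions made up for the example.

```python
class Marriage:
    def __init__(self, husband, wife, date):
        self.husband = husband
        self.wife = wife
        self.date = date

    def spouse_names(self):
        """The 'little program' attached to the object."""
        return (self.husband, self.wife)


marriage = Marriage(husband="Pat", wife="Lee", date="1999-06-12")
names = marriage.spouse_names()
```

The caller never looks inside the object; it just asks, and the object answers.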

But basic concepts have not changed. There seems to be something natural and fundamental about them.

Building a new world out of old concepts.

And the Web? A revolution is happening today. We are developing languages that allow Web designers to embed machine-readable specifications in Web-resident information. This will largely automate the process of searching the Web, as well as the integration of information at multiple sites. This will in turn lead to the discovery of knowledge by putting together diverse information from across the Web. We have discussed these emerging technologies in the previous postings of this blog; they are heavily and deliberately built on top of ideas that date back to the 1950s, and in fact can trace their roots to ancient Greece.


Mar 11 2009   8:24PM GMT

Web services: part of the Web 2.0 & Semantic Web picture



Posted by: Roger “Buzz” King
Multimedia, Rich Web Apps, the Semantic Web, Web 2.0, Web 3.0, Web development, rich internet apps, web services

This is the fifth in a series of blogs about the Semantic Web and Web 2.0/3.0. While the sequence of blog posts tells a continuous story, each blog should be fully informative if read out of sequence.

So far, we’ve discussed the Semantic Web, which is an attempt at automating the process of searching the web and integrating the results, and Web 2.0/3.0, which is largely oriented toward making media-intensive web applications highly responsive. (We noted in an earlier blog that Web 3.0 is an extension of Web 2.0, and we will look at this transition in a future blog.)

But there’s a third term that is often thrown into the mix: web services. What are they?

A web service is a web application that is not accessed interactively by a human, but rather by a program.

To make use of a website, we load a URL into a browser and visit the site. Once we’re there, we might - even if we don’t realize it - be operating a very sophisticated application, like Amazon. We can search their inventory - which sits inside a very large database - via Amazon’s search form.

But there’s something else you can do with Amazon. We can access their inventory via a web service. Or more precisely, we can build programs that can access their inventory by communicating with programs (called web services) that they have provided. Our programs and their programs talk to each other directly. These services can be used to do things that would usually be very time-consuming, and in fact, often intractable, if performed with a browser. Their web services also allow third party vendors to post their stock on Amazon, and in return for a fee, let Amazon sell and ship their products. Thus, independent vendors can easily make themselves an extension of Amazon, something that works so smoothly that if we don’t look carefully, we might not realize we are buying something that is not directly marketed by Amazon.

The way a web service works is by the provider of the service (such as Amazon) making the interface to the software that implements the service publicly accessible over the Internet. Such an interface is called an Application Programming Interface, or API. This way, anyone who wants to write a program that will access the service over the web knows exactly how to write their program to talk to the web service. These programs that access web services are called “client” applications.
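To make the service/client split concrete, here is a self-contained Python sketch that runs a toy web service and a client on the same machine, using only the standard library. Everything about it is hypothetical: the inventory, the /search endpoint, and the q parameter are invented for the example and are not Amazon's or anyone else's actual API.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, quote, urlparse
from urllib.request import ProxyHandler, build_opener

# Hypothetical inventory standing in for the provider's database.
INVENTORY = [
    {"title": "Learning XML", "price": 29.95},
    {"title": "XML in a Nutshell", "price": 39.95},
    {"title": "Maya Basics", "price": 44.00},
]


class InventoryService(BaseHTTPRequestHandler):
    """Hypothetical API: GET /search?q=<keyword> returns matching items as JSON."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/search":
            self.send_error(404)
            return
        keyword = parse_qs(url.query).get("q", [""])[0].lower()
        matches = [item for item in INVENTORY if keyword in item["title"].lower()]
        body = json.dumps(matches).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def search_inventory(base_url, keyword):
    """The client: a program, not a person with a browser, calls the service."""
    opener = build_opener(ProxyHandler({}))  # ignore any system proxy settings
    with opener.open(f"{base_url}/search?q={quote(keyword)}") as response:
        return json.loads(response.read().decode("utf-8"))


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), InventoryService)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return search_inventory(f"http://127.0.0.1:{server.server_port}", "xml")
    finally:
        server.shutdown()
        server.server_close()
```

The published interface here is nothing more than the URL pattern and the JSON shape. Any client that knows those two things can talk to the service, which is the whole idea of an API.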

There is a wide class of web services available on the Internet, and many of them provide APIs that allow programmers to write software that can access vast databases of such things as news and real estate information. Many web services also are available via a website, for users who want to use the service interactively. And many client applications are really just doing the same thing a browser might do when accessing a website, except that the client is likely to be a far more specialized application and it runs as a desktop application on your machine. More importantly, that client program might be able to do things that your browser cannot do.

For example, there is a web service called MusicBrainz; it provides information about music, not the music itself. It can be accessed via an API. There is also a website, MusicBrainz.org, where you can search the database interactively. The API might be accessed by a CD player application; it can communicate with MusicBrainz (without you knowing it) to download information about whatever CD you happen to be listening to on your desktop. It might enable your CD player to tell you the artist’s name and variations of that name, the release date, the catalog number, etc.
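To make the idea of a client program concrete, here is a minimal sketch in Python of what a CD player application might do behind the scenes. Everything specific in it is an assumption for illustration: the endpoint URL, the query parameters, and the shape of the JSON reply are hypothetical stand-ins, not the actual MusicBrainz API. The sketch parses a canned reply so that it runs without a network connection.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: a stand-in for a real music-information web service.
BASE_URL = "https://example.org/ws/release"


def build_request_url(album_title):
    """Build the URL a client program would request instead of filling in a search form."""
    return BASE_URL + "?" + urlencode({"query": album_title, "fmt": "json"})


def parse_release(response_text):
    """Pull out the fields a CD player might display from a JSON reply."""
    data = json.loads(response_text)
    first = data["releases"][0]
    return {
        "title": first["title"],
        "artist": first["artist"],
        "date": first["date"],
    }


# Canned reply with an assumed shape and made-up values, so no network is needed.
SAMPLE_RESPONSE = json.dumps({
    "releases": [
        {"title": "Example Album", "artist": "Example Artist", "date": "2009-01-01"}
    ]
})

if __name__ == "__main__":
    print(build_request_url("Example Album"))
    print(parse_release(SAMPLE_RESPONSE))
```

The point is that no human ever sees the request or the raw reply; the client program builds one and interprets the other on its own.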

Since part of the idea is that we don’t have to directly interact with web services by using a browser, their explosive growth has been very quiet. Many websites are powered by input they get by using web service APIs. These second-hand websites are often called “portals”, and many portals integrate information from a number of sources and give you access to information that would otherwise be very tedious to find on your own. Web services are thus a critical building block for many of the multimedia, highly interactive websites that constitute much of the Web 2.0 effort.

In fact, they underscore the difficulty in making a sharp distinction between the Semantic Web and Web 2.0/3.0. This is because both of them depend highly on automating the movement of information around the Internet. A Web 2.0 website (often called a web application because it provides fast access to complex information, in particular, sound, images, and/or video) cannot answer your search request quickly unless it has ongoing, rapid access to underlying streams of information on the web. But this capability, of providing us with information integrated from multiple websites, is actually a cornerstone of the emerging Semantic Web.

The difference is that the Semantic Web will (hopefully) someday put a tremendous amount of smarts into web services, and allow us to locate, transform, and integrate information in extremely complex ways. The Semantic Web, in this sense, can be viewed as an extremely aggressive extension of the Web 2.0/3.0 effort.

So there we have it. Web services, in their hidden way, are rapidly evolving the web into something incredibly powerful.


Mar 5 2009   5:05AM GMT

Multimedia, what is it? Why do we care?



Posted by: Roger “Buzz” King
Web 2.0, Web 3.0, Web development, Multimedia, SMIL, Text, the Semantic Web, Video, XML

This is the fourth in a series of blogs on the Semantic Web and Web 2.0/3.0.

To get us going here, just what is “multimedia”? At one level, it simply refers to applications that manipulate, store, and/or present multiple kinds of media, such as text, video, relational data, sound, animation, etc. More pragmatically, it refers to the introduction of blob and continuous forms of data into applications that traditionally manage simple data, like character strings and numbers. In its most aggressive form, multimedia refers to the sophisticated integration of traditional, blob, and continuous data into integrated data forms that convey their own semantics.

A quick note: Blob data is data that is stored in a semantics-less fashion, usually as simple binary or character data. This could be almost any sort of data, such as images, video, sound, or natural language - but the key element is that the language or system being used to manipulate it doesn’t have an appropriate, specific data type. Blob data is often large, of variable size, and usually requires a sophisticated, outside application to interpret and present it. It is the default, catch-all way to store advanced forms of data in relational database management systems.

Another quick note: Continuous data is data that has a temporal aspect or can be broken down into segments that have their own identity. The visual part of video can be broken into clips; in fact, it can be broken all the way down to individual pixel-based images. Sound can be cut into pieces. James Joyce’s Ulysses is a big piece of continuous textual data. Like blob data, we typically need an outside application to interpret it. (Even the most complex application, a human, generally has trouble doing this with Ulysses.)

Back to the Semantic Web, Web 2.0/3.0, and multimedia.

In a previous blog, we tried to define these two terms and explain why they are very different concepts. The Semantic Web is an attempt to automate the searching of the web and the integration of data collected on the web; the idea is to greatly ease the painful interactive nature of using a search engine like Google. Web 2.0/3.0 (and no, there is no sharp distinction between the two) are largely about performance, of making web applications as responsive as possible, potentially as responsive as desktop applications.

But they share one common goal: effectively managing advanced forms of media. From the Semantic Web perspective, how can we search things like sound, images, video, and natural language in a semantically-meaningful way? We use sophisticated tags and image/sound processing to do this, but it is only a small step toward a solution.

From a Web 2.0/3.0 perspective, how can we deliver up such forms of media in a way that is highly responsive? Video streaming on the web is a huge challenge, for example. Or, how can we interact with video in a responsive way, in such things as games and digital libraries?

The web, in fact, is innately multimedia: we take images, icons, links, text, video, sound, and various user controls like buttons and menus, and put them together in highly sophisticated ways. And behind these web pages, databases often sit, populating dynamic pages with information in response to user requests - this data is virtually invisible to search engines. This is what makes the Semantic Web in particular such an incredible challenge. How can we ever hope to search the web automatically?

Some modest advances have been made. One example is something called the Synchronized Multimedia Integration Language (SMIL, pronounced like the facial expression). It is an XML-based language that supports basic constructs to glue multiple forms of media together in two dimensions and in temporal sequences. Using XML elements and attributes (the basic constructs of XML), we can create multimedia presentations in a precise, unambiguous way.

SMIL presentations can be processed automatically. This is very significant.
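As a rough illustration of what “processed automatically” can mean, here is a small Python sketch that reads a simplified SMIL-style document and lists the media it glues together. The element names (layout, region, seq, par, img, video, audio, text) follow SMIL conventions, but the document is trimmed down for illustration: it omits the namespace declaration a real SMIL file would carry, and the file names are hypothetical.

```python
import xml.etree.ElementTree as ET

# A simplified SMIL-style presentation. File names are made up for illustration.
SMIL_DOC = """
<smil>
  <head>
    <layout>
      <region id="main" left="0" top="0" width="320" height="240"/>
      <region id="caption" left="0" top="240" width="320" height="40"/>
    </layout>
  </head>
  <body>
    <seq>
      <img src="title.png" region="main" dur="3s"/>
      <par>
        <video src="clip.mp4" region="main"/>
        <audio src="narration.mp3"/>
        <text src="caption.txt" region="caption"/>
      </par>
    </seq>
  </body>
</smil>
"""


def media_items(smil_text):
    """Return (kind, source, region) for every media element, in document order."""
    root = ET.fromstring(smil_text.strip())
    items = []
    for element in root.find("body").iter():
        if "src" in element.attrib:
            items.append((element.tag, element.get("src"), element.get("region")))
    return items


if __name__ == "__main__":
    for kind, source, region in media_items(SMIL_DOC):
        print(kind, source, region)
```

Here seq plays its children one after another and par plays them at the same time, so the two-dimensional layout (the regions) and the temporal sequence are both sitting in the markup where a program can find them.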

So, multimedia: it’s at the core of both the Semantic Web and Web 2.0/3.0. It is one of the basic motivations for their existence.


Feb 27 2009   3:26AM GMT

What do we mean by “Semantic” Web?



Posted by: Roger “Buzz” King
namespaces, Web 2.0, Web 3.0, Web development, the Semantic Web, language syntax and semantics, XML

This is the third in a continuing series of blogs about the Semantic Web and Web 2.0/3.0. Our focus here is on the Semantic Web.

Let’s look carefully at that word. What do we mean by “semantic”?

Even though it is very far from completely existing, the Semantic Web effort is a number of years old now. But the heavy use of this word in computing is far older, dating back to at least the late 70’s.

So, what do we mean when we use this word, in particular, with regard to the Semantic Web?

Like a human or “natural” language, a programming language has two key aspects: syntax and semantics. The syntax of a language refers to the structural rules that tell us what constitutes a legal program, just as the syntax of English tells us how to speak correctly. But syntax ignores the meaning of the program or English statement. The semantic rules of a language are what tell us the meaning.

Interestingly, a human statement can be syntactically correct, while its semantics might be ambiguous. If “Time flies”, does it mean that time goes by quickly, or that your buddy, Freddy Time, likes to fly his plane on weekends? But in general, a computer program must have only one set of semantics; otherwise, the computer doesn’t know what to do with it.

There is a broader - and far more ill-defined - use of the word “semantics” in computing. It’s used heavily, especially by researchers writing academic papers, as a sort of bragging term. We like to claim that our way of representing data captures more of the “semantics” of the data. In other words, the more expressive our way of representing data, the more semantics that can be deduced from its structure, and this is clearly a good thing.

Very important: when we look at the structure of the data, it includes all the terms used to describe the data. If I have a relational table called “Insurance Claims”, with a character attribute called “Subscriber Name”, and an integer attribute called “Amount Charged”, can a human with a modest knowledge of insurance deduce what it means?

Yes, in fact.

In the computing world, we are constantly creating new and more powerful ways of representing information in computers. Java and C# and C++ use object structures to represent data. MySQL and Oracle and Microsoft SQL Server use relational schemas to represent data; these consist of “relations” (also known as “tables”), along with “attributes” (also known as “columns”), along with other properties, like “primary keys”. With XML, we use things called “elements” and “attributes”, and other constructs, to model data.

It’s not really accurate for me to say “more powerful”; really, we just mean different. So, more precisely, our claim is that our way of representing data, given the sorts of data we are manipulating, makes it easier for us to deduce its meaning from its structure, i.e., its semantics from its syntax. XML documents are inherently very different from relational tables; they are used to model very different stuff. Neither is really more powerful than the other.

Note that we do not include the data itself when we talk about the ability of the syntax to imply the semantics of the data. The rows in a relational table are irrelevant when we are judging the power of the relational model to represent data. And often, we don’t include whatever code or logic is used to manipulate the data. When I described the relational table above, I didn’t say what SQL queries are used to manipulate the Insurance Claims table. But certainly, I could have, and it would have made perfect sense to consider this part of its structure. In fact, we include the methods of an object-oriented class in its structural definition, and of course, the syntax of Java specifies how to write legal methods. And so, the methods of a Java class are part of what we use to deduce the semantics of the data represented by that class.
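The Insurance Claims example can be sketched in a few lines of Python using the built-in sqlite3 module. This is only an illustration under my own assumptions (SQLite as the database, TEXT standing in for the character attribute); the point is that everything a reader uses to guess at the meaning comes from the schema, while the table itself holds no rows at all.

```python
import sqlite3

# An in-memory database holding only the schema from the example; no rows are inserted.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Insurance Claims" ('
    '"Subscriber Name" TEXT, '
    '"Amount Charged" INTEGER)'
)


def describe_table(connection, table_name):
    """Return (column name, declared type) pairs: the structure, not the data."""
    cursor = connection.execute(f'PRAGMA table_info("{table_name}")')
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in cursor]


if __name__ == "__main__":
    print(describe_table(conn, "Insurance Claims"))
```

The table name, the attribute names, and the declared types are all we have to go on, and for a human with a modest knowledge of insurance, that is enough.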

So here’s one way to look at the Semantic Web: we try to use ways of structuring data that are so powerful, so rich in the way they can be used to imply the semantics of the data, that this interpretation can be done largely automatically. This would make the web far more powerful.

Let’s step back for a moment and consider the terms that are used to specify the name of a relational table (“Insurance Claims”), the names of the attributes (“Subscriber Name” and “Amount Charged”), and the names of the domains of those attributes (characters and integers). In the previous blog in this series, we looked at namespaces. We could consider these terms from our relational schema to form a namespace.

Importantly, namespaces are a major aspect of the Semantic Web, and are aimed at giving us web-wide standards for using terms as a way of describing part of the structure of data. In my relational database, I might use terms that tend to be common across insurance companies, but are not necessarily universal. And sometimes, the terms might have conflicting meanings from one insurance company to another.

But on the Semantic Web, we would specify a namespace and ask that all insurance companies use these same terms with the same meanings.

What about the rest of the definition of data on the Semantic Web? How do we put terms together in a way that is analogous to putting terms together to form a relational schema? One large research community thinks we should all use “triples”. Here’s one: <Tolstoy> <author> <War and Peace>. We’ve taken three terms and put them into a triple.

Here’s the exciting part: The left node could consist of a URL that points to a website dedicated to Tolstoy. The middle part could consist of a URL that contains a set of agreed-upon terms for describing books, in other words, a namespace. The right part could consist of a URL that has the text of War and Peace on it.

In other words, we can use namespaces, combined with triples, to glue together data on the world wide web. Then, we could imagine that a program could go out on this new “Semantic Web” and find the authors of a large set of books. One critical subtlety is that we would be guaranteed that “author” means the same thing in each case, because it has been taken from a shared namespace that is used by any site that represents books and their authors on the web.
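Here is a toy version of that idea in Python. It is only a sketch: every URL is a hypothetical example.org placeholder, the triples are plain tuples rather than a real Semantic Web format, and the ordering follows the Tolstoy example above (person on the left, book on the right).

```python
# Hypothetical shared namespace for describing books.
BOOKS_NS = "http://example.org/terms/books#"
AUTHOR = BOOKS_NS + "author"
EDITOR = BOOKS_NS + "editor"

# Each triple is (left node, term from the namespace, right node).
TRIPLES = [
    ("http://example.org/people/tolstoy", AUTHOR, "http://example.org/texts/war-and-peace"),
    ("http://example.org/people/joyce", AUTHOR, "http://example.org/texts/ulysses"),
    ("http://example.org/people/some-editor", EDITOR, "http://example.org/texts/ulysses"),
]


def authors_of(triples, book_urls):
    """Map each requested book to its author, using only the shared 'author' term."""
    wanted = set(book_urls)
    return {
        book: person
        for person, term, book in triples
        if term == AUTHOR and book in wanted
    }


if __name__ == "__main__":
    print(authors_of(TRIPLES, ["http://example.org/texts/ulysses"]))
```

Because the program compares against the full namespace term rather than the bare word “author”, a triple that uses a different term, such as the editor one, is never mistaken for an authorship claim.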

This is a key aspect of why the semantic web could be so powerful: shared namespaces guarantee common usage of terms, and triples can be used to glue information together into pieces that could be located automatically, i.e., without a human having to interactively verify and interpret every piece of data returned.

Wow.


Feb 17 2009   11:22PM GMT

The difference between Web 2 and the Semantic Web



Posted by: Roger “Buzz” King
Web 3.0, Web 2.0, Rich Web Apps, Web development frameworks, Web development, Ajax, XML Schema, XQuery

The purpose of this blog is to discuss cutting edge technology that relates to Web 2.0 and the Semantic Web. What do these terms mean?

Let’s start with the definition of a third term. A Web Application is a website that provides some sort of substantive functionality other than simply filtering and presenting information. Evernote is a fantastic web app that stores your notes on a server, and allows you to create, group, and annotate your notes. Some folks say that a web app makes it clear that there is an application at the other end of your browser, and not just a bunch of static data. This is admittedly a pretty soft definition, but it’s reasonable. Another way to look at it is that a web app provides what would otherwise be a desktop application, but makes it accessible from a server so that users do not have to install and maintain an application.

So what’s Web 2.0? It refers to web development frameworks and tools that can be used to create highly responsive websites and web applications. AJAX does this, and the canonical example people give is Google Maps. AJAX allows data to be retrieved asynchronously while a prior page is being displayed and manipulated by a user, and minimizes the amount of a web page that must be replaced with the next refresh.
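Real AJAX code runs as JavaScript inside the browser, so the following Python sketch is only an analogy, with a simulated delay standing in for the server and a dictionary standing in for the page. It shows the two ideas in the paragraph above: the request proceeds in the background while the current page stays usable, and only one region of the page is replaced when the data arrives.

```python
import asyncio


async def fetch_fragment(region_name):
    """Stand-in for a background request to the server (hypothetical)."""
    await asyncio.sleep(0.01)  # simulated network delay
    return f"fresh content for {region_name}"


async def refresh_region(page, region_name):
    """Replace one region of the page, leaving the rest untouched."""
    page[region_name] = await fetch_fragment(region_name)


async def demo():
    page = {"header": "site header", "map": "old map tiles", "sidebar": "links"}
    seen = []
    task = asyncio.create_task(refresh_region(page, "map"))
    # The request is now in flight, but the old page is still there to look at.
    seen.append(page["map"])
    await task
    # Only the map region has changed; header and sidebar were never reloaded.
    seen.append(page["map"])
    return page, seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```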

A somewhat newer approach is embodied in Adobe Flex and Microsoft Silverlight technologies; in these cases, a web app is sped up by running more of the application’s logic inside a browser plugin (Adobe Flash or Microsoft Silverlight), rather than making the client machine (which runs the user’s browser) continuously talk to the web server. The overall challenge is to make web pages highly dynamic (meaning the data comes from a database and is not hard-coded in the web page) while giving the user response times that approach those of a desktop application running on a dedicated or near-dedicated machine. While this is intractable at this point, it’s a good thing to hold up as a goal.

The term Semantic Web does not narrowly refer to technology that speeds up response rates. Rather, it refers to a still emerging body of software tools whose overall goal is to automate the collection and integration of information gleaned from websites. The idea is to free the Google/Yahoo user from painfully interactive, highly repetitive keyword searches where we continue to hone our queries until we seem to be finding the right stuff.

Semantic Web technology includes namespaces, which try to put more smarts in websites by having data tagged with widely shared, standardized sets of tags. And things like XML Schema and XQuery can be employed to leverage namespace technology to support high-volume, set-oriented queries of data stored on web servers. These are very similar to the sorts of queries that can today be coded in SQL and run on single database servers running database management systems like Oracle, SQL Server, DB2, MySQL, and PostgreSQL. Essentially, XML-based technology takes the ability of a relational database schema to help us interpret data, and extends it to the entire web.
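To give a feel for a set-oriented query over namespace-tagged data, here is a small Python sketch. It is not XQuery: it uses the limited path support in Python’s built-in ElementTree module, and the namespace URL, element names, and claim values are all made up for illustration. It does roughly what a SQL query selecting the claims with an amount over 1000 would do.

```python
import xml.etree.ElementTree as ET

INS = "http://example.org/ns/insurance"  # hypothetical shared namespace

# Made-up claims, tagged with terms from the namespace above.
DOC = f"""
<ins:claims xmlns:ins="{INS}">
  <ins:claim>
    <ins:subscriberName>A. Example</ins:subscriberName>
    <ins:amountCharged>1200</ins:amountCharged>
  </ins:claim>
  <ins:claim>
    <ins:subscriberName>B. Example</ins:subscriberName>
    <ins:amountCharged>300</ins:amountCharged>
  </ins:claim>
  <ins:claim>
    <ins:subscriberName>C. Example</ins:subscriberName>
    <ins:amountCharged>5000</ins:amountCharged>
  </ins:claim>
</ins:claims>
"""


def claims_over(xml_text, threshold):
    """Return (subscriber, amount) for every claim charged more than the threshold."""
    root = ET.fromstring(xml_text.strip())
    ns = {"ins": INS}
    results = []
    for claim in root.findall("ins:claim", ns):
        amount = int(claim.findtext("ins:amountCharged", namespaces=ns))
        if amount > threshold:
            name = claim.findtext("ins:subscriberName", namespaces=ns)
            results.append((name, amount))
    return results


if __name__ == "__main__":
    print(claims_over(DOC, 1000))
```

The same function would work on a document from any site that tags its claims with the same namespace, which is the sense in which the schema’s interpretive help extends beyond a single database server.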

We will look at XQuery and XML Schema in future entries of this blog.

By the way, some folks are already talking about Web 3.0, which in many ways draws from both Web 2.0 and Semantic Web technology. We’ll look at this in a future blog, but a key focus is on making web apps highly multimedia.