3D Modeling archives - Buzz’s Blog: On Web 3.0 and the Semantic Web

Buzz’s Blog: On Web 3.0 and the Semantic Web:

3D modeling

Oct 26 2009   8:58PM GMT

The imposing heterogeneity of media applications



Posted by: Roger “Buzz” King
3D animation, 3D modeling, advanced Web apps, automating Web searches, continuous data, media applications, Multimedia, the Metadata Object Description Schema, Video, video containers, Web 3.0, Web development, Web development frameworks, XML Schema

This blog is dedicated to the discussion of emerging web technologies. Today, we look at a the rapidly growing world of media applications, and their impact on the Semantic Web.

The problem of searching for media assets.

We’ve already looked at advanced media, in particular video, audio, and animation data, in previous blog postings. In particular, we’ve looked at the subtle and complex nature of media asset semantics. We’ve seen that interpreting a piece of video, for example, is far, far more difficult than interpreting an integer or character field. Since the goal of the Semantic Web effort is to make the searching of the web highly automated, advanced media is becoming a huge and critical research and development focus for the builders of next-generation web development applications.

Just how do we provide an environment where media assets can be searched in a mostly automatic fashion, so that a human does not have to painfully paw through hundreds or thousands (or millions) of video chunks to find the right one? We’ve looked at emerging technologies for marking up advanced media information, and for making it usable in a variety of web applications. We’ve also looked at the dramatic challenge presented by mega apps to would-be users; the interfaces to these applications are truly massive and cannot present to the user the way in which they are meant to be used.

The problem of proprietary formats.

One specific, and very difficult problem, is the massive heterogeneity, not just of media formats, compression technologies, and container technologies, but of the applications themselves. If we are going to automate the searching of complex modeling, video, audio, and other media assets, we’re going to have to address a key question: since many media apps make use of their own proprietary data formats, how are we going to provide automated ways of searching media assets that are stored in these formats?

The problem of highly imperfect generic formats.

There are indeed many existing, as well as soon-to-emerge, standards for importing and exporting data between powerful media applications, but transformations in and out of these formats are often “lossy”, in that information is lost or changed. In fact, locating and downloading assets that are in supposedly-generic form is often very frustrating, because these assets end up not performing well. They can be difficult to edit and reuse. 3D animation models regularly blow up when animators try to import them into animation applications and the manipulate them. A hawk may look like a hawk until you try to render it with its wings flapping, and suddenly it’s a blob of geometric garbage.

One possible direction.

So, what do we do about the fact that many media assets must be manipulated by the original applications that created them? How can we facilitate reuse? It’s extremely unrealistic to expect users to master perhaps dozens of video or audio or animation applications. Filtering assets according to their file extensions is a good idea, and it is a well established practice.

But what we really need is a globally-known site that either literally or conceptually centralizes the massive network of import/export relationships, along with information about the relative success of these mappings. Are they ever lossy? If so, can they be fixed? What series of applications might we want an asset to be imported/exported through so that in the end it is in a usable format, given the applications that the user owns and has mastered?

There is much to be done. Right now, searching for and reusing media assets is a painstaking, trial-and-error-prone process.

Oct 11 2009   11:07PM GMT

Making information management scale: leveraging metadata on the new Web



Posted by: Roger “Buzz” King
3D modeling, automating Web searches, databases, DB2, information, Multimedia, MySQL, Oracle, PostgreSQL, RDF, Semantic Web, Video, Web 3.0, Web development frameworks, Web3.0

Previous postings of this blog.

This blog is dedicated to advanced Web development tools and concepts. Previous blog postings have focused on the emerging Semantic Web, which promises to make the Web radically easier to search and to greatly enhance the value of the vast sea of currently-disconnected information spread across the Web. We have also looked at Web 3.0 efforts, which promise to make multimedia websites highly usable and capable of conveying far more information than the current generation of websites. Previous postings describe breadth and depth of cutting edge Web technology.

Metadata: making that ratio small.

Here’s something that’s very important: Much of the ongoing research and development that is loosely categorized as Semantic Web and Web 3.0 efforts is focused on a specific technical goal, one that has been at the core of information management technology since the mainframe era that was epitomized by the IBM 360 series. That goal is to leverage metadata as much as possible.

It’s our best weapon against the truly staggering amount of information on the Web. This includes traditional text-based and numeric data, as well as books, medical advice, photographs, entertainment and training videos, music and recorded books, investment information, educational materials, scientific materials, e-government information, etc., etc. How can we possibly organize information and then search it in a way that scales? The Web is far from a closed world. In traditional data processing environments like banking, insurance, and credit card processing, we could get our arms around all of the data, as vast as it may have seemed. But the world of information today is an open world, effectively infinite in size.

Very informally, if you look at the size of the metadata divided by the size of the data itself, the smaller that fraction the better. In traditional relational databases (built with database management systems, such as Oracle, MS SQL Server, MySQL, PostgreSQL, or DB2), the extreme focus on minimizing this ratio has enabled the fast processing of extremely large volumes of data. The tradeoff is that the table definitions (or the “schema”), which form the heart of the metadata are very, very simplistic.

The old days: relational database schemas.

An insurance claim may be defined as a table with such columns as Subscriber_Name, Medical_Provider, etc., and thus, may consist of little or no more than a series of simple character and numeric fields. But if we need to process fifty thousand of them tonight, we must be able to bring many such table rows into memory at once, and quickly move through them. The database world was an extension of the paper world: a row in an insurance claim table was effectively an electronic successor to the traditional claim form.

Today: a far more challenging problem.

But on the new Web, information can be far more complex in nature, making the metadata to data ratio far larger. We’ve looked at some of the emerging technology and technical trends for embedding metadata in advanced forms of data (and for processing that metadata); this data includes books, images, video, modeling and animation, and sound. This new generation of information formats make up our personal health records and medical records images, industrial training materials, university “distance” courses, and the like. Each instance of these tends to be far more unique than individual insurance claim forms. And, it takes a lot of metadata to properly convey their “meaning”.

The challenge.

What we’re struggling with right now is to succinctly specify the meaning of modern media assets and to automate searching based on this metadata. This is our only hope for leveraging that ratio of metadata size divided by data size.


Oct 3 2009   9:12PM GMT

Multimedia: The Problem of Subtle Semantics



Posted by: Roger “Buzz” King
3D animation, 3D modeling, advanced Web apps, automating Web searches, blob data, continuous data, databases, information, Multimedia, rich internet apps, Semantic Web, smart search engines, tagging, Text, Web 2.0, Web 3.0, web applications, Web development, Web development frameworks, XML

The challenge of the Semantic Web.

We’ve looked at the emerging Semantic Web technology in the previous postings of this blog. The idea is to have a far, far smarter Web, one where the process of finding and interpreting and making use of far flung information can be largely automated. This is in sharp contrast with today’s Web, where these things have to be done in a painful, extremely time-consuming fashion.

So that is the key challenge. It has to do with searching the kinds of information that are important to us in our daily lives. This information, as it turns out, is very difficult to process automatically. Why is this?

The complexity of modern multimedia.

I teach a very basic 3D animation class to mostly computer science students. We use Maya, arguably the most popular 3D animation application, one that is used in the making of many animated features. The interesting thing about animation is that it is truly multimedia. It can give us a lot of insight into what we need the new Web to do for us.

That’s because the number and diversity of applications that are used for drawing, documenting, modeling, animating, motion capture, texturing, video rendering, video editing, video conversion and compression, sound editing, in even small projects, can be very impressive. Correspondingly, the wide variety and complexity of media formats involved in an animation project can be overwhelming.

What happens in an animation project? The workflow might begin with vector storyboard drawings to break the story down into scenes. In a typical animation project, 3D models in a variety of proprietary formats are used. Models must be transformed as they are exported from one application and imported into the next. Multiple video renders of animated models are made, and they must be edited together, along with multiple sound files. Multiple video and audio formats might be used. 2D images are used for textures; photographs of butterfly wings can be used to make an animated butterfly very realistic, and a checkerboard image made with Photoshop can be used to make a Linoleum floor. And along the way, a variety of note taking, screen capture, and conferencing software might be used to facilitate group communication.

There is also a heavy focus on reuse in an animation project. Building every model, editing every texture, creating every environment and background, recording every sound from scratch is frequently intractable. If existing assets cannot be tailored and reused, the project would be far too expensive and time consuming, and would demand too wide a variety of professionals to always be available. This raises the multimedia stakes, as assets of widely differing forms must be constantly reconfigured and used in concert in new ways.

But what’s the real problem? We aren’t all trying to produce complex animated videos. But very interestingly, in our everyday lives we essentially face the animator’s challenge when we try to find and use information on the Web. That’s because we’re often looking for things whose meaning, whose interpretation, demands focused human thought. We are looking not for business data, but for pieces of media, and the problem is that today, most of our searching has to be based on tags or brief textual descriptions that are associated with pieces of media, and not on the true meaning of the media itself.

The needs of the business world are not our needs.

It’s the subjective nature of media assets - this is what is at the heart of the problem facing us. Existing technology for searching the web is based on keywords and very short pieces of text.

There is other technology, though, under active development, stuff that serves as the information storage backbone of most commercial websites. It’s the technology that has for decades been used in-house (not on the Web) by businesses when they process large databases. But this stuff was designed to handle traditional business data forms, like integers, character strings, real numbers, dates, timestamps, and full text.

There is more, though. All of the major database management systems, along with tools for building and searching advanced websites are being retrofitted (or in some cases, built from the ground up) to manage more than keywords and text, more than standard business data.

But up to now, the focus has not been on supporting the kinds of information you and I are most interested in. The focus has been on extending database and Web technology to support xml documents, as well as more complex data objects, like those inside a Java program, as well as other forms of data found inside programs. This includes arrays and lists and short pieces of textual data, like the names of diseases.

In other words, we’ve been busy extending our support of the business world, so they can store complex business data in databases and make that information processable over the Web. You and I have largely been left out.

Finally, we are attacking our needs.

But there now many ongoing efforts to extend database and Web technology to make it useful to us. The new focus is on supporting blob and continuous media like images, video, and audio. This is extremely hard to do.

Why? Because the strongest means by which we deduce the meeting of business data is by looking at its internal structure and the terms that are used to describe that structure. A relational table named Prescriptions, with a character attributes Patient Name, Doctor’s Name, and Medication, and with a numeric attribute Dosage, is pretty easy to interpret.

But what do we do with a photograph, which is just a grid of pixels with no internal structure? Or a long series of images, along with a sound track, put together to form a piece of video?

The U.S. military has been pumping money into image processing for several decades, and so all is not lost. There is a vast body of mathematical research and software development that allows us to write programs that can find a particular face in a crowd and search satellite photos for airplane runways. But in general, we cannot at this time write a program that can process an arbitrary photo or video clip and tell us what it means. That means we can’t quickly search vast media database for useful pieces of information.

The goal behind the Semantic Web effort is to build a new generation of websites whose information can be searched automatically, and where information from multiple sites can be automatically integrated. To do this with numeric and character based data is quite doable. But when it comes to multimedia, like images and sound and video and 3D models and engineering designs, well, we have a long way to go. The meaning - in other words, the semantics - of these forms of data are complex and subtle, and highly dependent upon an individual’s interpretation of that media.

So, we see that we have only just begun our journey to create the new Web.


Apr 13 2009   3:27AM GMT

Mega Media Apps: A Huge Challenge for Web 3.0



Posted by: Roger “Buzz” King
3D animation, Web 2.0, Web 3.0, Maya, Video, codecs, video containers, continuous data, blob data, web applications, media applications, 3D modeling

What Are Web 2.0 and Web 3.0 Apps?

In our continuing series on Web 2.0/3.0 and Semantic Web technology, we’ve discussed one particularly impressive Web 2.0 app: Evernote. The challenge is to get the best of both worlds: the interactive performance of a desktop application, and the use-it-from-anywhere convenience of the Web. Many Web applications - such as Evernote - also ensure offline usability by providing both a desktop and webpage interface, and maintaining a local version of the database, which is periodically synched with the web-resident database.

But, as cleverly engineered as it is, and as useful as it is, Evernote is still a very simple application. What about big applications? What challenges face the developers of Web 3.0 applications, ones that will manipulate large databases of continuous data, and extra-large instances of blob data? (Video and sound are continuous; an image is blob data.)

Let’s consider one of the biggest media apps out there: Maya, the high-end 3D application that is widely used to make full length animated movies. (See http://autodesk.com for Maya.)

What’s the big problem? If an application like Maya was reengineered as a Web app along the lines of Evernote, would it be usable? Might it be intractable to be continuously moving complex animation data between the server and your client machine?

3D Geometry: Just How Big Is It?

Well, the problem is not the complex geometric models that an application like Maya must store and manipulate. 3D animation applications like Maya tend to support multiple ways of creating 3D shapes, and they do indeed tend to be very data-intensive. The first image at the bottom of this page shows a Maya screen with two spheres, one built with straight line geometry and one built with curved line geometry.

As it turns out, to make the straight line model smooth, you would need to use many more lines and vertices than I have in the the model in the image. But if you think about it, the straight line model uses the geodesic dome approach; it builds a 3D sphere out of many 2D polygons - which are flat. The more polygons, the smoother the model. In the other model, we use curved lines, and so the model looks much smoother, even with not that much detail. But the mathematics are complex.

You can image that a dense scene, with a very large number of detailed, 3D models of these sorts would contain a lot of data. But no, that’s not the problem. These models can be uploaded and download very quickly. They aren’t as big as you might image - because they are not continuous data. They are blobs, either binary or of code text, and are reasonably manageable.

The Killer Problem: Video.

The problem? It’s what Maya creates at the end of the design process, when Maya renders a scene so we can watch it. It renders video. And video, whether you are looking at video shot with your home camera, or at video rendered by Maya, or video I create when I capture desktop videos on how to use Maya and post it for my animation students, well, it’s big. Really big.

Video is the killer. Video makes a lot of mega apps, and even very simple apps that happen to create video, not scale. We could manage a modest number of modest-sized video segments via a web interface, but not big chunks of video. To make videos even worse, we usually have to add a sound track.

So, the lesson is that many or most applications that create and/or edit video in any form face this challenge.

This is why we use video compression. First, you need a container, which is a way of bundling the huge series of still images that make up the video, with the sound, as so that we can move it around as a single object. (Keep in mind that often consists of at least 25 frames, or still images, per second - and that makes for big pieces of continuous data.) Popular containers for small scale projects (such as animations that will be marketed via CDs) are .mov and .avi. The first is the Apple Quicktime standard, and the second is due to Microsoft.

Once you have a container, you need a codec, which is a way of compressing and decompression video, so that it isn’t so big when you move in over the Internet or store it on a small storage device. Codec actually stands for “code” and “decode”. It cannot be overstated how powerful a codec can be; I routinely turn gigabyte videos submitted by my students into less-than-100 megabyte videos. They can be uploaded to a website and then played, and at least in a small box on a web page, they look great.

But if you want quality, if you don’t want to lose detail, and if in particular, if you are going to display a video on a large display (or at the movie theatre), you often cannot compress it enough.

That’s it. That’s the problem, and it’s one of the biggest challenges facing the makers of Web 3.0 apps, which are supposed to fluidly manipulate video segments.

A Far Bigger, Far More Universal Problem.

But perhaps the old video challenge, the one that is constantly shoved in the face of next-generation web app developers, is a distraction, something that draws us away from the real problem, the one that kills many media apps, even when they are totally desktop-based. What is it? Take a look at the animation designer’s interface to Maya, in the second image at the bottom of this page.

The problem is the size and complexity of these apps. There are made up of multiple complex windows. They have menus, palettes, and lots of little boxes that contain detailed information. Keep in mind that you only see one of the Maya windows in the image below, at the bottom of the page, and it is already too dense for a single screen, even a large one. Looking more closely at the window in this image, note that there are several places on it that contain drop down menus. Many of these menu items lead to other drop down menus. Even the main menu at the top is changed frequently during the process of creating an animation project. The designer’s GUI as a whole changes during the process of using the app.

It is very hard to fathom the incredible complexity of an interface like Maya’s until you use it. Professional video editing applications are typically simpler, but are still very complex, especially if the application supports special effects and the insertion of text. Even applications intended for the average Joe, like Photoshop Elements, are often horrifically complex.

The Bottom Line.

The problem that faces developers of all sorts of next-generation apps that must manipulate animation or sound or video or images, or that format complex documents for publication (like Adobe InDesign), or support the development of complex web pages (like Adobe Dreamweaver), is this: it is near-intractable or perhaps completely impossible to build an interface that explains to the user the process of using the application. Little wizards or chunks of documentation that contain “recipe” steps, don’t come within a thousand light-years of conveying how to use that app as a whole.

That’s it. True Web 3.0 applications would convey not just a vast, deeply embedded toolset, but the way the tools should be used. That’s the big challenge.

By the way, if you want to see a handful of videos made by my introductory animation students, go to my website at http://buzzking.squarespace.com and look at the right column, near the bottom of the page.