Posted by: Roger King
automating Web searches, inferences, information, knowledge, Semantic Web, Web development
Describing the real world in computers.
The word “semantic” has been a buzzword in computer science for decades. The youthful Artificial Intelligence world invented these things called Semantic Networks or Semantic Nets a half century ago. The idea was to come up with a crisp, formal language for representing real world things inside a computer. This took the form of a small set of constructs that would be general purpose, in that they could be applied to almost any sort of information. Further, these constructs would somehow be intuitive and natural, in that they would get to the heart of what it means to describe everything from horses to insurance claims to marriages to the contents of the Bill of Rights.
Basic, long-standing, core concepts.
What emerged has certainly stood the test of time. Big time. Opinions differ widely on just what constitutes the core constructs. Different people have used different names for these terms, and, although the idea was to specify something formal, the definitions of these constructs were generally sloppy. But here is a reasonable specification, in its most rudimentary form:
There are objects (which might also be called entities, things, or concepts). Objects have unique names.
Objects are interrelated by attributes (which might also be called relationships or properties). Attributes are directional, and they have names.
In other words, things in the world can be represented as a simple directed graph. We could say that there are objects called Chickens that have an attribute called Are. The value of this might be an object called Birds. Birds might have an attribute called Lives-In, which links Birds to the object Barnyard. There might be an object called Mr. Fried, which has an attribute called IS, which connects Mr. Fried to the object Chickens.
There are many popular various of this basic idea that have emerged, and they tend to be of the following nature:
One idea is to make a sharp distinction between the notion of a subtype (or sub-kind or subset) and other attributes. So, our attribute Are might become a core concept itself, and we might name it Is-A. Chickens IS-A Birds, People IS-A Biped, etc. Other attributes like Lives-In would be considered inherently different from Is-A.
We could introduce another generalization. A general term for attributes Lives-In and other similar attributes might be Has-A. In fact, we could stop using special words for attributes in general, and just use the terms Is-A and Has-A. We would then say that Marriages Has-A Wife, as well as a Husband, as well as a Date.
These general ideas are actually old, and actually significantly predate computing. We have been struggling with the problem of describing real world objects (like Cows), real world concepts (like Marriages and Respects), and their interrelationships and categories since the emergence of the earliest philosophers. Aristotle distinguished between objects and their attributes, and carefully studied and described many animals and plants.
What does it all mean for the new Web?
So, what does all this mean to us, today, and what does it have to do with modern Web technology? Well, first of all, these concepts of objects and attributes have spread throughout all of computer science.
There have been some significant extensions, like distinguishing between an attribute that we might call a relationship, which interconnects complex objects or notions (like a driver owning a car) and attributes that interconnect complex objects and notions with atomic or simple things (like a car having a color or a driver having a name). Generally, these latter, simple kinds of attributes are now what we call attributes, and are considered inherently different from (and simpler than) relationships.
Another extension that has become a core concept in programming languages is something we might call an object identifier, which is a unique number or other identifier for individual objects; this allows us to carefully distinguish between two people who have the same mother, and two people who have mothers who just happen to have the same name.
Programing languages also introduced the concept of methods, or little programs that can give life to objects. You might be able to tell a marriage object to tell us the names of the husband and wife.
But basic concepts have not changed. There seems to be something natural and fundamental about them.
Building a new world out of old concepts.
And the Web? A revolution is happening today. We are developing languages that allow Web designers to embed machine-readable specifications in Web-resident information. This will largely automate the process of searching the Web, as well as the integration of information at multiple sites. This will in turn lead to the discovery of knowledge by putting together diverse information from across the Web. We have discussed these emerging technologies in the previous postings of this blog; they are heavily and deliberately built on top of ideas that date back to the 1950’s, and in fact can trace their roots to ancient Greece.