Posted by: SJC
Database Design, database normalization, deduplication, Development
Data deduplication essentially refers to the elimination of redundant data (from Wikipedia). As the term is commonly used, deduplication refers to duplicated data on servers, and perhaps on shares throughout the domain. I suspect anyone who has been around IT for long knows that, as time goes on, it is common to find multiple versions (as well as exact duplicates) of files throughout an enterprise of any size. This adds considerably to the time required for backups, can slow down the network, and confuses users, none of which is desirable.
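To make the file-level idea concrete, here is a minimal sketch of how exact duplicates could be spotted by hashing file contents. This is only an illustration, not how any particular deduplication product works; the function name `find_duplicates` is my own, and it assumes the files are small enough to read into memory in one go.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under `root` by the SHA-256 hash of their contents.

    Hypothetical helper for illustration: returns only the groups that
    contain more than one file, i.e. the exact duplicates.
    """
    by_hash = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Assumption: files are small enough to read in a single call.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Calling `find_duplicates` on a share's root folder would return one entry per set of byte-identical files. Note that it only catches exact copies; near-identical "versions" of a document would hash differently and need some other approach.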
I believe database normalization runs parallel to deduplication, in that one goal of normalization is the elimination of redundant data. Many of the same benefits can be realized when a database is normalized, such as faster transmission (less data) and less storage space required. Normalization has become very much a part of my most recent project: upgrading a 20-year-old database application to a modern relational database. The project is no trivial task, but I'm having fun with it!
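For anyone wondering what "eliminating redundant data" looks like in a database, here is a small sketch using Python's built-in `sqlite3` module. The tables and columns (`orders_flat`, `customers`, `orders`) and the sample rows are invented for illustration and have nothing to do with my actual project. It assumes a customer is uniquely identified by name and city, which a real design would need to verify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Unnormalized: customer details are repeated on every order row.
    CREATE TABLE orders_flat (
        order_id      INTEGER PRIMARY KEY,
        customer_name TEXT,
        customer_city TEXT,
        item          TEXT
    );
    INSERT INTO orders_flat VALUES
        (1, 'Acme',  'Denver', 'Widget'),
        (2, 'Acme',  'Denver', 'Gadget'),
        (3, 'Birch', 'Austin', 'Widget');

    -- Normalized: each customer is stored once and referenced by key.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        city        TEXT,
        UNIQUE (name, city)
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        item        TEXT
    );

    INSERT INTO customers (name, city)
        SELECT DISTINCT customer_name, customer_city FROM orders_flat;
    INSERT INTO orders (order_id, customer_id, item)
        SELECT f.order_id, c.customer_id, f.item
        FROM orders_flat AS f
        JOIN customers AS c
          ON c.name = f.customer_name AND c.city = f.customer_city;
""")

# Each customer now appears once, however many orders they have placed.
customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

# A join rebuilds the original flat rows, so no information was lost.
flat_rows = conn.execute(
    "SELECT order_id, customer_name, customer_city, item "
    "FROM orders_flat ORDER BY order_id"
).fetchall()
rebuilt_rows = conn.execute(
    "SELECT o.order_id, c.name, c.city, o.item "
    "FROM orders AS o JOIN customers AS c USING (customer_id) "
    "ORDER BY o.order_id"
).fetchall()
```

The point of the sketch is the parallel with deduplication: the repeated customer details collapse to one row per customer, while `rebuilt_rows` shows the original view can still be recovered with a join.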