We are re-designing a data warehouse and need to decide whether we create a solution that is snapshot based (i.e. a full copy of every customer and their attributes each month) or one that is more transaction based (i.e. a full copy of all customers, then every time their attributes change, we put an end date on the old data and add the new data).
Either way, we will build detailed snapshot based data marts each month that we?ll keep 2-3 years of history. Also, we expect to use the warehouse layer for data mining, time series analysis, etc where the data marts don?t have all the information.
This transaction based model will require about 1 TB storage compared to about 4.5 TB ? this represents a massive saving.
Has anyone out there gone with a transaction based warehouse from which they build snapshot data marts?
Are there any serious limitations that you wouldn?t already get with a snapshot?
Any information you can provide on the pros & cons of either method would be most appreciated.
Free Guide: Managing storage for virtual environments
Complete a brief survey to get a complimentary 70-page whitepaper featuring the best methods and solutions for your virtual environment, as well as hypervisor-specific management advice from TechTarget experts. Don’t miss out on this exclusive content!