Data Warehouse – ‘thin layer’ or snapshots?

Data Mining
Data warehousing applications
We are re-designing a data warehouse and need to decide whether we create a solution that is snapshot based (i.e. a full copy of every customer and their attributes each month) or one that is more transaction based (i.e. a full copy of all customers, then every time their attributes change, we put an end date on the old data and add the new data). Either way, we will build detailed snapshot based data marts each month that we?ll keep 2-3 years of history. Also, we expect to use the warehouse layer for data mining, time series analysis, etc where the data marts don?t have all the information. This transaction based model will require about 1 TB storage compared to about 4.5 TB ? this represents a massive saving. Has anyone out there gone with a transaction based warehouse from which they build snapshot data marts? Are there any serious limitations that you wouldn?t already get with a snapshot? Any information you can provide on the pros & cons of either method would be most appreciated.

Answer Wiki

Thanks. We'll let you know when a new response is added.

What we have done is a hybrid between your 2 options. Our customer information is maintained in a master database that stores all their attributes and is updated regularly. We have data marts containing sales and/or activity data pertaining to each customer that are refreshed with 24 months (106 weeks) of data as the data is available. The data marts have access to the customer master data to join the sales/activities to the corresponding customers and attributes.

Good luck with whichever model you implement.

Discuss This Question: 1  Reply

There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when members answer or reply to this question.
  • Solutions1
    From a user perspective, is the data likely to be better, worse or the same? One would think that a transactional view makes more sense (e.g., adding monthend snapshots of customers inactive in that month is obviously wasteful). Also, if there are customer attributes that change during the month (e.g., the customer is acquired or for some other reason changes name, address, etc. in say, mid-month), it would seem that a transactional view is more accurate.
    0 pointsBadges:

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to:

To follow this tag...

There was an error processing your information. Please try again later.

Thanks! We'll email you when relevant content is added and updated.


Share this item with your network: