Conversion of de-duplicated Customer Numbers due to source system change

Customer relationship management applications
Data warehousing applications
We have a Data Warehouse which de-duplicates Customers from 3 operational systems to enable more accurate customer reporting, data mining & CRM activities. We have several month-end snapshots history. Our organisation is building a new Operation Data Store which will now take over the de-dupe & key persistency functions. The ODS uses more up to date techniques for de-duping so it gets different results. We want to align the DW with the ODS so operations and analytics functions have the same view of the customer but because the de-dupe results are different, we can?t just have a straight swap of keys. For example, DW matches 3 source IDs together, yet ODS matches 4 source IDs together. We estimate that about 5% of source IDs will change keys each month. What?s the best way to convert our Customer Keys to: (a) minimise impact to history (b) retain usability and (c) retain a proper star schema. We will consider different solutions for conversion of history versus ongoing processing.

Answer Wiki

Thanks. We'll let you know when a new response is added.

With respect to “history,” there would seem to to two cases: 1) situations in which you want to be able to report history based on “truth” as you then saw it, and 2) situations in which you want to report based on the improved version of “truth” made possible by better “deduping.” As an example of the first case, if someone used the existing data warehouse for some form of government reporting, and there might be a need for future drilldown, you want to have consistency, so keep an “old model” DW running. On the other hand, if you are using the data warehouse for, say, some form of sales compensation based on this year versus last year, you probably want to run the previous year’s data through the “better” deduping process so that you have apples to apples comparison between last year and this year. Going forward, you presumably want to move to the better version of deduping.

It isn’t obvious why a difference in deduping would impact your star schema.

Discuss This Question:  

There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when members answer or reply to this question.

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to:

To follow this tag...

There was an error processing your information. Please try again later.

Thanks! We'll email you when relevant content is added and updated.


Share this item with your network: