Database modeling question

20 pts.
Tags:
Database
Modeling
Normalization
I have a quandry on my hands. Another developer and I are designing a replacement for an older custom application, and we're deep into data modeling. We can't agree on a solution. More precisely, I think it should go one way, he thinks it should go a different way, and I can't quite prove that either one is the right (or best) way to get it done. First, a bit of background. The system is intended to manage information about our suppliers and their products. Certain kinds of products have certain pieces of information that we have to track about them. The information needed is somewhat arbitrarily assigned by the users, or at least there doesn't appear to be an easy way to classify the info automatically. There are classifications (we are calling them contexts) of information, though, that are already well defined (but loosely enforced, hence the rewrite of the system). The item's classification has any number of properties that need to be tracked. To make matters worse, a property can be a part of any context. For instance, the ExpirationDate property may relate to their Insurance classification as well as their NAFTA Certificate classification. It looks to me to be a classic many-to-many relationship. To me, the resolution is to use an intermediate table to denote the interrelationship between the ClassificationContext and the Property as follows: CONTEXTS ContextID Name PROPERTIES PropertyID Name CONTEXTUAL_PROPERTIES ContextID PropertyID My associate's idea is to make Property be related into the Context: CONTEXTS ContextID PropertyID Name PROPERTIES PropertyID Name ...or vice versa: CONTEXTS ContextID Name PROPERTIES PropertyID ContextID Name My thought is that neither one is entirely dependent on the other, nor are they entirely indepenent of each other. This indefinite nature of the relationship indicates to me that it should require the first model (with the Contextual_Property 'join' table) all on its own. Making one the defacto 'owner' of the other just introduces other problems down the road. The more we leave the Context and Property concepts alone, the fewer problems come up when the business changes. My associate's approach is that adding the dependency (as shown in the 2nd and 3rd models) is just plain simpler. I'm fighting the whole "I said it, therefore it must be right" ethos inside my own head, and trying not to let ego blind me to the possibilities that I'm wrong. I'm also fighting the Keep-It-Simple urge, with the thinking that my model is the simplest that it can be broken down into while still accurately modeling the business. I'm pretty certain that my associate isn't just playing devil's advocate nor playing a Socratic form of '20 questions", but instead really believes in his solution based on his training and experience (as do I). I would call neither of us 'greenhorns' nor 'intellectual slouches', by the way. In the end, I need opinions and arguments for or against either position. Thanks...
ASKED: July 23, 2009  9:04 PM
UPDATED: July 30, 2009  1:50 PM

Answer Wiki

Thanks. We'll let you know when a new response is added.

Based on the information you have provided, I would choose the first model.

It is more flexible, scalable, <b>and it is normalized</b>.

Let’s talk about the second model. In the CONTEXTS table, the name column is going to refer to the context name, or the property name ? Maybe he wants to add another ‘name’ column to solve this dilemma. Since one property could be related to more than one context, this design will produce redundancy.

After applying normalization to this design, the result is the first one. This is a strong argument for me. Create a table with the second design, and insert some data in which one property appears for more than on context, and bring the resultant redundancy to your associate’s attention.

Now, let’s imagine that you want to give users the administration of this relations, by means of an application, in which the user see a list of contexts, and a list of properties, and he/she can establish relations between them. Using the second design, where are you going to get the properties list from ? and the contexts list ? what if you want to remove a relation between some context and some property (a property that has no other relation with any other context), but you still want to keep that property in you catalog, for future use with some other contexts ?

Hope this helps.

——————————

I would tend to agree with the above answer … BUT things are never as easy as they seem. I find it possible that the original description is not complete; its just too basic.

——————————

The best model to use is the one that correctly reflects the rules of the data. Thus in the strictest sense only of the the three models is right. You should ask yourself some basic modeling questions about the nature of your objects. However, it is difficult to answer you because you did not give us the necessary KEY information to adequately describe your tables. Just columns is not sufficient. I will provide my own keys and given these keys note the rules that fall out of each datamodel. You can ask yourself some basic questions based on these rules. If the question is NO then the related model does not work.

Model#2 tells us that each context relates to only one property. Is this true? If so Model#2 is correct.

add constraint context_Pk primary key (context_id).
add constraint context_UK unique (name)

Model#3 tells us that each property can relate to only one context. Is this true? If so then Model#3 is correct.

add constraint property_PK primary key (property_id)
add constraint property_UK unique (name)

Note that Model#2 and Model#3 are not interchangeable given the key constraints I have provided.

If you cannot claim either Model#2 or Model#3 then you must use Model#1. There really is not much debate to be had. If you two guys are thinkers then you should be able to figure out the answers to these questions.

If you do not accept the keys I have provided, then all your models are really just variations on Model#1 anyway except that in this case Models#2 and #3 have normalization flaws as was indicated by a prervious poster so why would you want to use them?

So get to it, ask if the keys I provided work or not and if not use Model#3.

Good luck, Kevin

Discuss This Question: 1  Reply

 
There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when members answer or reply to this question.

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
  • VirgilReality
    I really appreciate everyone chiming in on this. The further I got from it (and the more I read comments), the more I realized that the first design really is best. It accurately models the many-to-many relationship and its resolution inside a junction table (or whatever the data modeling books are calling it now). I think I was just stunned a bit when the second example was proposed, and I couldn't formulate an intelligent response. Thanks everyone! VR
    20 pointsBadges:
    report

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to:

To follow this tag...

There was an error processing your information. Please try again later.

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Thanks! We'll email you when relevant content is added and updated.

Following