Modeling to Normal Forms
I attended the on=line Data Modeling Boot Camp this past week. Among the presenters was Chris Date. He spoke about Normalization for Database Professionals. It was a deep dive into Normal Forms with proofs for why they hold true.
It got me thinking about how we create our models Specifically, what Normal Forms do we consider optimal? Or, which Normal Forms are practically achievable?
I don't have an answer (that's why I ask the questions here). I have Chris Date's answer, which I'll hold for a while. So, I ask you - How would you answer the above questions?
For my part, my conceptual data models do not have a Normal Form. They tend to be relational entities with no attributes. My logical data models tend to be in in 3NF, though I think that BCNF is optimal. My physical models tend to be 2NF for performance, which I think is optimal.
I try for 5NF. Most of the time though, once you hit 3NF, the model is also in BCNF, 4NF, and 5NF, as it is rare (but definitely possible) to see violations of these higher levels in practice.
I agree that it is uncommon for a model to reach 3NF and not also be in 4 & 5NF. I generally strive for 5NF unless situation demands otherwise.
So as with everything, the true answer is 'It depends'.
Agreed; there is always an 'it depends' but in my experience defining the logical model to 3NF is good enough to arrive at an optimal explanation of the data. The non-redundancy and adherence to keys is enough for practical purposes to derive a de-normalised database schema where necessary from a sound starting point.
However, there is no substitute for understanding the data and the business context in which it is set, so this is the best judge of whether the 3NF model needs to be taken further to represent the situation as is.