Tony Davis

Simple-Talk Editor
News, views and good brews

The myth of over-normalization

Published Monday, July 21, 2008 4:36 PM

I’ve always been suspicious of denormalizing an OLTP database. Denormalisation is a strange activity that is supposed to take place after a database has been normalized, and is assumed to be necessary in order to reduce the number of joins in queries to a tolerable level. C.J. Date is quite clear on this; well, he is slightly less opaque than usual: any denormalization to a level below 5NF is a ‘bad thing’ (he says ‘contraindicated’). 

 

In practice, normalization to fifth Normal Form is unusual. Normally, the Database designer reaches for his hat and coat after reaching Boyce/Codd Normal Form (BCNF), which is 4NF, and few databases I’ve ever seen are even reliably BCNF.

 

Why does one ever normalize a database? It is often said that it is to avoid logical inconsistencies, and to avoid insertion, delete and update anomalies. Also, to avoid redundancy, or duplication, of data. I think there is more to it than that. It also ensures that your data model makes logical sense. If, at the end of the normalization process, you arrive at a set of tables that correspond to simple, easily understood entities, then the chances are that you have got a database model that will sail through the inevitable changes in scope, changes in the application, extensions and so on. If you don’t, then it is time to tear up your database design and start again. Normalization isn’t like sprinkling on fairy-dust; it is a way of testing and ‘proving’ your design.

 

Too often, denormalization is suggested as the first thing to consider when tackling query performance problems. It is said to be a necessary compromise to be made when a rigorous logical design hits an inadequate database system. As the saying goes, “Normalize ‘til it hurts, then denormalize ‘til it works”. In fact, Denormalization always leads eventually to tears. It complicates updates, deletes and inserts; it renders your database difficult to modify.  Maybe once there was an excuse for a spot of denormalization, but on a recent version of SQL Server or Oracle, with indexed or materialized views, and covering indexes, the performance hit from multiple joins in a query is negligible. If your database is slow, it isn’t because it is ‘over-normalized’!

 

As always, we'd like to hear what you think. The best comment, according to our distinguished panel of judges, will receive a $50 Amazon voucher, and three runners-up will get a Simple-Talk gift pack.

  

Cheers,

Tony.

Comments

 

mbourgon said:

Funnily enough, a similar discussion just hit over on CodingHorror last week (http://www.codinghorror.com/blog/archives/001152.html ; no affiliation, just a reader).  Give it a read, as both sides are amply (and at times, loudly) represented.

First - you must hang out with a different group of DBAs than I've ever met.  The only people I know who consistently use BCNF, or even 3NF, are
(a) kids straight out of college
(b) Joe Celko and that type of person.  (Aside: while I love what Joe evangelizes, unfortunately a lot of it doesn't work in the real world.  Blame it on the multi-million-records-a-day tables, the reality of OLTP/OLAP on the same machine, what have you.  Practically, it doesn't work, but it doesn't lessen the ideas.)

Me, personally? If I get 2NF from the developers, I'm pretty happy.  I aim for 3NF, occasionally BCNF (if I'm lucky and have a bunch of time), but usually reality must intrude and you'll see 2NF and 3NF.

There are a couple problems with overnormalizing databases, which to me is BCNF and 5NF (and, depending on the use, even can be 3NF) - I once had the distinct dis-pleasure of working on a table that was overly normalized.  We even came up with a new word for it - abnormalized.  (Aside: isn't there a 6NF too?)

What's wrong with overnormalizing?  Well, for the main report, the raison d'etre for the entire system, I had to craft a join between 21 tables.  Each one was necessary, each one was needed.  That was not every table within the database; I was able to skip about 30 tables.  But troubleshooting issues when you're joining 21 tables is daunting at best.

So yes, while you _can_ do BCNF/5NF to your heart's content, there are several real-world issues.
1) You won't ever be the only one working on it.  The documentation (if any) won't be known by these other people.  So all they have to work off of is the structure, possibly a diagram (if you added all the FK/PKs), and the code.  That's it.  And honestly, trying to rework/fix/modify a totally normalized system can be a pain.  
2) Tradeoffs.  While materialized views and indexed views are nice, you're still saying "throw more disk space at it".  Granted, normalization should make the tables smaller, but you're still potentially adding a LOT of storage for performance.
3) End-users.  If you're fortunate enough to keep them out of the machinery, then this isn't really an issue.  If you can abstract it via English Query/Radius/SSRS/etc, then you're better off.  But frequently you don't have those options, so now you need to babysit anyone building queries, since in my experience they'll figure out some query that'll Cartesian and take out the system.  

Anyhow, I hope this explains the irrational position that many DBAs take.
July 21, 2008 4:21 PM
 

GilaMonster said:

I'm in a slightly pedantic mood... BCNF is not the same as 4th normal form. BCNF deals with inter-key dependencies (where one column of a candidate key depends on another column of a candidate key). 4th normal form deals with independent multi-values attributes (eg a pizza delivery company storing both the areas they deliver to and the types of pizza in one table)

One of the developers here asked me the question "where do you stop" during a presentation on database normalisation. while I can't claim that this is the right answer, what I told him was:
"Stop where it makes sense to do so for the data you are storing and the way it will be used. Where ever you decide to stop, make sure that you uinderstand the consequences of the normalisation levels you have applied, and the normalisation levels that you have not applied."
July 22, 2008 5:49 AM
 

Alexander Karmanov said:

mbourgon: there is 6th normal form. C.J.Date (et al) wrote a book on temporal databases where it is described.
July 22, 2008 10:27 AM
 

Granted said:

Great editorial Tony.

While I can't claim to have ever sufficiently normalized a design, I do fight tooth and nail to prevent denormalization. I still get, regularly and often, in online forums, in books, and from various individuals, the old, and LONG incorrect, maxim that "more than 3/4 joins in a query is too much." After I stop giggling I show them a 40 table join that has been running in a production OLTP system, for years. The system scaled with it with no issues. Tony, and Gila as well as others, are right. The time when a few joins caused issues is long gone. It's important and right that forums like this fight the ignorance that comes from that attitude.

Currently, several developers in my company are pushing Object Relational Mapping (ORM) as a mechanism to "solve" the database "problem." One of the issues in the database "problem" they hope to resolve is all that messiness of having to join tables. Clearly, despite demonstrable progress in the abilities of modern data management systems to handle the things we require of them, the ignorance within the database community has bled into the development community. That ignorance causes us to make gross errors at a fundamental level.

I'm not saying that ORM is such an error, but implementation of it to "fix" a problem that doesn't really exist is a clear sign that those who know better are doing a poor job of communicating and teaching.
July 23, 2008 6:49 AM
 

DavidBSQL said:

What more can be said than what Phil Factor stated in his blog entry http://www.simple-talk.com/community/blogs/philfactor/archive/2008/05/29/56525.aspx

The place where theory meets practicality will be painful if we don't have our feet on the ground with our eyes on the clouds.

By the way, whenever I get discouraged with the database model that I inherited, I just read that blog and laugh. It is to the point where my co-workers are starting to wonder. It is good to keep them guessing though.
July 23, 2008 8:24 AM
 

scb2376 said:

I normalize & denormalize because of all the obvious aforementioned reasons.  But mainly I do it to tick off application developers who think they're DBAs, which helps to perpetuate the eternal battle between application people and database people.
July 23, 2008 9:32 AM
 

CrazyOnU said:

As a developer I've been on both sides of this discussion.

Typically I'll end up advocating denormalization based on how the data will be presented back (i.e. using it to cache repeated lookups/calculations in order to simplify or optimize a reporting flow). Whenever I can, I normalize the tables to the extent that I understand how to.

The real issues for me as a developer are: 1) the tables weren't denormalized consistently (e.g. names, types, constraints, etc. vary without rhyme or reason) or 2) the relationships between tables I need to gather data from are opaque (one of which was a 'virtual join' that required an undocumented stored procedure to generate the keys that related each table).

Anybody who preaches too hard on one side or the other of the debate ought to be sentenced to a year's hard labor trying to maintain multiple poorly-designed production systems that are all "mission-critical". Vehemence is easy, practicality pays the bills...
July 23, 2008 4:41 PM
 

TadRichard said:

I agree with CrazyOnU.  Never say Never.  For us denormalization is often a necessary evil to 'get the job.'

We deal with a lot of web-based commerce systems for small clients.  We walk the line between providing these clients with administrative tools to handle the business logic (obscuring the data layer) and the client's desire to forgo (paying for) administrative tools and 'just edit the data'.

Most of these clients do not want (to pay for) a robust schema and an administrative interface that will keep them from doing something very bad -- such as deleting a ton of sales history because they decided to delete related records.

"What do you mean I can't delete American Express -- we don't accept it anymore!!!"  "Active field?  What's THAT for?!!"

In those cases, we usually compromise between enforcing a normalized schema of product and client information for future transactions, and a denormalized schema of transaction history.

So, for instance, a Sales History table will probably include redundant data such as the customer name, product name, etc.. The customer and product IDs will be there as well, but we won't enforce the relationship on the IDs.  (And we NEVER allow for cascading deletes!)

This seems to satisfy most (cheapskate) clients as they can edit the (future) business logic tables at will without affecting the historical data and vice versa.
July 23, 2008 5:57 PM
 

GSquared said:

I generally normalize as far as makes sense.  In most cases, that's 4/5NF.  I've found, time and time again, that less than that means I'll have to rebuild something the moment some business rule changes or someone needs a new report.

I don't have performance problems when I do this.  If anything, it speeds up the database, because most update/insert/delete actions are hitting very narrow tables that don't require a lot of indexes.

For reporting, I'll dump data into warehouses.  Makes more sense that way.
July 24, 2008 3:25 PM
 

Gaby Abed said:

How to keep the clients happy?  Design the database to your ideal normal form, and only give them access to views so you can easily adapt to keep them happy.  Any feedback if that's what any DBA's out there do in real life?  Fortunately, I don't have too much say in how the databases are actually designed (the DBA's just make sure the development team doesn't do anything silly and troubleshoots any difficulties) so anyone out there do this in real life (restricting access to views only)?
July 28, 2008 7:53 AM
 

caderoux said:

Anything below 3NF is unacceptable, generally - but I don't push for higher forms.

In my world, denormalization in the database is an optimization and only comes after you have attempted denormalization in the application layer through caching or similar.

There will always be denormalization in the presentation layer, regardless of whether it's an OLTP application or a reporting application or whatever.  But you can't let that view of the data leak into the database design.  It will ultimately be a disservice to the system and the users.

An area where denormalization is often acceptable is in logs, indexing systems, information interchange and ETL.  When this kind of data is being logged to the database, the database does not need to be aware of internals of the data, simply the key information and effectively a BLOB.

The end user needs and perceptions must inform the data model, but they cannot determine it.  Ultimately, familiarity with the data, patterns in the data, best practices in modelling, and the business needs (present and anticipated) will all inform a normalized database design.

For power users who want to build their own reports or extractions, I generally recommend a dimensional data warehouse or similar denormalized data model.  However, in reality, many so-called power users would be just as happy to have a report writer do this work for them - and such a developer dedicated to that can produce better-quality reports (no misunderstandings of the data model) which perform better (better understanding of what data is travelling over the wire if you are using a reporting tool which does a lot of front end joins) and are more maintainable.

The persistent myth that we should give users tools and they'll write (and change and tweak) their own reports is another which needs to be addressed.  It simply leads to poor-performing reports which are incorrect and difficult to maintain.  Similar to users who write massive systems in a hodgepodge of Excel or Access with no coherent architecture or plan, if it gets the job done, that's fine; as long as the hidden costs and waste are acceptable to the business.

If you want quality and flexibility in the face of change (typical desirable attributes of successfully designed and engineered systems), reporting turns out NOT to be a special case where software development rules don't apply.
July 28, 2008 8:58 AM
 

D.Clark said:

Coming from the reporting side, I understand normalized databases, but I see them as a necessary evil.  To create a registry for a department store, I had to join 13 tables.  That may not seem like a lot of tables to most, but this was back in 1995 and working with Crystal reports!

I've also had an opportunity to work with a system that had a normalized database and a reporting database.  The reporting database had to be loaded nightly using some pretty intricate stored procedures.  It was easier to create reports, but the data was not real-time.

Today, we are using SQL Server 2005 Reporting Services.  Our data architect gave us a "report" view.  This helps.  Plus we get to check e-mail as the reports load.
July 28, 2008 9:04 AM
 

KJPowers said:

Fabian Pascal had an excellent series of articles on this topic a few years ago:

"The Dangerous Illusion: Denormalization, Performance and Integrity, Part I
http://www.dmreview.com/issues/20020601/5251-1.html

"The Dangerous Illusion: Denormalization, Performance and Integrity, Part 2
http://www.dmreview.com/issues/20020701/5337-1.html
July 28, 2008 9:15 AM
 

bobh said:

Could we have a little discussion about the performance of normalized dbs vs denormalized ones in a large db environment?
I/O is a huge part of the workload of any query.   As a general proposition, the fewer the number of page reads and writes, especially physical operations to disk,  the faster the query.  The number of developers unfamiliar with this concept still surprises me.

Simply put, SQL reads data from cache or from disk in pages which are 8k in size.    The more rows you can squeeze into an 8k page, the fewer reads and writes are required.   Given a base table that has columns with 1 or 2 digit codes in it joined to tables with the corresponding descriptions for those codes, the base row can be much thinner, and you can fit more rows per 8k page.    As a result, you can read a lot more data at a one time for consideration, and yes this also applies to indexes built over those one or two digit codes, and views which use those indexes.    This may not mean much when your query plan produces seeks, but it means an awful lot when you are scanning masses of data.

I'm quite prepared to concede that disk is cheap and that clients demand quick turnaround times for applications.   So if you are working with tables that only contain a million rows or so, the performance difference may not be critical.    But if you are looking at truly large masses of data and your users need to work with live data, not end-of-day summaries, then normalization is your best friend.



July 28, 2008 10:00 AM
 

TroyK said:

A few points:

1) Normalization is a LOGICAL concept, and as such, has nothing to say about performance. The reason people think it so is the poor data independence in our current DBMS products which implies a strong logical-physical correlation. Google "logical physical confusion" for more.

2) If design is done properly, there should be no need to "normalize" as your relations (tables) will already be broken down along the individual business entities of interest in the problem domain.

3) If someone claims to have "denormalized for performance" (see prior post for Pascal's papers debunking this myth), ask to see the "fully normalized" design for reference. 9 times out of 10, they won't be able to produce such because it doesn't exist.

4) Most people that say "going to 3rd normal form (or BCNF) is enough" say so because they don't understand 4th and 5th normal form.

5) Denormalized physical implementations do nothing but bias the database for a specific application. This is one of the major problems that Codd addressed with the invention of the relational model. Why regress and repeat the errors of 30 years ago?

July 28, 2008 10:47 AM
 

bobh said:

Troy, while the normalization process is driven by logical concepts, it most definitely results in a physical structure which does have an impact on performance.    Performance is always an issue because applications are what pay the bills.  

As I understand your second item, it is that the logical association of relations to entities is what drives the normalization process, and I most definitely agree with that.   Restated, my point was that by reducing the amount of redundant data, normalization reduces the physical amount of disk that must be accessed, written out, backed up, etc.  

I think you are dead right with #3 and most of #4.   However, think something is going unsaid with #4.  

There are business costs for NOT getting an application up and running, and some developers feel that analysis and implementation of a 4th normal form would take too long.    So, they go with a quick look, build some tables, and put bandages on any problems later.  Although suboptimal from a design perspective, they may sometimes be better solutions from the business perspective, because they were up and running and generating revenue faster.  Most serious db flaws that I've encountered seemed to have "evolved" in this manner.  

This leads us to number #5 which I also mostly agree with, but in my experience it is usually a single critical application which leads to the creation of a database in the first place.   While I believe the discipline of normalization, with its focus on identifying entities and their attributes, promotes a cleaner, stronger foundation for future applications development, I also know that even a normalized design may be biased towards a business model.  

Two final points.   (1) I'm not an application developer, just a db guy who tries to keep an open perspective.   (2)  I really enjoy good discussions and arguments.






For the record, I have to admit redundancy is not a bad thing when performance is at issue.   Indexes are redundant, and so are summary tables for reporting purposes.    Normalization goes to the issues of data integrity and consistency in the source tables from which these redundant structures are created.  








July 28, 2008 1:54 PM
 

TheNoseKnows said:

The biggest reason most DBA's stop at 3NF is this (as I see it):

1.) They're the DBA because they learned about this SQL thingy and found it to be pretty cool.

2.) The read that 3NF means not dupicating data. That's cool! Now this BCNF and 5NF is just "over the top".

3.) They later learned that someone else had to abstract the data for whatever reason (usually a business analyst that knows access and how to link tables) and have been asked for a data dictionary and ERD.

4.) They had to look up "data dictionary" and "ERD" to do this and then had to make them.

5.) They design a DB to the BCNF/5NF (fast forward several years from when 1 through 4 happened) they remember what # 4 was. ("Can I make the business analyst do it? Can we get an intern? CRAP! I have to!" {head hangs low}" ...contemplate severe alcholism here... "OK, 3NF is good enough!")

6.) Chose severe alcholism over making an ERD and Data Dictionary on a BCNF/ 5NF database. (Isn't that the normal response?)

OK, this is satire for the most part, but anyone that's been through the pain will understand.     :-)

Nose
July 28, 2008 6:28 PM
 

ben mcintyre said:

Funny, but lately in my professional life, I have started associating triggers and denormalisation.  But already, I digress.
I'll state now that yes, normalisation is clearly the way forward in database design.  But just every so often, there comes a good reason to denormalise.
We could wish that all relationships in databases are nice, neat, foreign key/one-many joins.  However in some databases, the data dictates other relationships which are nowhere near as clean and efficient.  

For example, in a stock control system I once was involved with, we had x number of bins of product being processed into y cartons of packed stock.  Due to the nature of the process, it was not possible to link any particular output carton to a particular input bin.  Then the output cartons would be packaged and sold, in their own collective units.
It was necessary to summate costs from the input side and add them to output stock, and the only way to do this was through a weight-based cost averaging process relating the costs associated with the input bins to the output cartons.  Now, put this in a context of a large packing shed packing several 100,000 cartons a day for three months.
A relationship like this can turn your performance to mud, since the summation-and-averaging process involves all kind of per-row subquerying of complex views to get right.  So, at this point, I believe there is a strong argument for denormalisation.
We do our summations, store the results in a table optimised for extraction of the data in its final form, perhaps even add a denormalised column here and there on the output side to mitigate the worst of the calculation hell.

And here is where I go back to the issue of triggers.  While in general I hate triggers with a passion, I think possibly the only good use for them is enforcing integrity in denormalisation - they are naturally suited for that purpose.  They are able to give back (to some degree at least) the promise of data integrity that denormalisation breaks.

So, in summary, I'd say that it is the fact that the real world is not a nice, neat normalised place that can force us into situations where denormalisation is the only practical option.  
Really, programming (including database design) is a form of mathematical modelling.  And Inner/Outer joins and FK relationships are a simplistic first-line (although very useful) element of this modelling.  In real-life complex designs, there are a lot of situations which cannot be modelled using these standard elements.
And this is where the fun begins ...

I'm reminded of the quote (from James Gleick's 'Chaos') of the mathematician commenting that the study of nonlinear equations (a slightly exotic field in maths) is like speaking of 'the study of nonelephant animals'.  That is, there are far more problems in the world that are 'messy', nonlinear and even potentially unsolvable, that there are problems that are amenable to linear analysis.
However we do seem obsessed with studying the range of problems that we are most likely to find the answer to, rather like Nasrudin looking for his keys under the streetlight 'because there is more light there', rather than in the dark street where he lost them.
Quite often, when a problem crops up which doesn't fit our nice, comfortable analysis methods, it is tempting to label it as an anomaly rather that realising that it may be leading us on to a deeper level of comprehension.
July 28, 2008 7:41 PM
 

Andrew Clarke said:

(posted on behalf of Joe Celko)

3NF is not enough.  You need to go to 5NF to be safe or you will screw yourself up and never know it.  Fifth Normal Form (5NF), also called the Join-Projection Normal Form or the Projection-Join Normal Form, is based on the idea of a lossless JOIN or the lack of a join-projection anomaly.  This problem occurs when you have an n-way relationship, where n > 2.  A quick check for 5NF is to see if the table is in 3NF and all the candidate keys are single columns.  

As an example of the problems solved by 5NF, consider a table of house notes, since that is in the news right now, that records the buyer, the seller, and the lender:


HouseNotes

buyer     seller       lender
==================================
'Smith'   'Jones'     'National Bank'
'Smith'   'Wilson'    'Home Bank'
'Nelson'  'Jones'     'Home Bank'

This table is a three-way relationship, but because many CASE tools allow only binary relationships it might have to be expressed in an E-R diagram as three binary relationships, which would generate CREATE TABLE statements leading to these tables, which are all keys.  

BuyerLender
buyer        lender
=============================
'Smith'      'National Bank'
'Smith'      'Home Bank'
'Nelson'     'Home Bank'

SellerLender
seller      lender
=======================
'Jones'     'National Bank'
'Wilson'    'Home Bank'
'Jones'     'Home Bank'


BuyerSeller
buyer     seller
================
'Smith'   'Jones'
'Smith'   'Wilson'
'Nelson'  'Jones'



The trouble is that when you try to assemble the original information by joining pairs of these three tables together, thus:

SELECT BS.buyer, SL.seller, BL.lender  
  
FROM BuyerLender AS BL,
      
SellerLender AS SL,
      
BuyerSeller AS BS
WHERE BL.buyer = BS.buyer
  
AND BL.lender = SL.lender
  
AND SL.seller = BS.seller;

... you will recreate all the valid rows in the original table, such as ('Smith', 'Jones', 'National Bank'), but there will also be false rows, such as ('Smith', 'Jones', 'Home Bank'), which were not part of the original table.  This is called a join-projection anomaly.

There are also strong JPNF and overstrong JPNF, which make use of JOIN dependencies (JD for short).  Unfortunately, there is no systematic way to find a JPNF or 4NF schema, because the problem is known to be NP complete.  This is a mathematical term that means as the number of elements in a problem increase, the effort to solve it increases so fast and requires so many resources that you cannot find a general answer.  

As an aside, Third Normal Form is very popular with CASE tools and most of them can generate a schema where all of the tables are in 3NF.  They obtain the FDs from an E-R (entity-relationship) diagram or from a statistical analysis of the existing data, then put them together into tables and check for normal forms.  

The bad news is that it is often possible to derive more than one 3NF schema from a set of FDs.  Most of CASE tools that produce an E-R diagram will find only one of them, and go no further.  However, if you use an (Object Role Model) tool properly, the schema will be in 5NF.  I suggest strongly that you get any of the books by Terry Halpin on this technique.  

(note, posted on behalf of Joe Celko)
July 29, 2008 3:34 AM
 

crew said:

While I cannot speak for the educational community or the evangelical dba's, I've worked with dozens of business requirements where the optimal solution bypassed normalization.

A perfect example is a data mining solution I participated in.  The data in question involved fifty or more static tables across ten or twelve databases with billions of rows.  The end user required the ability to query complex calculated results where some fields.  The original 3NF solution definitely provided the fastest method of calculating these results in realtime, on a row by row basis.  

The problem:  The data sets being queried by the end user (high paid business analysts) would take as long as twenty or thirty minutes to return due to many aggregate and complex calculations.

The solution:  Create de-normalized derived tables of this data, where all complex calculations were completed in advance.  While processing the entire set of data took weeks rather than minutes, and occupied double the storage space of the normalized data, analysts were able to query data in seconds.  Strategies, such as step by step calcuations with writes but no deletes and no updates, using multiple derived table - contrary to popular belief - provided extreme performance gains in this mining process.  Leveraging such stategies was not possible within the normalized solution.

My point being, regardless of normalization theory, there exists in some situations, an economic cost.  In this example, tossing normalization aside proved far more profitable for the business requirement, as business answers were produced in higher volume, and the analyst costs were reduced.  I have many other similar examples and will continue to solve such requirements in my clients best economic interest, even if my professor frowns in discontent.  I suggest always looking past the theories, and applying real world solutions where applicable.
July 29, 2008 8:03 AM
 

randyvol said:

I guess I have been pretty lucky because 0% of my performance problems over the years have been rooted in (de)normalization issues.

100% of my performance problems have been rooted in:
'cursor-happy' developer's stored procedures - 65%
multi-table joins using non-indexed columns - 30%
other miscellaneous code-related issues - 3%
mal-formed queries - 2%

I guess if you had a really poor normalized model you'd find performance to be problematic - but I'd be willing to bet the issues above are far more prevailing in the 'real' world.
July 29, 2008 11:03 AM
 

Birdman said:

The problem being discussed really needs to be looked at using two different design mentalities.  A transactional system by design needs to be secure, fast, efficient, available, and controlled.  All these reasons lead to high performance hardware and normalized designs since a normalized database performs exponentially better depending on the application.  A highly transactional system is not design for a highly transaction environment in fact nothing could be worse.  If you want easy query writes and fast reporting access then you build large data store or data warehouses or at least move your transactional data to a read only filegroup.  In those cases denormalization makes sense since the objective is not transactional processing, but easy effective business intelligence and easy reporting for the business analysts'.  That is my take on it working in both enviornments I understand the reasoning for both, but again both are needed it just depends on what you are trying to do.
August 1, 2008 12:16 PM
 

Birdman said:

Sorry about sentence number four.  It should say, "A highly transaction system is designed for high transactions and not for a large reporting environment in fact nothing could be worse."
August 1, 2008 12:21 PM
 

YESChandana -Blog said:

My favorite links from the last week - 1st Week of August 2008
August 3, 2008 6:14 AM
 

Roenbaeck said:

There are actually some warehouses modeled using a 6NF variant called anchor modeling that are functioning really well. I'm glad to see that we're not the only ones seeing the benefits of normalization. You can find more information about anchor modeling here: http://www.anchormodeling.com and it is especially suitable for SQL Server, due to the optimizer being able to eliminate tables from queries using views.
September 14, 2008 6:33 AM
You need to sign in to comment on this blog

















<July 2008>
SuMoTuWeThFrSa
293012345
6789101112
13141516171819
20212223242526
272829303112
3456789
On the Trail of the Expanding Databases
 It is sometimes difficult for other IT people to understand the constraints that DBAs have to work... Read more...

SQL Server 2008: The New Data Types
 Brad continues his helicopter-level view of the most interesting new features of SQL Server 2008 with a... Read more...

Reporting on Mobile Device Activity Using Exchange 2007 ActiveSync Logs
 In this new column giving practical advice on all things Sys Admin related, Ben Lye takes on the often... Read more...

The Bejeweled Puzzle in SQL
 Alex Kozak provides another SQL puzzle to hone your SQL Skills with.  Read more...

Using Powershell to Generate Table-Creation Scripts
 For all of us who learn best by trying out examples, Bob Sheldon produces a PowerShell script file for... Read more...