ByIndex Associates

Normalisation

Normalisation

When designing databases you are faced with a series of choices: -

  • How many tables do you need?
  • What data to they represent?
  • Which fields/column go into what table?
  • What relationship exists between multiple tables?

The answer to these questions lies with database normalisation, the process of simplifying the design of a database to achieve the optimum structure.

There are six progressive forms of normalisation: -

  • First norman form
  • Second normal form
  • Third normal form
  • Boyce Codd normal form
  • Fourth normal form
  • Fifth normal form.

For most database designs achieving third level form is sufficient to provide a good database structure.

Normalisation produces tables that have the following qualities: -

  • They describe one entity
  • They have no duplicate rows; hence there is always a primary key
  • The columns are unordered
  • The rows are unordered

NOTE: an “entity” refers to a single “thing” in the real world such as invoice information, customer information, stock information. Put another way an entity refers to only one type on thing and in the real world this gets very mixed up because an invoice would have customer information on it? This is what we want to clarify with normalisation.

First Normalisation Form

All fields/columns must be atomic

The word atomic comes from the Latin, meaning “indivisible”  (literally, not to cut). This means that for every field (row-by-column position) there exists only one value and not an array or list of values.

For example, a field called “item sold” should not contain “5 hammer, 3 screwdriver, 6 monkey wrench”

Therefore to solve the first requirement (atomic) you might create multiple item, quantity and cost fields to cope with the sale above! This creates it’s own set of problems and eventually you will sell more items than your have fields available?

Prohibits the presence of repeating groups of information

To solve this problem you would simply add a new record for very item sold and not repeat the fields. However, now the invoice number (“InvoiceNo”) would repeat, and assuming this is the primary key that’s not possible. So we would add another field called “ItemRef” and have the primary key on both fields – now we have a unique reference again.

Second Normal Form

First Normal Form criteria must be met

Every non-key field/column is fully dependent on the entire primary key

The second form has firstly to be fully first form compliant and then each record/row must be uniquely identifiable. But that’s what we’ve done above by creating a primary key on two fields/columns - wrong.

The trick here is the word entire, following the above example the customer field would be dependent on the “InvoiceNo” number field but not the “ItemRef” field. We have created non-key fields dependent on parts of the composite primary key.

Decomposition

To solve the above problem, we can break the table down into two tables holding invoice and items data separately

  • When normalising, you do not throw away information (non-loss decomposition)
  • You decompose the tables in such a way as to allow them to be put back together using queries

Third Normalisation

  • Second Normal Form criteria must be met
  • All non-key fields/columns are mutually independent

An example of a dependent field/column would be a calculated field such as cost (quantity x price). This would not meet the above criteria; it would be better to put such a field/column in a query, form or report as a calculated text box.

Be careful, you might put two or more fields into your table that are dependent on one another e.g. “ProductCode” and “ProductDescription”. Therefore we’d break this set of information out into another table, leaving “ProductCode” in the items table and creating a Products table with the code and description etc.

[Home] [News] [Services] [Trng Resources] [Access] [Project] [Excel] [Contact] [Calendar] [IP BANK]