What is Normalization of Database?


Database / Tuesday, December 3rd, 2019

Normalization

Normalization theory is a useful aid in the design of database. This theory is built around the concept of normal form. It is a step method to refine the database design. Finally the database becomes available in the highest normal form.

Normal form in Normalization

A relation is said to be in a normal form if it satisfies a certain specified set of constraints. The normal forms are:

First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce/Codd Normal Form (BCNF)
Fourth Normal Form (4NF)

Non-Normal Form in Normalization

A relation is said to be in non-normal form if and only if the underlying domains do not contain atomic values.

Atomic value means of composite attribute over here. In the example, order-details attribute of the relation ORDERS contains the details about part-no and qty in single attribute. It is called non-atomic value.

ORDERS

order-noorder-dateorder-details
part-noqty
110/9/2018p110
p230
22/10/2019p510
p616
p752
315-11-2019p652
p344
p849
p315

All The Normal forms of Normalization

First Normal Form (1NF)

A relation R is in first normal form (1NF) if and only if the entire underlying domain contains atomic values.

The above table can be converted into 1NF as follows:

ORDERS

order-noorder-dateorder-details
part-noqty
110/9/2018p110
110/9/2018p230
22/10/2019p510
22/10/2019p616
22/10/2019p752
315/11/2019p652
315/11/2019p344
315/11/2019p849
315/11/2019p315

Normalization of Database

 

Second Normal Form (2NF)

A relation R is in second normal form (2NF) if and only if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key.

To show how this constraints applies, let us turn to relation ORDERS as shown in above two tables. Here the relation key is { order-no, part-no}. The key attributes are order-no and part-no. The non key attributes are order-date and qty. The value of order-date, however depends only on order-no. The relation is therefore not in 2NF.

Relations that are not in second normal form can always be decomposed into second normal form relations. To do this we can use a very simple rule that is: remove the offending functional dependency.

This means that we take the functional dependency that violated the 2NF constraint and make a new relation out of all attributes in this functional dependency. For example, the ORDERS is not in 2NF. The offending functional dependency is:

order-no → order-date

To make it in 2NF we make a new relation ORDERS from this functional dependency and remove order-date from the original relation. Relation ORDERS now stores facts about orders only. The original relation is now replaced by relation ORDERS-CONTENTS, which stores facts about order lines. The relation key of relation ORDERS is order-no and ORDERS-CONTENTS is {order-no, part-no}. The newly generated relations are as shown in the following tables:

ORDERS

order-noorder-date
110/9/2018
22/10/2019
315/11/2019

ORDER-CONTENTS

order-nopart-noqty
1p110
1p230
2p510
2p616
2p752
3p652
3p344
3p849
3p315

Third Normal Form (3NF)

A relation R is in third normal form (3NF) if and only if it is 2NF and every non key attribute is not-transitively dependent on the primary key.

Relations in 2NF still contain redundancies and additional constraint must be satisfied to eliminate redundancies. Relations in third normal form (3NF) must satisfy yet another constraint. In such relations there should be no dependencies between non-key attributes.

Let us consider a relation VEHICLE, which in not in third normal form.

VEHICLE

reg-noownermodelmanufacturerno-cyl
01Ramm1Ford4
77Marym2Maruti6
30Nalinm3Tata4
37Georgem4Ford4
83Rameshm1Ford4
99Nutanm6Matiz6
46Nirajm7Bajaj4
96Ashishm6Matiz6
74Amitm2Maruti6
92Asham1Ford4

The VEHICLE stores information regarding vehicles. Each vehicle is uniquely identified by reg-no and has one owner. The relation key is reg-no. The values for all the non key attributes can be determined easily by reg-no, therefore this relation is in 2NF. But non key attribute are transitive. For example no-cyl is dependent on model and manufacturer. Hence to make it in third normal form we decompose the relation in two relations REGISTRATION AND VEHICLE. These are as follows:

REGISTRATION (key: reg-no)

reg-noownermodelmanufacturer
1Ramm1Ford
77Marym2Maruti
30Nalinm3Tata
37Georgem4Ford
83Rameshm1Ford
99Nutanm6Matiz
46Nirajm7Bajaj
96Ashishm6Matiz
74Amitm2Maruti
92Asham1Ford

VEHICLE (key: mode, manufacturer)

modelmanufacturerno-cyl
m1Ford4
m2Maruti6
m3Tata4
m4Ford4
m6Matiz6
m7Bajaj4

Boyce/Codd Normal Form (BCNF)

A relation R is in BCNF if and only if every determinant is a candidate key.

Following relation WORK is not in BCNF as it contains overlapping keys.

WORK

project-idperson-idmanagertime-spent
1j1m130
2j1m220
1j2m111
2j2m250
3j2m315
1j3m25

This relation can be decomposed into further simpler relations who are now in BCNF.

PROJECTS (key: project-id)

project-idmanager
1m1
2m2
3m3

WORK (key: project-id, person-id)

project-idperson-idtime-spent
1j130
2j120
1j211
2j250
3j215
1j35

Fourth Normal Form (4NF)

A relation R is in fourth normal form (4NF) if and only if whenever there exists a Multi Valued dependency in R say A → B then all attributes of R are also functionally dependent on A. Hence a relation in 4NF should not contain independent Multi Valued dependency.

Following two relations in context with the previous one are in 4NF.

KNOWLEDGE

person-idskill
j1computing
j2science
j3physics
j3maths
j2computer
j1chemistry

ASSIGNMENTS

person-idproject-id
j11
j12
j21
j22
j23
j31

 

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *