# What is Normalization of Database?

Database / Tuesday, December 3rd, 2019

# Normalization

Normalization theory is a useful aid in the design of database. This theory is built around the concept of normal form. It is a step method to refine the database design. Finally the database becomes available in the highest normal form.

## Normal form in Normalization

A relation is said to be in a normal form if it satisfies a certain specified set of constraints. The normal forms are:

First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce/Codd Normal Form (BCNF)
Fourth Normal Form (4NF)

## Non-Normal Form in Normalization

A relation is said to be in non-normal form if and only if the underlying domains do not contain atomic values.

Atomic value means of composite attribute over here. In the example, order-details attribute of the relation ORDERS contains the details about part-no and qty in single attribute. It is called non-atomic value.

ORDERS

 order-no order-date order-details part-no qty 1 10/9/2018 p1 10 p2 30 2 2/10/2019 p5 10 p6 16 p7 52 3 15-11-2019 p6 52 p3 44 p8 49 p3 15

## All The Normal forms of Normalization

### First Normal Form (1NF)

A relation R is in first normal form (1NF) if and only if the entire underlying domain contains atomic values.

The above table can be converted into 1NF as follows:

ORDERS

 order-no order-date order-details part-no qty 1 10/9/2018 p1 10 1 10/9/2018 p2 30 2 2/10/2019 p5 10 2 2/10/2019 p6 16 2 2/10/2019 p7 52 3 15/11/2019 p6 52 3 15/11/2019 p3 44 3 15/11/2019 p8 49 3 15/11/2019 p3 15

### Second Normal Form (2NF)

A relation R is in second normal form (2NF) if and only if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key.

To show how this constraints applies, let us turn to relation ORDERS as shown in above two tables. Here the relation key is { order-no, part-no}. The key attributes are order-no and part-no. The non key attributes are order-date and qty. The value of order-date, however depends only on order-no. The relation is therefore not in 2NF.

Relations that are not in second normal form can always be decomposed into second normal form relations. To do this we can use a very simple rule that is: remove the offending functional dependency.

This means that we take the functional dependency that violated the 2NF constraint and make a new relation out of all attributes in this functional dependency. For example, the ORDERS is not in 2NF. The offending functional dependency is:

order-no → order-date

To make it in 2NF we make a new relation ORDERS from this functional dependency and remove order-date from the original relation. Relation ORDERS now stores facts about orders only. The original relation is now replaced by relation ORDERS-CONTENTS, which stores facts about order lines. The relation key of relation ORDERS is order-no and ORDERS-CONTENTS is {order-no, part-no}. The newly generated relations are as shown in the following tables:

ORDERS

 order-no order-date 1 10/9/2018 2 2/10/2019 3 15/11/2019

ORDER-CONTENTS

 order-no part-no qty 1 p1 10 1 p2 30 2 p5 10 2 p6 16 2 p7 52 3 p6 52 3 p3 44 3 p8 49 3 p3 15

### Third Normal Form (3NF)

A relation R is in third normal form (3NF) if and only if it is 2NF and every non key attribute is not-transitively dependent on the primary key.

Relations in 2NF still contain redundancies and additional constraint must be satisfied to eliminate redundancies. Relations in third normal form (3NF) must satisfy yet another constraint. In such relations there should be no dependencies between non-key attributes.

Let us consider a relation VEHICLE, which in not in third normal form.

VEHICLE

 reg-no owner model manufacturer no-cyl 01 Ram m1 Ford 4 77 Mary m2 Maruti 6 30 Nalin m3 Tata 4 37 George m4 Ford 4 83 Ramesh m1 Ford 4 99 Nutan m6 Matiz 6 46 Niraj m7 Bajaj 4 96 Ashish m6 Matiz 6 74 Amit m2 Maruti 6 92 Asha m1 Ford 4

The VEHICLE stores information regarding vehicles. Each vehicle is uniquely identified by reg-no and has one owner. The relation key is reg-no. The values for all the non key attributes can be determined easily by reg-no, therefore this relation is in 2NF. But non key attribute are transitive. For example no-cyl is dependent on model and manufacturer. Hence to make it in third normal form we decompose the relation in two relations REGISTRATION AND VEHICLE. These are as follows:

REGISTRATION (key: reg-no)

 reg-no owner model manufacturer 1 Ram m1 Ford 77 Mary m2 Maruti 30 Nalin m3 Tata 37 George m4 Ford 83 Ramesh m1 Ford 99 Nutan m6 Matiz 46 Niraj m7 Bajaj 96 Ashish m6 Matiz 74 Amit m2 Maruti 92 Asha m1 Ford

VEHICLE (key: mode, manufacturer)

 model manufacturer no-cyl m1 Ford 4 m2 Maruti 6 m3 Tata 4 m4 Ford 4 m6 Matiz 6 m7 Bajaj 4

### Boyce/Codd Normal Form (BCNF)

A relation R is in BCNF if and only if every determinant is a candidate key.

Following relation WORK is not in BCNF as it contains overlapping keys.

WORK

 project-id person-id manager time-spent 1 j1 m1 30 2 j1 m2 20 1 j2 m1 11 2 j2 m2 50 3 j2 m3 15 1 j3 m2 5

This relation can be decomposed into further simpler relations who are now in BCNF.

PROJECTS (key: project-id)

 project-id manager 1 m1 2 m2 3 m3

WORK (key: project-id, person-id)

 project-id person-id time-spent 1 j1 30 2 j1 20 1 j2 11 2 j2 50 3 j2 15 1 j3 5

### Fourth Normal Form (4NF)

A relation R is in fourth normal form (4NF) if and only if whenever there exists a Multi Valued dependency in R say A → B then all attributes of R are also functionally dependent on A. Hence a relation in 4NF should not contain independent Multi Valued dependency.

Following two relations in context with the previous one are in 4NF.

KNOWLEDGE

 person-id skill j1 computing j2 science j3 physics j3 maths j2 computer j1 chemistry

ASSIGNMENTS

 person-id project-id j1 1 j1 2 j2 1 j2 2 j2 3 j3 1