Normalization

Normalization is a database design technique that organizes data into multiple related tables to eliminate redundancy and dependency by dividing large tables into smaller ones and defining relationships between them.

Normalization improves data integrity and consistency by reducing redundancy. It involves dividing a database into smaller, related tables and ensuring that each table contains data related to a single concept.

Normalization follows a set of rules called normal forms (NF):

  1. First Normal Form (1NF): Each column must contain atomic values (no repeating groups).
  2. Second Normal Form (2NF): Meets 1NF and ensures all non-key attributes depend entirely on the primary key.
  3. Third Normal Form (3NF): Meets 2NF and removes transitive dependencies (non-key attributes should not depend on other non-key attributes).

Example Before Normalization (1 Table with Redundancy):

CREATE TABLE Orders (
    order_id INT PRIMARY KEY,
    customer_name VARCHAR(100),
    product_name VARCHAR(100),
    product_price DECIMAL(10,2)
);

Here, customer_name is repeated in multiple rows if the same customer places multiple orders.

Example After Normalization (Using Two Tables):

CREATE TABLE Customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(100)
);

CREATE TABLE Orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    product_name VARCHAR(100),
    product_price DECIMAL(10,2),
    FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
);