The Database Management System – DBMS notes have been given for quick revision of DBMS subject.
A Database is a collection of related data organised in a way that data can be easily accessed, managed and updated.
E-R diagrams and their transformation to relational design
Entity-relationship model is a model used for design and representation of relationships between data.
The main data objects are termed as Entities, with their details defined as attributes, some of these attributes are important and are used to identity the entity, and different entities are related using relationships.
In short, to understand about the ER Model, we must understand about:
- Entity and Entity Set
- What are Attributes? And Types of Attributes.
Let’s take an example to explain everything. For a School Management Software, we will have to store Student information, Teacher information, Classes, Subjects taught in each class etc.
Considering the above example, Student is an entity, Teacher is an entity, similarly, Class, Subject etc are also entities.
An Entity is generally a real-world object which has characteristics and holds relationships in a DBMS.
If a Student is an Entity, then the complete dataset of all the students will be the Entity Set
If a Student is an Entity, then student’s roll no., student’s name, student’s age, student’s gender etc will be its attributes.
An attribute can be of many types, here are different types of attributes defined in ER database model:
- Simple attribute: The attributes with values that are atomic and cannot be broken down further are simple attributes. For example, student’s age.
- Composite attribute: A composite attribute is made up of more than one simple attribute. For example, student’s address will contain, house no., street name, pincode etc.
- Derived attribute: These are the attributes which are not present in the whole database management system, but are derived using other attributes. For example, average age of students in a class.
- Single-valued attribute: As the name suggests, they have a single value.
- Multi-valued attribute: And, they can have multiple values.
If the attribute roll no. can uniquely identify a student entity, amongst all the students, then the attribute roll no. will be said to be a key.
Following are the types of Keys:
- Super Key − A set of attributes (one or more) that collectively identifies an entity in an entity set.
- Candidate Key − A minimal super key is called a candidate key. An entity set may have more than one candidate key.
- Primary Key − A primary key is one of the candidate keys chosen by the database designer to uniquely identify the entity set.
- Foreign key – Identifies the relationship between entities.
The association among entities is called a relationship. For example, an employee works_at a department, a student enrolls in a course. Here, Works_at and Enrolls are called relationships.
A set of relationships of similar type is called a relationship set. Like entities, a relationship too can have attributes. These attributes are called descriptive attributes.
Degree of Relationship
The number of participating entities in a relationship defines the degree of the relationship.
- Binary = degree 2
- Ternary = degree 3
- n-ary = degree
Cardinality defines the number of entities in one entity set, which can be associated with the number of entities of other set via relationship set.
- One-to-one − One entity from entity set A can be associated with at most one entity of entity set B and vice versa.
- One-to-many − One entity from entity set A can be associated with more than one entities of entity set B however an entity from entity set B, can be associated with at most one entity.
- Many-to-one − More than one entities from entity set A can be associated with at most one entity of entity set B, however an entity from entity set B can be associated with more than one entity from entity set A.
- Many-to-many − One entity from A can be associated with more than one entity from B and vice versa.
ER models and data models are typically drawn at up to three levels of detail:
- Conceptual data model: The highest-level view containing the least detail. Its value is showing overall scope of the model and portraying the system architecture. For a system of smaller scope, it may not be necessary to draw. Instead, start with the logical model.
- Logical data model: Contains more detail than a conceptual model. More detailed operational and transactional entities are now defined. The logical model is independent of the technology in which it will be implemented.
- Physical data model: One or more physical model may be developed from each logical model. The physical models must show enough technology detail to produce and implement the actual database.
Normalization rules are divided into the following normal forms:
- First Normal Form
- Second Normal Form
- Third Normal Form
- Fourth Normal Form
First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow the following 4 rules:
- It should only have single(atomic) valued attributes/columns.
- Values stored in a column should be of the same domain
- All the columns in a table should have unique names.
- And the order in which data is stored, does not matter.
Second Normal Form (2NF)
For a table to be in the Second Normal Form,
- It should be in the First Normal form.
- And, it should not have Partial Dependency.
Third Normal Form (3NF)
A table is said to be in the Third Normal Form when,
- It is in the Second Normal form.
- And, it doesn’t have Transitive Dependency.
Boyce and Codd Normal Form (BCNF)
Boyce and Codd Normal Form is a higher version of the Third Normal form.This normal form also known as the 3.5 Normal form of database normalization. This form deals with certain type of anomaly that is not handled by 3NF. A 3NF table which does not have multiple overlapping candidate keys is said to be in BCNF. For a table to be in BCNF, following conditions must be satisfied:
- R must be in 3rd Normal Form
and, for each functional dependency ( X → Y ), X should be a super Key.
Fourth Normal Form (4NF)
- A table is said to be in the Fourth Normal Form when,
- It is in the Boyce-Codd Normal Form.
- And, it doesn’t have Multi-Valued Dependency.
Limitations of 4NF and BCNF
Fourth normal form (4NF) is a level of database normalization where there are no non-trivial multivalued dependencies other than a candidate key.
It builds on the first three normal forms (1NF, 2NF and 3NF) and the Boyce-Codd Normal Form (BCNF). It states that, in addition to a database meeting the requirements of BCNF, it must not contain more than one multivalued dependency.
Multivalued dependency is best illustrated using an example. In a table containing a list of three things – college courses, the lecturer in charge of each course and the recommended book for each course – these three elements (course, lecturer and book) are independent of one another. Changing the course’s recommended book, for instance, has no effect on the course itself. This is an example of multivalued dependency: An item depends on more than one value. In this example, the course depends on both lecturer and book.
Thus, 4NF states that a table should not have more than one of these dependencies. 4NF is rarely used outside of academic circles.
- Lossless Join Decomposition
- Dependency Preserving Decomposition
Lossless Join Decomposition
• A decomposition should not lose information
• A decomposition (R1,…,Rn) of a schema, R, is lossless if every valid instance, r, of R can be reconstructed from its components.
Testing for Lossless Join Decomposition
- If both R1 U R2 gives complete set of attributes in R
- A (binary) decomposition of R = (R, F) into R1 = (R1, F1) and R2 = (R2, F2) is lossless if and only if :
either the FD (R1 ∩ R2 ) → R1 is in F+
or the FD (R1 ∩ R2 ) → R2 is in F+
- Intuitively: the attributes common to R1 and R2 must contain a key for either R1 or R2.
Dependency preserving Decomposition
It mean that decomposition should be such that every dependency of relation R must be implied in sub-relations or in combination of dependencies implied in sub relation.
Let R be any relation with FD set F,decomposed into sub-relations R1 and R2 with FD sets F1 and F2respectively.
Then every dependency of R must be implied in F1 or F2 or in combination of these two.
If F1∪F2 ⊂ F (not dependency preserving decomposition)
F1∪F2 ⊃ F (this is not possible)