We now know what data is and what a database is. We used a spreadsheet to build a rudimentary employee database. and then improved the database structure by addressing the insert, update and delete anomalies. Then I said, we will now move on to real databases. I think I lied because there is something we need to address first.
So, in this article, I will teach you some concepts and terminology you will use throughout your career as a programmer. We will start with two concepts, Entities, and Relationships and then see how they translate into Tables and Keys in the next article.
We will use the employee database from the previous article as a starting point. If you remember correctly, we started with an Employee section and ended up with three sections, Departments, Job Titles and Employees where both Departments and Job Titles are referenced in the Employee section via a key instead of a description.
Our Employee Database
Employees:
ID | LastName | FirstName | Initials | DepartmentID | JobTitleID |
1 | Smith | John | J | 4 | 1 |
2 | King | Peter | P | 4 | 2 |
3 | Peters | Jane | JS | 4 | 3 |
4 | Jenkins | Sue | S | 4 | 3 |
5 | Smith | Judy | J | 4 | 1 |
6 | Hansen | Bob | BJ | 3 | 4 |
Departments:
ID | Name |
1 | Finance |
2 | HR |
3 | Sales |
4 | IT |
Job Titles:
ID | Name |
1 | Programmer |
2 | Network Administrator |
3 | Database Administrator |
4 | Representative |
Entities
Modeling
Programming and database design is about modeling things we find in the real world. We call the things we are modeling entities. Thus, we create virtual entities of real-world entities so we can do something useful with them.
For example, we need to store information about a person instead of referring to a loose cluster of items like name, date of birth, and address. In that case, we can consolidate these items into a "person" entity as follows:
Person(Name, Date of Birth, Address)
In the same way, an employee can be written as:
Employee(Name, Date of Birth, Address, Salary)
Or even better:
Employee(Person, Salary)
In our employee database, we identified three entities:
Department(ID, Name)
Job Title(ID, Name)
Employee(ID, Last Name, First Name, initials, Department, JobTitle)
or
Employee(ID, Last Name, First Name, initials, DepartmentID, JobTitleID)
During discussions, we may present the conceptual view Employee(ID, Last Name, First Name, initials, Department, JobTitle)
to the organization and the physical view Employee(ID, Last Name, First Name, initials, DepartmentID, JobTitleID)
to the programmers as it is more technical in that it refers to the DepartmentID and JobTitleID foreign keys and not the Department and JobTitle entities.
Abstraction
I want to talk about abstraction before we go any further. Please note that the purists in Object Oriented Programming may frown upon my use of the term abstraction here.
The little boy asked his mom, “Mom, can I ask you a question?” The mom said, “Not now, Johnny, I am busy. Go ask your dad.” The boy replied, “Mom, I don’t want to know that much about it.” And, no, I do not imply that Dad is more clever (cleverer?) than Mom. Dad would just give more information than what is required.
Similarly, we can learn much about an organization's Departments. We can learn in what building and rooms the department is located, the number of employees the department has, the head of the department, and so on.
However, sometimes we don’t want to know that much. So we deliberately ignore some things and only focus on what we need to know for our application. In our database example, we only need to know the name of the department. So we ignore or abstract all the parts we are not interested in.
In programming, we have an acronym, YAGNI, short for "You ain't gonna need it." Adding data you will never use can become costly over time. For example:
The database is growing bigger unnecessarily
The users must acquire the data to capture it, resulting in wasted time, paper, and phone calls.
The application has to perform extra work on the unnecessary data.
The more variables there are in an application, the more complex the application becomes, and the bigger the chance for things to go wrong.
Thus, it is essential to think carefully about what data to include in your databases.
Relationships
Relationships can be seen as the links between these entities, in other words, how they relate to each other. In our employee database, the Employee entity is linked to the Department and Job Title entities. The Department and Job Title entities however have no relationship with each other. The one doesn't know that the other even exists.
Relationships come in three flavors or degrees:
One-to-One relationships,
One-to-many relationships, and
Many-to-Many relationships.
One-to-One Relationships
Assume we have the following two entities:
Employee(ID, Name)
Desk(ID, Name)
We have an Employee and a Desk. The company decides that each employee will have their own desk, not to be shared or used by other employees. This ruling allows us to update our entities as follows:
Employee(ID, Name, Desk)
Desk(ID, Name, Employee)
Each employee is assigned one desk, and each desk belongs to one employee. This is referred to as a One-to-One relationship.
An Entity Relationship diagram or ER diagram will look like this:
The line between the Employee and the Desk has a single vertical line on each side.
We read it as follows:
From left to right, we read, “One employee is assigned one desk.”
From right to left, we read, “One desk is assigned to one employee.”
Do not worry about the PK (primary key) and FK (foreign key) in this diagram. We will tackle that in the next article.
One-to-Many Relationships
Let’s now look at our employee database entities:
Department(ID, Name, Employees)
Job Title(ID, Name, Employees)
Employee(ID, Last Name, First Name, initials, Department, JobTitle)
Employees:
ID | LastName | FirstName | Initials | DepartmentID | JobTitleID |
1 | Smith | John | J | 4 | 1 |
2 | King | Peter | P | 4 | 2 |
3 | Peters | Jane | JS | 4 | 3 |
4 | Jenkins | Sue | S | 4 | 3 |
5 | Smith | Judy | J | 4 | 1 |
6 | Hansen | Bob | BJ | 3 | 4 |
You can see that an employee can be assigned to only one Department, but a department can have many employees assigned to it. From the Department’s perspective, we have a One-to-Many relationship. One Department has many employees. From the employee's perspective, we have a Many-to-One relationship. Many employees can belong to one Department.
The ER diagram:
The line that connects the Employee table has three little lines and is called a crow’s foot. Don’t ask!
We read it as follows:
From left to right, we read, “One Employee is assigned to one Department.”
From right to left, we read, “One Department is assigned many Employees.
Many-to-Many Relationships
What if we have to assign employees to projects in progress? Employees can be assigned to one or more projects, and each project can have one or more employees assigned to it. This is called a Many-to-Many relationship. We can, however, not present a Many-to-Many relationship with two entities as before. We have to introduce a third entity:
Project(ID, Name, Employees)
Employee(ID, Last Name, First Name, initials, Department, JobTitle, Projects)
To materialize this conceptual model, we do the following with our employee database:
Employees:
ID | LastName | FirstName | Initials |
1 | Smith | John | J |
2 | King | Peter | P |
3 | Peters | Jane | JS |
4 | Jenkins | Sue | S |
5 | Smith | Judy | J |
6 | Hansen | Bob | BJ |
Projects:
ID | Name |
1 | Project A |
2 | Project B |
3 | Project C |
4 | Project D |
Employee Projects:
Employee ID | Project ID |
1 | 1 |
2 | 2 |
3 | 3 |
4 | 2 |
1 | 2 |
3 | 1 |
3 | 2 |
From the above entity, we can derive to following:
John Smith is working on Project A and Project B.
Peter King is working on Project B only.
Jane Peters is working on Project A, Project B, and Project C.
Sue Jenkins is working on Project B only.
Judy Smith and Bob Hansen are not working on any projects.
Project A is assigned to John Smith and Jane Peters.
Project B is assigned to John Smith, Peter King, Jane Peters, and Sue Jenkins
Project C is assigned to Jane Peters only
Project D is not assigned to anybody
And there you have it, a Many-to-Many relationship in action.
The ER diagram:
Here we have a crow’s foot on each side of the relationship, and we read it as follows:
From left to right, we read, “One Employee is assigned to many Projects.”
From right to left, we read, “One Project is assigned to many Employees.”
Cardinality
But what if we need to be more specific? This is where cardinality comes into play. Let’s refine each of these relationships.
One-to-One Relationships
We have now added a zero to the left side of the relationship and an extra vertical line to the right side of the relationship, and we read it as follows:
From left to right, we read, “One employee is assigned one and only one desk.”
From right to left, we read, “One desk is assigned to zero or one employee.”
As you can see, each Employee must have a desk, but there may be desks where nobody sits.
One-to-Many Relationships
We read it as follows:
From left to right, we read, “One Employee is assigned to one and only one Department.”
From right to left, we read, “One Department is assigned to zero, one or many Employees.”
Many-to-Many Relationships
We read it as follows:
From left to right, we read, “One Employee is assigned to many zero, one, or many Projects.”
From right to left, we read, “One Project is assigned to one or many Employees.”
Thus, not all employees are assigned to projects, but a project must have at least one employee assigned to it.
Fixed Relationships
This one is just for fun! The ER diagram above models a basketball team. We read it as follows:
From left to right, we read, “One Player is assigned to zero or one Team.”
From right to left, we read, “One Team is assigned exactly 5 Players.”
You will encounter this type of relationship infrequently as it is tricky to enforce in a relational database. There are other ways to get around the problem. We will discuss that sometime in the future.
Conclusion
We are one step further. You now know that Entities are virtual constructs we create by abstracting real-world things (also called entities). You also now know that entities have Relationships with other entities and that we have to take care when defining both entities and their respective relationships.
In the next article, we will see how these Entities and Relationships are translated into Tables and Keys, two essential constructs in relational database systems. See you there!