Updated date:

Database Introduction

Author:
database-introduction

Why people need databases?

Organizations hold large amounts of data, and this needs to be shared among users. For example, a university wants to record who the students are, their details, also have they paid fees, and which modules they have passed or failed, and the grade. This is a lot of data and needs to be accessed, stored, and modified correctly, and in an organized manner.

The language of choice for this type of data management is SQL. It isn’t just universities that benefit from what is known as a relational database, where connections are based on relationships between the data in the SQL database. You can have a large organization with an HR database about who is employed there, how much they are getting paid, and if they have taken any holidays.

Business traders can also use relational databases to help them keep track of their customers, the orders they have placed, and if they have been issued an invoice, and then what level of stock is to be adjusted.

Relational Databases

A relational database system has four properties. Known as ACID.

Atomic

This is where each transaction is all or nothing. If it fails it will do so completely. It won’t be partially completed.

Consistent

There can be no transaction failure in such a way that the constraints, (Primary and foreign keys) are violated.

Isolated

This means that the actions in one transaction will be invisible to another transaction.

Durable

This is when once a COMMIT operation has been successful, then the change is considered permanent.

What is SQL?

SQL stands for Structured Query Language. It is used to communicate with a database and has access to just some of the commands such as Select, Create, Insert, Update, and Delete. What is it about SQL that makes it what it is? The components for SQL are the following:

DML - Data Manipulation Language

These types of statements let you get data in and out and means you can change existing data.

DDL - Data Definition Language

This allows you to create and destroy tables. These are the fundamental structures used to store the data. It can also set up connections between tables. Making it relational. You can create rules to prevent accidental damage.

DCL - Data Control Language

This means you can set different permissions so that the users have varying levels of access to the data in the database.

TCL - Transaction Control Language

With this component, you can specify what happens when a data conflict occurs. Also, you can specify what will happen if someone changes the data while another person is reading it.

MySQL Vs MSSQL

The MySQL interpretation of the SQL language is also a relational database management system. Or abbreviated to RDBMS.

Some of the following differences are important to note between this version, and the one developed by Microsoft.

MySQL supports many languages, such as C++, Java, Perl, Haskel, and PHP. Whereas SQL Server by MS supports the visual studio suite of programming languages, such as C++, C#, and Visual Basic. It does have support for many more.

Storage space is also very different from SQL Server needing large amounts of operational storage space, compared to MySQL which needs much less.

Something that marks SQL Server as different from MySQL is query cancellation. This means SQL Server has the edge here, with the ability to cancel execution midway through a transaction of data.

With regards to backup, while MySQL is taking a backup, it will block the database. This is different from SQL Server, which still allows full read-write access.

One of the biggest factors in choosing between the two is the price, and it’s almost a no brainer for smaller developers, as MySQL is free, compared to SQL Server which is expensive.

Commands in SQL

There is the WHERE clause. This is optional, but you can use it to include any condition, which should be expressed in Boolean expression format. The condition will usually involve one or more columns of the table.

There is a conditional operator known as LIKE. This is a logical operator that will test if a string contains a special pattern or not. It would be used like:

SELECT name FROM world WHERE continent = ‘Africa’ AND name LIKE ‘H%’;

The LIKE operator has two wildcards for pattern matching. These are the percentage % which allows you to match any string of zero or more characters. There is also the underscore _ which will allow you to match a single character.

If you use the % wildcard for ‘%land’; in a continent table, you will return, Scotland, England, Poland, Ireland. Do do this you would use the following code:

SELECT name FROM world WHERE continent = ‘Europe’ AND name LIKE ‘%land’;

MySQL also allows RLIKE which is used in conjunction with regular expressions.

With the WHERE command you can also use the operators = > < >= <= <>

You can also use AND as well as OR and NOT. LIKE and IN are also to commands, and also BETWEEN.

Relationships

This describes ways in which entities are connected to each other. An example of this is when name is part of person, name lives in city, or name creates widgets.

A relation is also known as a table, and a relationship can be different from a relation.

In relationships, you have what is known as cardinality. This is when the relationship is one to many, one to one, or many to many. There are also another few terms to be aware of. This is optional when one member of the university staff leads zero or one program. Also, Mandatory, where one of the programs is led by one and only one member of the university staff.

The number of sets in a relation is called its degree. If we want to know how this applies to database tables, it's the number of columns it has. The term tuple is used to express a group of related values.

database-introduction

Data Modelling Components

Entities

These are important things in the real world that have to be modeled, such as people, places, and objects. An entity could be a book, if you were modelling a library. The library would not be an entity type.

Attributes

This is an individual item of data that is associated with an entity. Examples of this are name, national insurance number, height, date of birth.

Relationships

This describes ways in which entities are connected to each other. An example of this is when name is part of person, name lives in city, or name creates widgets.

A relation is also known as a table, and a relationship can be different from a relation.

In relationships, you have what is known as cardinality. This is when the relationship is one too many, one to one, or many to many. There are also another few terms to be aware of. This is optional when one member of the university staff leads zero or one program. Also, Mandatory, where one of the programs is led by one and only one member of the university staff.

Data Domains

In data management and database analysis, the term data domain is used to describe a collection of values that a data element contains. The domain boundary can be the data type with an enumerated list of values, containing data. For example, the data domain for the gender column in your database is either M or F.

The definition of a domain concept is as an area over which control is exercised through a relationship, and the mathematical idea of a set of values, is known as an independent variable, and the function is known as the domain of a function.

The database designer must identify a column in the tables’ domain and implement the required constraints, ensuring that only a legitimate value can be inserted into the column. This process is known as domain integrity.

database-introduction

Keys and Entity Integrity

Known as entity integrity, this is when there is the ability to uniquely identify one of the rows in a table. The key is an attribute or group known as attributes, which differentiate each of the rows. This could be, for example, serial number, country symbol, (UK, USA, DE).

The key to making a good key is making it unique, and so it doesn’t change. It should be a simple description of the data object you are storing. This can be for example date_of_birth.

Types of key

To uniquely identify data, and create working relationships with other tables, a key is used.

Superkey

This is a key that uniquely identifies a single row.

Candidate key

Is a key that uniquely identifies a row however, it contains no redundant attributes.

Primary key

This is one of the selected candidate keys.

The JOIN operation returns a combination of the columns from more than one table where a condition is true. You could have for example have two different tables, linked with a primary key to create a third table, which is an amalgamation of the two tables.

Foreign Key

This is when both the tables have a matching commonality in terms of their data columns, which one would be the primary key in the first table, and then the foreign key in the second table. This is because the second table already has a primary key, so there can be a related link via this method.

Composite key

This is when two columns are combined, or also known as when two attributes are combined. For example, if five different people have their own office, you might then have an additional person sharing an office, so the composite would be the employee name and the room number. It’s unlikely you would get two people the same name-sharing an office.

Referential Integrity

This is the accuracy and consistency of the data within a relationship. It must be possible to identify a parent record. If one of the foreign key fields has a value, the matching parent record must also exist, and be referenced as a primary key.

A DBMS will prevent you from deleting a parent record while a child record still exists. The insertion of a child record with an unknown foreign key value is also prevented by the Database Management System. The ordering of the operation is also important when you want to maintain data.

Entity-relationship modeling

This is when you first go about identifying all the objects of interest in the system you are modeling. This is almost like a brainstorming session. Your job is to identify as many entities as possible, and then later discard any that don’t meet your system requirements. You then have to remove the duplicate entities, and also ensure you are not modeling the system itself as an entity type.

You will then have to list the attributes of each entity, and ensure that they are attached to the correct entity, and don’t belong elsewhere. This then allows you to identify the primary key. The next step is defining the relationship. After this, you then describe the multiplicity of each relationship. Is it one to one, or one to many for example. And finally, you should remove any redundant relationships, in the case that they are required, or duplicate another relationship.

© 2020 Kit