Sunday, January 13, 2013

Databases (Course / Tutorial), Part 1 - Relational Model

This is the first article of a series of where I will explain the basics of Relational Databases. In this series I will cover Relational Database theory, XML, JSON, SQL, Querying, UML, Indexes, Views, Transactions, Constrains and Triggers, Authorization, Recursion in SQL, OLAP, and NoSQL Systems.

There will be a small section for Relational Algebra, which I consider you can skip. However, I highly recommend to go through it, it helps you to structure your relational analysis when designing a relation model.

There is no need to know a programming language. However, it would be best if you know one. I won't be providing any instructions on any language since this series deals only with databases. At some point I may use Python or Java to illustrate a db real life implementation, but the usage of the programming language is not required to understand the essence of the database theory.

If you have any question feel free to post them on the comments. I will try to answer them to the best of my knowledge, at the same time, I invite you to solve any questions that you may know the answer to in the comments.

Ok, so let's get started.

Introduction

Todays digital world is data driven. While programs are useful to manipulate, gather, analyze and present raw data in a way that it makes sense. Databases run in the background making the our interactions with websites, software, even shopping experience at the supermarket easier.

So in short we can think of databases as useful for:

  • Handling small and/or massive
  • Store data in a way that it outlives the programs that executes on that data. This means that data is persistent
  • Managing data safely following ACID (Atomicity, Consistency, Isolation, Durability). This means that databases should be:
    • Concurrent (multiple users)
    • Reliable (99.99999% up time)
    • Speed/Efficiency to query data
    • Transactions atomicity (i.e that data is process fully not partially)
It is important to know that Data-Intensive applications are not always using Database systems. Examples of data not stored in DB's is files like spreadsheets.

Relational Model Terminology

Database: Set of named Tables(or Relations)
Table/Relation: has a set of named Columns(or attributes) 
Row/Tuple: has a value/data for each Column(or attribute)
Column/Attribute: has a type(or domain)

Schema: Structure (description) of the relations in the database
Instance: Actual contents of the Table(Relation)

key: is an attribute whose values/fields should be unique for each Row/Tuple. The key values/fields uniquely identify a Row.
Null: value for undefined or unknown fields/value in the database.

Querying Relational Databases

To start I will not include any code for how to query a DB. Let's just focus on idea of how queries take place in a DB System.







To Query data we need:
1. Design (Scheme) 
2. Initial Data
3. Query language like SQL

When we perform a query we get a new relation. This makes it possible to query an initial query, that is known as composition.

Query Languages:
  • Relational Algebra - formal: in the example below we are querying employees id's that have a salary greater than 3500 and that have a job title of 'Assistant' to determine a promotion. This creates a natural join (⋈ ) between employee and promotion.
π (id) = σ Salary > 3500 ^ (job_title ="Assistant" (employee ⋈ promotion)
  • SQL = actual/implemented: The same query for the employee id for promotion in SQL would be:
 SELECT Employee.ID  
 FROM Employee, Promotion  
 WHERE Employee.ID = Promotion.ID  
 AND Salary>3500 and title="Assistant"  

Conclusion

From the above there might be a lot that seems confusing at first. Don't be dis-encouraged by the complexity of the Algebraic expressions used for the query. In real life applications you will not use them unless your are working on creating a database. 

From this first post, you learn the terminology and how an SQL looks like. You now know that a Relational Database is nothing more than a group of tables that are related with other tables. Later in this series you will see how relationships are established by using the Keys as the means to create relationships.

Links:

Additional to what I present here, I recommend the following link:

This video may be a little more graphic on the concepts that introduced here:

No comments:

Post a Comment