SQL Explained

These are my notes on SQL explained.

Samsung Store on Amazon! If you buy something, I get a small commission and that makes it easier to keep on writing. Thank you in advance if you buy something.

 

Table of Contents

Introduction to SQL


Information is the most valuable thing in the world now. Finding it when we need
to then becomes the next most important thing. This is what databases and the
SQL language are for. In this book, I will help you get started.

Database systems manage a lot of data in many different forms. These database
systems are called relational databases. They manage the data and access to it.
There are many different engines and graphical interfaces for accessing them. At
the heart of these systems is the language SQL. While there are minor
differences in SQL among the many different projects, they are similar enough to
do a quick search to get them working, if you have any prior knowledge of SQL. 

I am going to use the PostgreSQL engine on a Linux installation. I will be using
the Fedora distribution and the command line to do everything. PostgreSQL and
Linux are not the easiest way to get started, but I feel learning these skills
will serve you better in the long-term.

PostgreSQL started at the University of California at Berkeley. It was a
relational database and opensource. 

SQL is a programming language used to query and add information to databases. It
tells a database what to do. 

Installing PostgreSQL
These instructions are for a Fedora 40 distribution of Linux.
Since there are a lot of different operating systems and ways to install things,
keep that in mind if you try to use something different than Fedora. this is
what i did to install on my Linux laptop.

So, open a terminal and type:



dnf install postgresql-server postgresql-contrib
systemctl enable postgresql
postgresql-setup --initdb --unit postgresql
systemctl start postgresql
su - postgres
psql
CREATE USER user_name WITH PASSWORD 'your_password';
CREATE DATABASE database_name OWNER user_name;
\q
psql database_name



That should do it.

Creating Your Database
Now, let us start working with some data. 
I am going to make a database to hold various kinds of information. I will start
with a library table so I can put some books into it to keep track of. Btw,
every database is made up of different tables. These tables should all hold
different kinds of information. For example, with our database, I could make a
library table, music table, blue-ray table, laptop table, and so on.



create database information;


We have created a database that can hold information. We can delete it if we
want to at some later point by using this command:



drop database information;



If you do that, it is gone. So, becareful if you decide to do it.

Creating Your Table
Now that we have a database, we need to add tables to hold and organize our
information. I am going to start with my books. I will call the table 'library'.



create table library
(
title varchar(100),
author varchar (100),
pages integer,
publish_date varchar(100),
isbn varchar(100)
);



Inserting Data
We now have a table to add books to. The above lines give the structure of how
we want the books added. We will use an insert statement to add individual books
to the table.



insert into library values
(
'Dinosaurs-The Grand Tour',
'Keiron Pim',
352,
'2016',
'978-1-61519-274-8'
);



Each of the lines with a comma at the end represent a column. This makes it easy
to read and understand what you have. When creating tables for other projects,
think about what makes sense to have in your table. 

Select Statement
If a few days have passed and you can't remember what book entry you have done,
that means you need to query your table to see what it contains. 



select * from library;



This will show you the library table and any books it contains. You can see how
it is broken up into columns.

You can also see just parts of the table if you wish.



select title from library;



This just gives you a list of titles in your table. We designed our table to
have columns of title, author, page number, date published, and isbn. So, you
can select any of those with the select statement.

Where Statement
The "where" statement gives you the ability to filter information when searching
a table. For example, we can search our table by author.



select * from library where author = 'Keiron Pim';



This allows you to search specific records and return the results. 
We can use operators with the "where" clause and this makes it even more
powerful. 



select * from library where author <> 'Keiron Pim';



This returns any book not by this author. You can then combine operators such as
"and/or" to give very specific data. 

The inserted data is stored in the table, each chunk in a column of its own,
making a record. This allows for addressing specific columns by name when
selecting the data. 

Delete Statement
We can delete records just as easy as creating them. For example, if we lose a
book, we could delete the record of that book if desired. To delete everything
in a table, type:



delete from library;



Please do not do this without thinking. Everything will be gone afterwards. 
The "delete" statement can be used to to get rid of specific records or small
groups of them. To delete books by a certain author, we would use:



delete from library where author = 'Pim';



Book records by this author will be gone after this statement. To delete a
specific book record, you would want to use a where statement with multiple
criteria such as author and page number. That would probabaly get the job done.
However, for books, you could just delete the "isbn" record since it should be
unique. 



delete from library where isbn = '978-1-61519-274-8';



Alter Statement
Another way to make records unique is to create a table with the records
numbered. This is called a primary key. Then you just use a where statement with
the appropriate key and the record is gone. To add another column we do this:



alter table library add column id integer;



Look at your table again with the "select" statement:


select * from library;



As you can see, we can use the "alter" statement to make our tables look the way
we want them. It can be done as many times as we need to. 
We can delete any columns we want just as easily:



alter table library drop column id;



This will get rid of the column. Use the "select" statement to confirm it is
gone as you intended.



select * from library;



Update Statement
We can adjust data in any column using the "update" statement. 
To get the exact record we have to use an id or multiple criteria. Let us look
at some examples.

If you have an "id" column:


update library set author = 'Moore' where id = 3;



If you want to use multiple criteria:


update library set pages = 450 where author = 'Pim' and publication-date =
'2016';




Modeling Data in SQL

When dealing with databases, we are working with files and the processes that
manage them. There are rules to follow and each database vendor has a slightly
different outlook on how you should do things. These different rules affect how
you model the data you want to work with. 

Data modeling is the first task you need to tackle. It is the process of putting
your ideas into rows, columns, and relations. There are three types of data
modeling. They are physical, logical, and conceptual. Each layer refers to a
degree of elaboration ending with a model that can be translated into sql
statements and implemented into your database system fo chocie. 

Conceptual Modeling
Conceptual modeling deals with the highest level of abstractions such as
entities and relationships. Entities refer to the actual physical objects or
abstract concepts in your requirements. If you are modeling dats for a library,
you might think of books, authors, borrowers, and librarians. If you are
creating a data model for a bank, your entities would be accounts, clients,
and money. Relationships model the way these entities interact. Relationships
are defined through primary and foreign keys.

Logical Modeling
This is the next step in data modeling. Here you begin by thinking of
attributes. The trick is to distinguish between attributes that you want to
track as opposed to those that do not matter. 

Physical Modeling
Physical modeling is the final stage. This is where your abstract ideas become
something real. You are getting ready to translate your ideas into scripts and
implement them in an actual database system. This is the stage where you define
the names of your tables, names for columns, and most importantly, your data
types. 

Data Types
Database systems were created to store information. Data types help to tune
database performance. For example, xml data can be represented as text. It is
text but text processing is not optimized for the structured nature of xml
documents. With the exception of desktop database systems, every single database
system vendor decided to implement a specific xml data type to address the
issue. 

Another reason for data types' existence is their role in enforcing domain
integrity. This refers to the ability of a specific data type to enforce
constraints. For example, with a "date" data type, it might be possible to enter
a date that also meant something else. Before we can enter the data into our
relational database, we have to break the data into pieces of specific data
types such as characters, numbers, and pictures. This information is
reconstituted by a client application later on.

Character Data
All character strings in sql can be of fixed length or varying length. A
character string can be defined as a sequence of characters that belong to a
predefined character set. A character set is the language your database stores.
You might remember specifying collation order or locale during your database
system setup process. The length of the string is the number of characters in
the sequence. This is where internal representation matters. A character can be
represented by one or more bytes with latin-based languages. Everyone else in
the world, just about, cannot.

A byte is a computetr term for a unit of information storage that consists of 8
bits. Each bit can be either 1 or 0. The combination of 8 bits allows us to
store 256 distinct values, which form the foundation of ASCII character sets.
While there were attempts to remedy the situation with extended code pages,
ultimately the solution came with the introduction of Unicode. It is a standard
doub;e byte character set that assigns a unique number to every single
character, so it can represent many more characters than ASCII. The Unicode
standard is the result of development coordinated by a nonprofit organizaiton
called the Unicode consortium. 

Sql thinks in characters but computers count bits and bytes. If you define a
string to be of fixed length, the system allocates a certain number of bytes in
memory or on a computer hard disk. If your character set is latin based, a total
of 10 bytes will be allocated for the string, but if you use a double byte
character set, such as chinese, 20 bytes will have to be allocated. It does not
matter whether the actual value to be stored in that string is exactly that many
bytes or not, it will occupy the whole space that is allocated to it. So, all
string will have exactly the same length.

If you define a string variable as a varying length string with a maximum of 10
characters to store, the behavior will be different. The actual memory or disk
space required to hold the value will be allocated dynamically as needed. Only
strings that are 10 characters long will have all 10 characters allocated to
them, but if you have a string that is only 2 characters, only 2 bytes of
storage will be allocated. 

As you can imagine, this flexibility comes with a performance hit because the
database system must perform the additional task of dynamic allocations. A
standard piece of advice when you need to squeeze the last drop of performance
out of the database is to profile your data and allocated fixed length strings
for values that always come in predefined lengths. Just don't forget to have
checks in place for your inserts. An attempt to insert 11 characters into a
field defined as char(10) would result in an error. Also, when comparing two
character strings, variable length strings would not care much about trailing
blanks but fixed length strings would.

Binary Strings
A binary string is a sequence of bytes in the same way that a character string
is a sequence of characters, but unlike character strings that usually contain
information in the form of text, a binary string is used to hold nontraditional
data such as images, audio, and video files. Binary strings can be used for
purposes similar to those of character strings, but the two data types are not
compatible. The difference is like text and a photo of the same text. 

It might be a little confusing to learn that plain text documents can be stored
as character strings and a Word document has to be treated as a binary string. A
Word or Pdf file is a text document from a user's point of view, but from a
computer storage perspective, it is not. In addition to plain text characters,
it contains many special markers and instructions that only those programs can
interpret. The same is true for any other special files such as bitmaps.
spreadsheets, audio files, and video files. 

Because a Pdf is a binary file, it also contains information about fonts,
positioning, coloring, and other formatting. This information is a set of
instructions that the host program understands and can interpret to display
human readable text and any other objects that might be embedded into the
document.

Numeric Data
After the characters come numbers. Exact numbers can either be whole integers or
have decimal points. Numbers can be positive and negative. They can have
precision and scale. Precision determines the maximum total number of decimal
digits that can be stored. Scale specifies the maximum number of decimals
allowed. 

Approximate numbers are numbers that cannot be represented with absolute
precision. Each numeric data type has limits, a range of values that it can
represent which is pretty consistent across systems. The value ranges for some
common numeric data types. 

Most of the time, numbers are hidden in the database. Once in a while, you might
need to use numbers as a value inserted into your query, and databases can be
very particular in what format you supply them. Literals for numbers are
represented as strings, optionally preceded by plus or minus signs, with an
optional decimal part for numeric or decimal data types separated by a dot.

Time Data Types
Handling dates and time is probably one of the most confusing and inconsistent
topics in sql. Partly them comes from the inconsistency of the human devised
system of tracking days and times. The date data type behaves differently from
implementation to implementation. Some system have separate keywords and others
comebine date and times into one keyword for queries. 

The dates and times get into the database as literals as a return result from a
function or as a conversion. The database systems have implemented a number of
sql functions to help handle this peculiar data type. While it might be apparent
to humans that some literal strings are veritable dates, computers have no such
insight and try to treat anything as a date. Once dates are in the database, you
need to take extra care manipulating them. For example, if you compare two dates
that have the same day, month, and year component but differ in time they will
be evaluated as not equal. Be sure to compare apples to apples. There are many
date and time related functions to help you compare values, extract information,
and even do date arithmetic.

Binaery Data
Binary data are for computers to understand and interpret so that humans can
understand. Prime examples of binary data are pictures and only your image
editor knows how to arrange these ones and zeros into a picture. There is not
much you can do with binary data in sql besides storing it and retrieving it on
demand. To store binary data, database systems have introduced a number of
binary data types.