December 31, 2023

Learning Git Version Control

This is a guide for learning Git version control.

Getting Started With Git
Working With Remotes
Tagging With Git
Aliases in Git

Getting Started With Git

Git is a version control system. It keeps track of changes to files so you can look at previous versions and see what has changed. It is also a good way to work collaboratively. Git is a distributed version control system, which means that everyone who has cloned a repository has a full copy of the history.

Git thinks of your data as a snapshot. When you save the state of your project, it is called making a commit. I will get into this more later. Every commit is therefore a snapshot.

Git operates locally on your computer. The full history of your project is stored on your own drive, so Git does not need to reach out to other computers or over a network for most of its work. This makes it fast.

This also makes it easy to do your work. If you have to edit a file, you can do so and make your commit. Afterwards, when you find an internet connection, you can continue the process of uploading to a repository.

Almost everything you do in Git only adds data to the local database, which makes committed work hard to lose. You can still lose work that you have not committed, so make sure to commit often.

File States

In Git, your files can be in one of three states: modified, staged, and committed. Modified means a file has been changed but has not been staged or sent to the local database yet. Staged means the current version of your file has been marked to go into the database the next time you make a commit. Committed means your file has been sent to the database and a record of its changes is stored there.

File Workflow

This leads to the next major concept in working with Git: the workflow. Understanding it will make things easier later on. The workflow moves through three places: the working directory, the staging area, and the .git directory. The working directory is the current version of the project you are working on. When you decide to work on a certain version of a project, files are taken from the database and placed on your drive for you to modify.

The staging area is technically a file in your Git directory. It contains the information on what will go into your next commit. It is also called the index because of what it keeps track of.

The Git directory is where all of the information and the database are stored. This is the part that is copied when you clone a repository.
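
Here is a small sketch of one file passing through those states. It assumes a throwaway repository in a temporary directory; the file name and the identity are placeholders of mine. The short status codes show where the file is at each step.

```shell
# Scratch repository in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "first draft" > notes.txt
untracked=$(git status --short)    # "?? notes.txt": Git does not know the file yet

git add notes.txt
staged=$(git status --short)       # "A  notes.txt": in the staging area (index)

git commit -q -m "Add notes"
committed=$(git status --short)    # empty: everything is recorded in the database

echo "second draft" >> notes.txt
modified=$(git status --short)     # " M notes.txt": changed in the working directory only

printf '%s\n' "$untracked" "$staged" "$committed" "$modified"
```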

Git Configuration

If you do not know whether you have Git installed or what version you have, you can just ask it in your terminal.

git --version

Once you have Git installed, it is time to do a one-time setup. This is done through the command:

git config

This version with no parameters just shows you the options if you ever need them later. There are, however, a few specific actions you will need to do first to make Git work properly. We are going to set some details for your environment. The first is:

git config --global user.name "Jason Moore"

Just put your name in quotes there instead of mine.

Now we want to set your email variable. Do this like:

git config --global user.email [email protected]

Just put your email in there that you want to use.

You can also set your default text editor. A text editor is something like Vim, Emacs, or Nano. You will know if you use one. They are the quickest and easiest way to work with text files. You set your default like this:

git config --global core.editor vim

You can check your settings at any time. You can do so like this:

git config --list

This shows you everything you have set up, plus some extra settings that came from the system.

You can change anything you did by just entering the command in again, such as the email.
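
If you want to try this without touching your real settings, here is a minimal sketch. It assumes a scratch repository and uses --local instead of --global, so the values land in that repository's own config file. The name and addresses are placeholders.

```shell
# Scratch repository so the global configuration is left alone.
repo=$(mktemp -d)
cd "$repo"
git init -q

# --local writes to this repository's .git/config instead of the global file.
git config --local user.name "Example User"
git config --local user.email "user@example.com"

# Entering the command again simply overwrites the earlier value.
git config --local user.email "new.address@example.com"

# --get reads a single setting back.
name=$(git config --local --get user.name)
email=$(git config --local --get user.email)
echo "$name <$email>"
```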

Using Help

The help system is very useful. It is well maintained, so make sure to use it.

git help [command]

Or:

git [command] --help

Starting a Repository

The first step for most people is to take a directory they are working in and make it a repository. Doing this means Git can start tracking what happens in that directory. I am assuming you are on the command line and know how to change directories, so that the directory you want Git to track is your current directory.

For me it looks like this:

cd /home/jason/writing

Once you are in the directory that you want Git to track, type:

git init

This will create a new subdirectory in your project folder that contains all the necessary repository files. Nothing will be tracked, though, until you start adding files and commit them. Remember the workflow from earlier? You have to modify a file, stage it, and then commit it to the local database.

The first step is adding files.

You can add all files in your project directory at once to be staged by:

git add .

The period at the end means to add everything in your current directory.

You can add all files of a certain type:

git add *.py

git add *.cpp

You can add any individual file:

git add algebra_examples

Everything you decide to add to staging is ready to be committed to the database. Do so like this:

git commit -m 'first files to track'

You will see an output message summarizing the commit. This is good.
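
To see the whole sequence in one place, here is a sketch in a throwaway directory. The file names and the identity are made up for the example. It adds files by pattern, checks what is staged, and makes the first commit.

```shell
# Scratch repository (hypothetical names throughout).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo 'print("hi")' > a.py
echo 'print("bye")' > b.py
echo "todo" > notes.txt

git add *.py                                # the shell expands this to: a.py b.py
staged=$(git diff --staged --name-only)     # files that will go into the next commit

git commit -q -m "first files to track"
commits=$(git rev-list --count HEAD)        # number of commits so far
left_over=$(git status --short)             # notes.txt was never added

echo "$staged"
echo "commits: $commits"
echo "$left_over"
```

Only the two Python files end up in the commit. The text file stays untracked until you add it yourself.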

Working In Your Repository

Now you work on your files. Your system is set up and ready to track changes. So, write the next chapter in your novel or write some code. When you do a little bit of work, you should add it to staging, then commit it to your local database so it does not get lost. This is like saving a document in a program. If you have done enough to save, then you have done enough to make another commit. The principle is the same.

The next thing you want to do is check the status of your files and tracked directory. You do this with the ‘status’ command.

git status

If there is a file that Git says is untracked, that means it has not been added to the local database. You can always add it later.

git add geometry

The next time you run the status command you will not see that file labeled as untracked.

You can also track directories. If you do this, everything currently in that directory is staged and tracked by Git. I would do this by using this command:

git add writing

That will get all the files in the writing directory and start tracking them.

Ignoring Files

After you have been working on your files a while, you will eventually find that you do not want some files tracked. An example of this would be a ‘.log’ file that you will never directly work in. To do this, you can make a file called:

.gitignore

Inside, you can add things like this for Git to ignore these types of files:

*.log

*.o

*.pdf

You should get the idea. When you need this, I hope you will remember to come back here and look. It is a good idea to set this up in your project directory at the beginning so you will remember it is an option later on.
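
Here is a quick sketch of the ignore file at work, assuming a scratch repository with made-up file names. I use git check-ignore to ask Git directly whether a path matches an ignore rule.

```shell
# Scratch repository (hypothetical file names).
repo=$(mktemp -d)
cd "$repo"
git init -q

# One pattern per line, just like the list above.
printf '%s\n' '*.log' '*.o' > .gitignore

echo "noise" > debug.log
echo "text" > chapter1.txt

visible=$(git status --short)     # debug.log should not be listed here

# check-ignore exits with status 0 when the path matches an ignore rule.
if git check-ignore -q debug.log; then ignored=yes; else ignored=no; fi

echo "$visible"
echo "debug.log ignored: $ignored"
```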

Viewing Your Changed Files

Sometimes we need to know what has changed in a file, not just the name of the file. There is a good command for this and it is:

git diff

With no arguments, this command shows what you have changed but not yet staged, down to the exact lines that were added and removed. To see what you have staged for the next commit instead, run git diff --staged.
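
The difference between the two views is easier to see in a sketch. This assumes a scratch repository and a made-up file, and uses --name-only so the output is just file names.

```shell
# Scratch repository (hypothetical names).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "line one" > essay.txt
git add essay.txt
git commit -q -m "Add essay"

echo "line two" >> essay.txt
unstaged_before=$(git diff --name-only)        # essay.txt: changed, not staged

git add essay.txt
unstaged_after=$(git diff --name-only)         # empty: nothing unstaged remains
staged_after=$(git diff --staged --name-only)  # essay.txt: waiting for the next commit

# Full patch of what is staged; --no-pager keeps it from opening a pager.
git --no-pager diff --staged
```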

Committing Your File Changes

Once everything has been staged that you want, it is time to commit your file changes. The only files that will be committed to your local database are the ones you have run ‘git add’ on. You can run:

git commit

However, this method is a bit clunkier, as it will open your default editor and you will have to write the commit message there to actually make the commit. Some prefer this method and that is just fine. I think it is easier when starting out to just use:

git commit -m 'your commit message here'

This does the same thing. I think it is better for those new to Git.

A commit only records changes that were staged with the ‘git add’ command.

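You can even skip the staging area if you want to. You can do this with the ‘-a’ flag.

This will commit changes to files Git is already tracking, even if they were not staged with the ‘git add’ command. Brand-new files that have never been added are still left out. An example looks like this:

git commit -a -m 'Added a chapter to statistics study guide'

Now you get the same effect as staging those files and then committing.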
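
One thing worth seeing for yourself is that the -a flag only picks up files Git already tracks. Here is a sketch under the usual assumptions: a scratch repository, made-up file names, and a placeholder identity.

```shell
# Scratch repository (hypothetical names).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "chapter 1" > statistics.txt
git add statistics.txt
git commit -q -m "Add statistics study guide"

echo "chapter 2" >> statistics.txt     # tracked file, modified but not staged
echo "scratch" > untracked.txt         # brand-new file, never added

git commit -q -a -m "Added a chapter to statistics study guide"

remaining=$(git status --short)        # only the never-added file is left
commits=$(git rev-list --count HEAD)
echo "$remaining"
echo "commits: $commits"
```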

Untracking Files

Sometimes, you will want to stop tracking a particular file or directory. To do this, you remove the file from tracking. If I want to stop tracking my geometry file, I do this:

rm geometry

This deletes the file from your working directory, which Git sees as an unstaged change, much like a modification.

git rm geometry

This command stages the removal for the next commit.

git commit -m "this day's commit"

The geometry file will be gone and untracked once that commit is made.
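
Note that the steps above delete the file from your drive as well. If you want Git to stop tracking a file but leave it where it is, git rm has a --cached option for that. Here is a sketch of both side by side, assuming a scratch repository; the second file name is one I made up for the comparison.

```shell
# Scratch repository (hypothetical names).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "angles" > geometry
echo "equations" > algebra
git add geometry algebra
git commit -q -m "Add study files"

git rm -q geometry             # deletes the file and stages the deletion
git rm -q --cached algebra     # stops tracking, but leaves the file on disk
git commit -q -m "Stop tracking geometry and algebra"

tracked=$(git ls-files)        # files Git still tracks: none
ls
```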

Moving Files

You can easily move files within the Git system and it still keeps track of them. Git considers a file renamed when it has been moved. Remember that distinction.

So:

git mv proofs proofs_changed

git status

You will see that Git considers the file renamed.
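
Here is the same move as a runnable sketch, assuming a scratch repository and a placeholder identity. The short status output should mark the entry as a rename rather than a delete plus an add.

```shell
# Scratch repository (hypothetical names).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "QED" > proofs
git add proofs
git commit -q -m "Add proofs"

git mv proofs proofs_changed     # moves the file and stages the change in one step
status=$(git status --short)     # short status of the staged rename
tracked=$(git ls-files)          # Git now tracks the new name only

echo "$status"
echo "$tracked"
```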

Seeing Commit History

Seeing your commit history can be very useful. For example, if you have cloned someone else’s repository, then you will want to read the commit history of the project to see what has happened in it. To do this, we are going to use the command:

git log

Git will show you the commit history of the project you are in. The most recent commits are shown first. To get more detail:

git log -p

This shows you the patch difference between commits.

If you just want to see the last 3 commits then do this:

git log -p -3

This shows you the last 3 commits and the patch history.

Another useful flag is ‘--stat’, which summarizes the files changed in each commit:

git log --stat

You can also make it look nicer with the ‘--pretty’ flag:

git log --pretty=oneline

Another interesting flag is:

git log --graph

This draws a small text graph of the branch and merge history next to the commits.

We can limit the log output easily. It looks like this:

git log --since=4.weeks

There are a lot more options to do with this.
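
If you do not have a project with history handy, this sketch builds a tiny one and then tries a few of the log options on it. The repository, file, and commit messages are all made up for the example.

```shell
# Scratch repository with three small commits (hypothetical names).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

for n in 1 2 3; do
  echo "draft $n" > chapter.txt
  git add chapter.txt
  git commit -q -m "Draft $n"
done

# One line per commit, so counting lines counts commits.
all=$(git --no-pager log --pretty=oneline | wc -l | tr -d ' ')

# Limit the output to the two most recent commits; %s prints only the subject.
last_two=$(git --no-pager log --pretty=format:%s -2)

# Most recent commit with a summary of the files it touched.
git --no-pager log --stat -1

echo "total commits: $all"
echo "$last_two"
```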

Safety Features

We all make mistakes. When we do, it is nice to be able to get out of them without doing further damage. Git has a few commands for exactly that.

When we leave something out of a commit, we stage what was missing and then use the ‘--amend’ option, which replaces the last commit with a corrected one.

git commit --amend

To unstage a file we use:

git reset HEAD statistics

Then run:

git status

We can also use ‘git restore’, which is newer. This gives us another way to unstage a file.

git restore --staged statistics

Then run git status again to see the changes that Git sees.
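
Here is a sketch of both rescues in a scratch repository. The file names are placeholders, and I am assuming Git 2.23 or newer, since that is when git restore was introduced. The --no-edit flag keeps the original commit message instead of opening the editor.

```shell
# Scratch repository (hypothetical names).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "mean" > statistics
echo "median" > appendix
git add statistics
git commit -q -m "Add statistics"

# Forgot the appendix: stage it and fold it into the previous commit.
git add appendix
git commit -q --amend --no-edit
commits=$(git rev-list --count HEAD)     # still one commit, now with both files

# Stage a change, then change your mind and unstage it.
echo "mode" >> statistics
git add statistics
git restore --staged statistics          # assumes Git 2.23 or newer
after_unstage=$(git status --short)      # " M statistics": modified, not staged

echo "commits: $commits"
echo "$after_unstage"
```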

Working With Remotes

To help or collaborate with others on a project, you have to learn about Git remotes. These are versions of a repository hosted somewhere else. The repository could be yours that someone else is helping with, or you could be helping someone with their repository. You could have many remotes configured on your machine. Collaborating involves sending data to and receiving data from the remote you specify.

Seeing Your Remotes

To see the remote servers you have, you run the command:

git remote

It will list the remotes you have configured to work with. If you cloned the repository, you will see at least ‘origin’, the default name Git gives to the server you cloned from. You can see the addresses that are associated with each remote by using the command:

git remote -v

If you do not have any remotes, then I will help you get that set up. Read on.

You Need A Key

One of the first things you need to do when working with remotes is get a key to the repository. If it is not your repository, then someone else will have to allow access for you. However, if this is your repository on GitHub, GitLab, or something similar, then you need to generate a key on your local system. Then you add the public key to your GitHub or GitLab account. This will give you permission to push to and pull from the repository.

To start with, run the command:

ssh-keygen

This command will generate a key pair and put it in your .ssh folder. The .ssh folder will probably be in your home folder. Inside the .ssh folder you will see any keys you have. You are looking for a .pub file, which is the public half of the key. Use whatever editor you like to open this file and copy the contents to your clipboard.

Next, log in to your account on GitHub or GitLab and find the keys section. You will have an option to paste a key and add it to your profile. Once done, you will be able to clone, push, and pull from this repository.

Create Remote Repository

If this is your own account on something like GitLab or GitHub, then you need to create a project folder next. Initialize it with a "readme".

Now you need to clone the remote project folder to your local machine. Go into your project folder on the site. Under the "code" section you will see something like "clone with ssh". This is the address you want, since we are using the SSH keys that we generated earlier.

Linking Local Machine to Remote Repo

It is now time to clone the repository. The clone command will add a remote for you.

git clone [email protected]:UserName/FolderName.git

This will clone the remote project folder to your local machine.
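
You can watch clone set up the remote without any server at all. This sketch is a stand-in: it assumes a bare repository on your own drive plays the part of the hosted project, and the directory names are made up.

```shell
# Everything lives in a temporary directory (hypothetical names).
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the project hosted on a server.
git init -q --bare remote.git

# Cloning an empty repository prints a warning; that is expected here.
git clone -q remote.git project 2>/dev/null
cd project

remotes=$(git remote)     # clone registered the source under the name "origin"
git remote -v             # shows the fetch and push addresses for origin
```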

Adding Files

Some of this is a repeat, but it is important you get the order right at first. After you have created your remote repo and cloned that repo to your local computer, you can now create files to track with Git. I personally give my local folder the same name as the remote repo so it is easier for me to keep track of. For example, a python folder on my local machine matches a python project folder on the remote. This lets me know these two folders will be synced. You can create files immediately or copy files over into this folder now. Either way, we are ready to add files to be tracked.

git add .

git commit -m "your commit message"

Pushing To Remotes

Now you can send your new files to the remote repo you set up originally. This is called pushing to the remote. When you push to a remote, you are sending your code or other work to the remote repository. You will want to do this every so often anyway. It is your way to save and let others see where the project is currently. You use:

git push

After doing so, you should see success messages giving statistics on your files. Check your online repository to make sure your new files are there. If they are, then you are set up properly.
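
Here is the add, commit, and push cycle as a sketch you can run offline. As before, I am assuming a local bare repository stands in for the hosted one, and the names are placeholders. I use push with -u and an explicit target the first time so the example does not depend on how your Git is configured.

```shell
# Everything lives in a temporary directory (hypothetical names).
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git                 # stand-in for the hosted repository
git clone -q remote.git project 2>/dev/null   # empty-repository warning is expected
cd project
git config user.name "Example User"           # placeholder identity
git config user.email "user@example.com"      # placeholder identity

echo "hello" > README
git add .
git commit -q -m "your commit message"

# -u records origin as the upstream, so a plain "git push" works afterwards.
git push -q -u origin HEAD

# Ask the stand-in remote how many commits it now holds.
remote_commits=$(git -C ../remote.git rev-list --count --all)
echo "commits on remote: $remote_commits"
```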

Inspecting Your Remotes

You can inspect any of your remotes to get information about it.

git remote show [remote-name]

It will show you all sorts of information, like the address and the tracking branch information. It will show you all the remote references as well. If the remote repository is more complex, with a lot of users for example, then [git remote show] will give you a lot more data.

Renaming Remotes

If you want to change the nickname for a remote, it is simple to do.

git remote rename geo geometry

That changes the nickname from [geo] to [geometry]. You can check to make sure it changed to the correct name by running the remote command again.

git remote

It will list the nicknames of your remotes, including the new one.

You can also just remove a remote. You might do this if you no longer want to work with it.

git remote remove geometry

git remote

You will see an updated list of your remotes and see that [geometry] is gone.
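
To practice renaming and removing, you first need a remote to play with. This sketch assumes a local bare repository as the remote and registers it by hand with git remote add, which takes a nickname and an address. The paths are made up.

```shell
# Everything lives in a temporary directory (hypothetical names).
work=$(mktemp -d)
cd "$work"
git init -q --bare geo.git      # stand-in for a hosted repository
git init -q project
cd project

git remote add geo ../geo.git   # register the remote under the nickname "geo"
before=$(git remote)

git remote rename geo geometry
renamed=$(git remote)

git remote remove geometry
after=$(git remote)             # empty: no remotes are left

echo "before: $before, renamed: $renamed, after: [$after]"
```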

Fetching Data

When I want to download everything new from a remote to my machine, I use:

git fetch geo

The git fetch command is new here, so let me explain. If you have access to a remote, you use the fetch command to download new information from it to your local machine.

Running git fetch origin fetches any new work that has been pushed since you cloned the repository or last fetched from it. It does not merge that work into your own. You have to do that yourself when you are ready.
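
The part about fetch not merging is worth checking with your own eyes. In this sketch, two clones of a local stand-in remote play two collaborators; the names alice and bob and all paths are invented for the example. After the fetch, bob's own branch has not moved, but his copy of the remote branch has.

```shell
# Everything lives in a temporary directory (hypothetical names).
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git

# "alice" publishes the first commit.
git clone -q remote.git alice 2>/dev/null
git -C alice config user.name "Alice Example"
git -C alice config user.email "alice@example.com"
echo "v1" > alice/notes.txt
git -C alice add notes.txt
git -C alice commit -q -m "First version"
git -C alice push -q -u origin HEAD

# "bob" clones, and then alice publishes more work.
git clone -q remote.git bob
echo "v2" >> alice/notes.txt
git -C alice commit -q -a -m "Second version"
git -C alice push -q

# Fetch downloads the new commit but leaves bob's own branch where it was.
before=$(git -C bob rev-parse HEAD)
git -C bob fetch -q origin
after=$(git -C bob rev-parse HEAD)
branch=$(git -C bob symbolic-ref --short HEAD)
remote_tip=$(git -C bob rev-parse "origin/$branch")

echo "bob's branch: $after"
echo "origin/$branch: $remote_tip"
```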

Tagging With Git

Listing Tags

Tagging with Git is the ability to mark specific points of interest in your
repository's history. Usually, a tag will mark a significant change. To see
existing tags:

git tag

It just lists them in alphabetic order. You can search for tags that have some
meaning to you.

git tag -l "1.0.5"

This will list tags with this version number. You can also use wildcards to
search just like you would with anything else.
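
Here is a small sketch of a wildcard search, assuming a scratch repository with a few made-up version tags. The quotes around the pattern keep the shell from expanding the star before Git sees it.

```shell
# Scratch repository (hypothetical names).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "release notes" > CHANGES
git add CHANGES
git commit -q -m "Initial release"

git tag 1.0.1
git tag 1.0.5
git tag 2.0.0

matches=$(git tag -l "1.0.*")    # only the tags in the 1.0 series
echo "$matches"
```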

Creating Tags

There are two kinds of tags in Git. They are called lightweight and annotated. A
lightweight tag is like a branch that does not change; it is just a pointer to a
specific commit. Annotated tags are stored as full objects in the Git database. They
are checksummed and include the tagger's name, email, and date.

A lightweight tag is the commit checksum stored in a file. There is no other
information stored with it. To create a lightweight tag:

git tag "tag-name"

You can see the information on a specific tag like this:

git show "tag-name"

Creating annotated tags is just as easy. Use the -a flag when creating a tag;
this will store more information with the tag. You can also provide a message,
like with a commit, with the -m flag. So:

git tag -a 1.0.2 -m "this is the version I worked on"

Now use the "show" command to see the information related to the tag.

git show 1.0.2

It shows all the information such as tagger, date, and the message with it.
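
You can see the difference between the two kinds of tag by asking Git what type of object each name points to. This is a sketch in a scratch repository; the lightweight tag name is one I made up. I use git cat-file -t, which prints the type of an object.

```shell
# Scratch repository (hypothetical names).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "release notes" > CHANGES
git add CHANGES
git commit -q -m "Initial release"

git tag light-tag                                         # lightweight
git tag -a 1.0.2 -m "this is the version I worked on"     # annotated

light_type=$(git cat-file -t light-tag)    # "commit": just a pointer to the commit
annotated_type=$(git cat-file -t 1.0.2)    # "tag": a separate object with its own data

# Tagger, date, and message of the annotated tag, without the file diff.
git --no-pager show --no-patch 1.0.2

echo "$light_type / $annotated_type"
```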

Sharing Tags

By default, the "git push" command does not transfer tags to remote servers.
However, we can make this happen. You just have to tell the "push" command to do
so.

git push origin "tag-name"

This will push a single tag to your remote. If you want to push many tags at
once, we just use:

git push origin --tags

This will push all the tags available. In fact, it will push both lightweight
and annotated tags.

Deleting Tags

To delete a tag in your local repository, use the -d flag:

git tag -d "tag-name"

This will not remove the tag from remote servers. To delete a tag from a remote
server, use:

git push origin --delete "tag-name"
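
Here is a sketch that pushes a tag and then deletes it in both places. It assumes a local bare repository as the remote, with placeholder names. I use git ls-remote --tags to ask the remote which tags it has.

```shell
# Everything lives in a temporary directory (hypothetical names).
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git                 # stand-in for the hosted repository
git clone -q remote.git project 2>/dev/null   # empty-repository warning is expected
cd project
git config user.name "Example User"           # placeholder identity
git config user.email "user@example.com"      # placeholder identity

echo "release notes" > CHANGES
git add CHANGES
git commit -q -m "Release"
git push -q -u origin HEAD

git tag -a 1.0.2 -m "first release"
git push -q origin 1.0.2                      # tags are not pushed unless you ask

# Count the remote refs named exactly refs/tags/1.0.2.
on_remote=$(git ls-remote --tags origin | grep -c 'refs/tags/1.0.2$')

git tag -d 1.0.2                              # removes the local tag only
git push -q origin --delete 1.0.2             # removes it from the remote as well
after_delete=$(git ls-remote --tags origin)   # empty: the remote has no tags left

echo "on remote after push: $on_remote"
```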

Checking Out Tags

If you want to view the versions of files a tag is pointing to, you can use:

git checkout "tag-name"

It will say you are in "detached HEAD" state. While pretty ominous sounding, it
simply means that if you make changes and then create a commit, the tag will stay
the same. However, your new commit will not belong to any branch and will be
unreachable except by its exact commit hash. So, if you need to make any changes
to an older version, you need to create a new branch.
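
Here is one way that could look, as a sketch in a scratch repository. The branch name is hypothetical. I use git checkout -b, which creates a new branch at the current commit and switches to it, so the fix has a branch to live on.

```shell
# Scratch repository (hypothetical names).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "v1" > app.txt
git add app.txt
git commit -q -m "Version 1"
git tag 1.0.2
echo "v2" >> app.txt
git commit -q -a -m "Version 2"

git checkout -q 1.0.2                 # detached HEAD at the tagged commit
# symbolic-ref fails when HEAD does not point at a branch.
if git symbolic-ref -q HEAD >/dev/null; then detached=no; else detached=yes; fi

git checkout -q -b fix-1.0.2          # new branch starting at the tag
current=$(git symbolic-ref --short HEAD)

echo "fix" >> app.txt
git commit -q -a -m "Fix on top of 1.0.2"   # reachable through the new branch

echo "detached: $detached, now on: $current"
```

The tag itself still points at the original commit. Only the new branch moves forward.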

Aliases in Git

Aliases in Git can make your command line experience easier. Many commands are
not intuitive and it is hard to remember the important options that go with
them. By using an alias, you can encapsulate one lengthy command into a
one- or two-letter alias, whatever makes sense to you.

Here are some examples you can do from the command line:
git config --global alias.co 'checkout'
git config --global alias.br 'branch'
git config --global alias.cm 'commit -m'
git config --global alias.st 'status'

Of course, you can set the alias as anything you wish.
This means you can type:
git st
This will invoke the status command. The other aliases for commands work the
same way. Make other aliases as it seems useful for you. For anything that
seems difficult to remember, make an alias for it.

Some more aliases that are helpful for me are:
git config --global alias.p 'push'
git config --global alias.f 'fetch'
git config --global alias.lo 'log --oneline'
git config --global alias.last 'log -1 HEAD --stat'
git config --global alias.rv 'remote -v'

To see the aliases you have created, in case you forget, like me:
git config --global -l
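
If you would like to experiment before changing your global settings, here is a sketch that defines aliases for a single scratch repository only. Leaving out --global is my choice for the example; the aliases then apply just inside that repository.

```shell
# Scratch repository so the global configuration is left alone.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the alias is stored in this repository's own config.
git config alias.st 'status'
git config alias.last 'log -1 HEAD --stat'

echo "draft" > notes.txt
short=$(git st --short)          # same as: git status --short

# List only the aliases defined in this repository.
aliases=$(git config --local --get-regexp '^alias\.')

echo "$short"
echo "$aliases"
```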
