Managing Files In A Linux Environment
This article gives you a foundation for managing files in a Linux environment.
Introduction
Linux organizes files in a single tree that starts at the root directory, written as /. We reach every file and directory by a path through that tree.
Prerequisites
You will need a working Linux system to follow along. I recommend doing this as it helps you learn and get more comfortable with the Linux file system.
Filepaths
Paths are absolute or relative. Absolute paths begin with a / and are resolved from the root, while relative paths are resolved from your current directory. We separate each directory in a path by a /. You can always tell where you are in a couple of different ways: print the $PWD variable, or run the ‘pwd’ command in your terminal.
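Here is a small sketch of the difference between relative and absolute paths. The directory names ('projects/cpp') and the variable names are made up for the example, and everything happens in a throwaway directory from 'mktemp -d'.

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"
mkdir -p projects/cpp      # hypothetical directory names

# 'pwd' and $PWD both report the current directory as an absolute path.
pwd
echo "$PWD"

# Relative path: resolved from the current directory.
cd projects/cpp
rel_result=$PWD

# Absolute path: starts with / and works from anywhere.
cd /
cd "$demo/projects/cpp"
abs_result=$PWD

# Both routes end up in the same place.
echo "$rel_result"
echo "$abs_result"
```

Both 'echo' lines at the end should print the same absolute path.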
Listing File Details
Information about a file is recorded in an inode. An inode holds a lot of metadata about the file, including the owner, timestamps, size, and access rights. When you use the ‘ls’ command with the ‘-l’ option, it shows you some of this.
ls -l
total 231
-rw-r--r-- 1 Jason 197121 9 Jul 14 18:04 README.md
-rw-r--r-- 1 Jason 197121 64 Jul 14 18:04 'compiling and running'
-rw-r--r-- 1 Jason 197121 160 Jul 14 22:15 variables1.cpp
-rwxr-xr-x 1 Jason 197121 44458 Jul 14 22:15 variables1.exe*
-rw-r--r-- 1 Jason 197121 170 Jul 14 22:21 variables2.cpp
-rwxr-xr-x 1 Jason 197121 44599 Jul 14 22:21 variables2.exe*
-rw-r--r-- 1 Jason 197121 415 Jul 14 22:38 variables3.cpp
-rwxr-xr-x 1 Jason 197121 45334 Jul 14 22:38 variables3.exe*
-rw-r--r-- 1 Jason 197121 175 Jul 14 22:43 variables4.cpp
-rwxr-xr-x 1 Jason 197121 44511 Jul 14 22:43 variables4.exe*
-rw-r--r-- 1 Jason 197121 373 Jul 14 22:54 variables5.cpp
-rwxr-xr-x 1 Jason 197121 44934 Jul 14 22:54 variables5.exe*
You can see the '-l' option gives you a lot more useful information. It is also presented in a nicer way.
The ‘ls’ command does not list hidden files unless you ask it to. Hidden files and directories are the ones whose names start with a dot. You use the ‘-a’ option with ‘ls’ to see them. To get the best view of files, use ‘ls -al’ in your terminal. This will show you all the files and give you lots of information about each.
ls -al
total 243
drwxr-xr-x 1 Jason 197121 0 Jul 14 22:54 ./
drwxr-xr-x 1 Jason 197121 0 Jul 14 18:04 ../
drwxr-xr-x 1 Jason 197121 0 Jul 14 18:04 .git/
-rw-r--r-- 1 Jason 197121 9 Jul 14 18:04 README.md
-rw-r--r-- 1 Jason 197121 64 Jul 14 18:04 'compiling and running'
-rw-r--r-- 1 Jason 197121 160 Jul 14 22:15 variables1.cpp
-rwxr-xr-x 1 Jason 197121 44458 Jul 14 22:15 variables1.exe*
-rw-r--r-- 1 Jason 197121 170 Jul 14 22:21 variables2.cpp
-rwxr-xr-x 1 Jason 197121 44599 Jul 14 22:21 variables2.exe*
-rw-r--r-- 1 Jason 197121 415 Jul 14 22:38 variables3.cpp
-rwxr-xr-x 1 Jason 197121 45334 Jul 14 22:38 variables3.exe*
-rw-r--r-- 1 Jason 197121 175 Jul 14 22:43 variables4.cpp
-rwxr-xr-x 1 Jason 197121 44511 Jul 14 22:43 variables4.exe*
-rw-r--r-- 1 Jason 197121 373 Jul 14 22:54 variables5.cpp
-rwxr-xr-x 1 Jason 197121 44934 Jul 14 22:54 variables5.exe*
You can see the hidden directories at the top. Git is tracking this directory so we see our usual '.git' directory there.
Another good option to use when looking at directory structure is the '-h' option. This is for human readable format. The rest of the listing stays the same, but file sizes are shown in units such as K and M instead of raw byte counts, which makes them easier to read.
ls -lh
total 231K
-rw-r--r-- 1 Jason 197121 9 Jul 14 18:04 README.md
-rw-r--r-- 1 Jason 197121 64 Jul 14 18:04 'compiling and running'
-rw-r--r-- 1 Jason 197121 160 Jul 14 22:15 variables1.cpp
-rwxr-xr-x 1 Jason 197121 44K Jul 14 22:15 variables1.exe*
-rw-r--r-- 1 Jason 197121 170 Jul 14 22:21 variables2.cpp
-rwxr-xr-x 1 Jason 197121 44K Jul 14 22:21 variables2.exe*
-rw-r--r-- 1 Jason 197121 415 Jul 14 22:38 variables3.cpp
-rwxr-xr-x 1 Jason 197121 45K Jul 14 22:38 variables3.exe*
-rw-r--r-- 1 Jason 197121 175 Jul 14 22:43 variables4.cpp
-rwxr-xr-x 1 Jason 197121 44K Jul 14 22:43 variables4.exe*
-rw-r--r-- 1 Jason 197121 373 Jul 14 22:54 variables5.cpp
-rwxr-xr-x 1 Jason 197121 44K Jul 14 22:54 variables5.exe*
You will notice that most of these listings show the same basic columns over and over. That is the long format of the 'ls' command.
From left to right, this listing shows you these attributes:
- file type and permissions
- number of links to the file
- file’s owner
- owner’s group
- size of file in bytes
- time stamp of the last modification
- name of file or directory
Another option that is useful is ‘-i’ along with ‘ls’. This will give you the inode number of each file.
ls -li
total 231
104427216359722182 -rw-r--r-- 1 Jason 197121 9 Jul 14 18:04 README.md
24206847997189317 -rw-r--r-- 1 Jason 197121 64 Jul 14 18:04 'compiling and running'
9851624184970397 -rw-r--r-- 1 Jason 197121 160 Jul 14 22:15 variables1.cpp
3659174697337710 -rwxr-xr-x 1 Jason 197121 44458 Jul 14 22:15 variables1.exe*
17169973579462677 -rw-r--r-- 1 Jason 197121 170 Jul 14 22:21 variables2.cpp
8444249301432886 -rwxr-xr-x 1 Jason 197121 44599 Jul 14 22:21 variables2.exe*
56576470318862464 -rw-r--r-- 1 Jason 197121 415 Jul 14 22:38 variables3.cpp
2533274790509358 -rwxr-xr-x 1 Jason 197121 45334 Jul 14 22:38 variables3.exe*
5348024557584218 -rw-r--r-- 1 Jason 197121 175 Jul 14 22:43 variables4.cpp
7036874417879868 -rwxr-xr-x 1 Jason 197121 44511 Jul 14 22:43 variables4.exe*
27866022694408530 -rw-r--r-- 1 Jason 197121 373 Jul 14 22:54 variables5.cpp
13510798882225059 -rwxr-xr-x 1 Jason 197121 44934 Jul 14 22:54 variables5.exe*
The inode number is the first column on the far left. This can be very useful, for example when you want to check whether two names point to the same file.
You can list directories and their child directories by using ‘ls -R’. This walks down the tree and lists the contents of every directory it finds, files included.
The last example for 'ls' I am going to show is another I like a lot. It is the '-S' option, which sorts by size. This will list files from largest to smallest. I use this one a lot when organizing.
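A quick way to see what '-R' prints is to build a tiny tree and list it. This is only a sketch, and the directory and file names are invented for the example.

```shell
# Build a small tree in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"
mkdir -p music/albums                            # hypothetical names
touch music/track1.txt music/albums/track2.txt

# -R descends into every directory and prints a section for each one.
listing=$(ls -R)
echo "$listing"
```

Each directory gets its own heading ending in a colon, followed by everything inside it.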
ls -lS
total 231
-rwxr-xr-x 1 Jason 197121 45334 Jul 14 22:38 variables3.exe*
-rwxr-xr-x 1 Jason 197121 44934 Jul 14 22:54 variables5.exe*
-rwxr-xr-x 1 Jason 197121 44599 Jul 14 22:21 variables2.exe*
-rwxr-xr-x 1 Jason 197121 44511 Jul 14 22:43 variables4.exe*
-rwxr-xr-x 1 Jason 197121 44458 Jul 14 22:15 variables1.exe*
-rw-r--r-- 1 Jason 197121 415 Jul 14 22:38 variables3.cpp
-rw-r--r-- 1 Jason 197121 373 Jul 14 22:54 variables5.cpp
-rw-r--r-- 1 Jason 197121 175 Jul 14 22:43 variables4.cpp
-rw-r--r-- 1 Jason 197121 170 Jul 14 22:21 variables2.cpp
-rw-r--r-- 1 Jason 197121 160 Jul 14 22:15 variables1.cpp
-rw-r--r-- 1 Jason 197121 64 Jul 14 18:04 'compiling and running'
-rw-r--r-- 1 Jason 197121 9 Jul 14 18:04 README.md
Sorting Output
You have some options in how ‘ls’ orders its output. Alphabetical order is the default. This is easy to change, however. To sort by modification time, newest first, use ‘ls -t’, which is quite useful. You can add the ‘-r’ option to reverse any of these orderings, which gives you a lot of flexibility.
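To make the sort options concrete, here is a small sketch with two files whose sizes and time stamps are set by hand. The file names and dates are made up; 'touch -t' is only there so the order is predictable.

```shell
demo=$(mktemp -d)
cd "$demo"

# Two files with known sizes and known modification times.
printf 'aaaaaaaaaa' > big.txt        # 10 bytes
printf 'a' > small.txt               # 1 byte
touch -t 202001010000 big.txt        # older time stamp
touch -t 202101010000 small.txt      # newer time stamp

newest_first=$(ls -t)     # sort by modification time, newest first
oldest_first=$(ls -tr)    # same sort, reversed
largest_first=$(ls -S)    # sort by size, largest first

echo "$newest_first"
echo "$oldest_first"
echo "$largest_first"
```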
Copying Files
Copying files is a task that happens a lot. We use the ‘cp’ command for that. It can make copies of files or directories. Copying multiple files or directories at a time is also a simple task. To use ‘cp’, you have to give one or more sources and a target. The source only needs a path if it is not in your current directory.
$ cp file1 dir1
If your target is a directory, then your sources will be put inside it.
Copying a directory and everything inside it is a simple task too. You use ‘cp -R’ to do this. However, you can’t copy a directory into itself.
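Here is a short sketch of the common 'cp' cases in a scratch directory. All the file and directory names are placeholders I picked for the example.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir dir1 src
echo "hello" > file1
echo "nested" > src/inner.txt

cp file1 dir1          # target is a directory: creates dir1/file1
cp file1 file1.bak     # target is a new name: creates a copy beside the original
cp -R src src_copy     # src_copy does not exist yet, so the copy gets that name
cp -R src dir1         # dir1 exists, so the copy lands at dir1/src

ls -R
```

Notice that 'cp -R' behaves differently depending on whether the target directory already exists.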
Moving Files
I do this with the ‘mv’ command. It is used to move or rename files and directories. It mostly follows the same rules as the ‘cp’ command. The behavior is about the same as copying a file and then deleting the original.
mv file1 dir2
This simple example just moves 'file1' into the directory 'dir2'. This was all in the same folder. However, just add the paths to both file and directory if you are not in the same directory.
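Since 'mv' both moves and renames, a small sketch of each case may help. The names here are hypothetical and everything runs in a temporary directory.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir dir2
echo "data" > file1
echo "draft" > notes.txt

mv file1 dir2                  # move: file1 now lives at dir2/file1
mv notes.txt notes-final.txt   # rename: same directory, new name
mv dir2 archive                # directories are renamed the same way

ls -R
```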
Removing Files
You remove files with the ‘rm’ command. It can remove files or directories and multiples at a time. You should always be careful about doing this and make backups, just in case. The basic usage is this.
rm filename
Many stories have populated the internet of people who have used ‘rm’ to delete system files and killed their system. It happened because they did not truly understand what they were deleting.
$ ls -l
total 0
-rw-r--r-- 1 Jason 197121 0 Jul 16 08:13 file1
Jason@home-jahlelin MINGW64 ~/Documents/atom/cpp/dir1 (main)
$ rm file1
Jason@home-jahlelin MINGW64 ~/Documents/atom/cpp/dir1 (main)
$ ls -l
total 0
This is nice and simple but again, add paths if you need to. Deleting directories is done like this:
$ rm -R dir1
Without the '-R' option, 'rm' refuses to remove a directory at all. With it, 'rm' removes the directory and everything inside it, so double check the path before you press enter. The other options are in the 'man' pages. I'll let you look there for them.
The ‘-i’ option is a nice safeguard and you should use it. It asks for confirmation before each removal. You could set up an alias for ‘rm’ equaling ‘rm -i’ which would be an extra precaution.
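To show what the confirmation does without needing a keyboard, this sketch pipes the answer into 'rm -i'. In normal use you would type y or n yourself. The file names are invented, and the alias line is only shown as a comment because aliases belong in an interactive shell's startup file.

```shell
demo=$(mktemp -d)
cd "$demo"
touch keep.txt discard.txt

# Answering "n" leaves the file alone.
printf 'n\n' | rm -i keep.txt

# Answering "y" removes it.
printf 'y\n' | rm -i discard.txt

ls

# In an interactive shell you could add this to ~/.bashrc (assumption: bash):
#   alias rm='rm -i'
```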
Making Directories
You create directories through the ‘mkdir’ command. It could be something as simple as, ‘mkdir music’. Create several directories at once by using ‘mkdir music videos’. You can see newly created directories by following up with an ‘ls’ command to make sure it happened correctly.
mkdir music
Jason@home-jahlelin MINGW64 ~/Documents/atom/cpp (main)
$ ls -l
total 231
-rw-r--r-- 1 Jason 197121 9 Jul 14 18:04 README.md
-rw-r--r-- 1 Jason 197121 64 Jul 14 18:04 'compiling and running'
drwxr-xr-x 1 Jason 197121 0 Jul 16 08:22 dir1/
drwxr-xr-x 1 Jason 197121 0 Jul 16 08:17 dir2/
drwxr-xr-x 1 Jason 197121 0 Jul 16 08:28 music/
-rw-r--r-- 1 Jason 197121 160 Jul 14 22:15 variables1.cpp
-rwxr-xr-x 1 Jason 197121 44458 Jul 14 22:15 variables1.exe*
-rw-r--r-- 1 Jason 197121 170 Jul 14 22:21 variables2.cpp
-rwxr-xr-x 1 Jason 197121 44599 Jul 14 22:21 variables2.exe*
-rw-r--r-- 1 Jason 197121 415 Jul 14 22:38 variables3.cpp
-rwxr-xr-x 1 Jason 197121 45334 Jul 14 22:38 variables3.exe*
-rw-r--r-- 1 Jason 197121 175 Jul 14 22:43 variables4.cpp
-rwxr-xr-x 1 Jason 197121 44511 Jul 14 22:43 variables4.exe*
-rw-r--r-- 1 Jason 197121 373 Jul 14 22:54 variables5.cpp
-rwxr-xr-x 1 Jason 197121 44934 Jul 14 22:54 variables5.exe*
Make nested directories by using ‘mkdir -p music/albums/BigBang’. This creates the BigBang directory along with any parent directories that do not exist yet.
$ mkdir -p music/albums/BigBang
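The following sketch puts the 'mkdir' variations side by side in a scratch directory. It reuses the directory names from the text and also shows what happens when a directory already exists.

```shell
demo=$(mktemp -d)
cd "$demo"

mkdir music videos               # several directories in one command
mkdir -p music/albums/BigBang    # creates albums and BigBang under music
mkdir -p music/albums/BigBang    # with -p, an existing directory is not an error

# Without -p, an existing directory is an error.
mkdir music 2>/dev/null || echo "mkdir without -p fails: music already exists"

find . -type d | sort
```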
Removing Directories
Removing directories is easy with the ‘rmdir’ command. It also has the ‘-p’ option, which removes the named directory and then its parents as they become empty. Any command that removes things must be used with caution. Removing directories and their files can be dangerous so make backups please. With ‘rmdir’, directories have to be empty before you can remove them.
rmdir dir1
Removing directories can be done recursively as well. The ‘rmdir’ command has no recursive option, so I do this by using the ‘rm -R’ command. Again, please be careful doing this. You shouldn’t have to do this very often, if at all. I will hesitantly add the ‘-f’ option to ‘rm’ to force things through without prompting. I do this if a system needs cleaning and I don’t own all the files.
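Here is a sketch that contrasts 'rmdir' with 'rm -R' on empty and non-empty directories. The directory names are placeholders, and it all runs inside a temporary directory so nothing important is at risk.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p empty full/sub a/b/c
touch full/sub/file1

rmdir empty                      # works: the directory is empty

# rmdir refuses to remove a directory that still has contents.
rmdir full 2>/dev/null || echo "rmdir refuses: full is not empty"

rmdir -p a/b/c                   # removes c, then b, then a, as each becomes empty

rm -R full                       # removes full and everything inside it

ls -A
```

After this runs, the scratch directory should be empty again.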
Touching Files
The touch command is a multi-purpose one. It can update file access times, update modification times, create empty files, and set specific time stamps. In this first example, I create an empty file.
$ touch SuperJunior
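Use ‘touch’ to update the time stamps of a file. I do this by using ‘touch’ with the filename as a parameter. This sets both the access time and the modification time to the current time. Add ‘-a’ to change only the access time or ‘-m’ to change only the modification time.
$ touch -m SuperJunior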
If the file you use as a parameter does not exist, then ‘touch’ creates the file for you. If you don’t want it to create the file for you, you can give it the ‘-c’ option. This will tell it to not create an empty file for you. It looks like ‘touch -c file1’.
$ touch -c SuperJunior
You can use ‘touch’ to set the modification time of a file to anything you want. It uses the format ‘[[CC]YY]MMDDhhmm[.ss]’, where the parts in brackets are optional. So, you can be very precise when setting it. The command looks like ‘touch -t YYYYMMDDhhmm.ss filename’.
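As a sketch of the 'touch' options together, this example gives one file an old time stamp, creates another with the current time, and confirms that '-c' does not create anything. The file names and the date are made up, and the '-nt' ("newer than") test is a shell feature I am assuming is available, as it is in bash.

```shell
demo=$(mktemp -d)
cd "$demo"

touch -t 202001011230.45 old.txt   # created with a time stamp in January 2020
touch new.txt                      # created with the current time
touch -c missing.txt               # -c: do nothing if the file does not exist

# -nt compares modification times.
if [ new.txt -nt old.txt ]; then
    echo "new.txt is newer than old.txt"
fi

ls -l
```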
Using Find
You use ‘find’ to be very precise in your search. You can specify several criteria. These include name, type, time stamp, owner, and size. The choice is up to you. When you search by name, the pattern can match all or part of a name. You can also make use of wildcards in your search patterns, as long as you quote them so the shell does not expand them first.
Jason@home-jahlelin MINGW64 ~/Documents/atom (master)
$ find . -name variables1.cpp
./cpp/variables1.cpp
Find will show you the path of the file you're looking for. You can also look for directories.
$ find / -type d -name cpp
Searching for size is easy and has several options. Finding a file that is around a certain size is doable. Put a ‘+’ in front of the number to find files above that size or a ‘-’ to find files below it, and combine both to set lower and upper bounds. When using ‘-size’, the unit is a suffix on the number: ‘c’ is for bytes and ‘k’ is for kilobytes. Sizes are rounded up to the unit you pick. Then, you can also search for empty files like this ‘find . -size 0’.
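This sketch runs a few 'find' searches against files with known sizes. The file names are invented for the example. The '-empty' test is an extension found in GNU and BSD 'find' rather than something every 'find' is guaranteed to have.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir cpp
printf 'int main() { return 0; }\n' > cpp/variables1.cpp   # a small file
: > cpp/empty.log                                          # an empty file
head -c 4096 /dev/zero > cpp/big.bin                       # a 4096-byte file

by_name=$(find . -type f -name '*.cpp')    # quoted wildcard in the name
empty=$(find . -type f -empty)             # files with no content
large=$(find . -type f -size +1k)          # larger than 1 kilobyte
small=$(find . -type f -size -100c)        # smaller than 100 bytes

echo "by name: $by_name"
echo "empty:   $empty"
echo "large:   $large"
echo "small:   $small"
```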
Get File Details
To get file details, you use the ‘file’ command. Linux does not rely on a file's extension to identify what it is, so this command is a useful way to find out. That matters because you need to know what program to use to open the file.
$ file music
music: directory
The ‘file’ command performs a few tests in order to accomplish this. Basically, it checks to see if the file is empty and what data is inside.
Now, let us look at a file to see what it is all about.
$ file variables1.cpp
variables1.cpp: C source, ASCII text, with CRLF line terminators
That tells us a lot of information. Most importantly, it tells us what kind of file that is. I gave this file an extension, but if it did not have one, this information would be vital to knowing what kind of file it is. I am pretty forgetful, so this helps me a lot.
Compressing Files
Compressing and decompressing files is a necessary skill in any operating system. Compressing files is helpful any time you want to back up or send files some place. One of the best utilities for compressing and decompressing files is ‘gzip’.
Text files are the easiest files to compress and shrink the most. Binary and image files that are already compressed shrink much less, if at all. Use the ‘gzip filename’ command to compress the file, which replaces it with ‘filename.gz’. Then you can use ‘gunzip filename.gz’ to decompress it.
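To see the effect on a text file, this sketch compresses a deliberately repetitive file and then restores it. The file names are placeholders, and the exact sizes will depend on your 'gzip' version, so I only compare before and after.

```shell
demo=$(mktemp -d)
cd "$demo"

# Build a repetitive text file, which compresses very well.
i=0
while [ "$i" -lt 1000 ]; do
    echo "the same line of text"
    i=$((i + 1))
done > notes.txt
cp notes.txt original.txt        # keep a copy to compare against later

before=$(wc -c < notes.txt)
gzip notes.txt                   # replaces notes.txt with notes.txt.gz
after=$(wc -c < notes.txt.gz)
echo "before: $before bytes, after: $after bytes"

gunzip notes.txt.gz              # brings notes.txt back
cmp notes.txt original.txt && echo "round trip OK"
```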
File Archival
Files, directories, and file systems all need to be archived. There are two major commands that get this job done. They are the ‘tar’ and ‘dd’ commands.
One of the primary purposes of archival tools is performing backups, which is what I want to discuss. There are three main types of backups: incremental, differential, and full.
An incremental backup is the changes since the last backup of any kind, full or incremental. Recovering from a disaster requires the last full backup and all the subsequent incremental backups.
A differential backup is all the changes since the last full backup. Recovery for these needs the last full backup plus the latest differential.
A full backup is everything. It is usually of a complete filesystem or all the user files. It takes the longest to create since it will have the most data, but it is the simplest to recover from because it needs nothing else.
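One rough way to picture a differential backup is to archive only the files that changed after a marker left behind by the last full backup. This is just a sketch under my own assumptions, not a complete backup scheme: the marker file 'last_full.stamp', the 'data' directory, and the dates are all hypothetical, and it relies on the '-newer' test of 'find' and the '-T' option of 'tar' for reading file names from a list.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir data
echo "v1" > data/report.txt
touch -t 202001010000 data/report.txt    # pretend this file is old

# Full backup, plus a marker file recording when it was taken (pretend date).
tar -cf full.tar data
touch -t 202101010000 last_full.stamp

# A file created after the full backup.
echo "new" > data/notes.txt

# Differential: only files modified after the marker.
find data -type f -newer last_full.stamp > changed.list
tar -cf diff.tar -T changed.list

tar -tf diff.tar
```

Restoring would mean extracting 'full.tar' first and then 'diff.tar' on top of it.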
Tar Command
‘Tar’ stands for tape archive. It will create an archive file from a list of source files or directories. Restoring is easy because the ‘tar’ command does this as well. So, you can handle all archival processes with one command. Another nice thing when archiving is that sub-directories are automatically included when the original source is a directory.
Output can be a file, a device, or ‘stdout’. This makes it very flexible, so you can use it whichever way you need to. You name the output with the ‘-f’ option. To extract an archive, you use the ‘-x’ option. Let's start with a quick example.
$ tar -cvf cpp.tar /home/cpp/
There are a couple other options I have not mentioned. The '-c' option means to make a new archive, while the '-v' option will show us progress. The '-f' option has to come last in the group because the archive name follows it directly. The archive made will be called 'cpp.tar'. It is an archive of the 'cpp' directory.
We can create an archive file and compress it at the same time. This is a popular way to do it. We will add the '-z' option to the 'tar' command.
$ tar -czvf cpp.tar.gz /home/cpp/
The '-z' option will compress the 'tar' file with 'gzip' that we mentioned earlier.
When we want to extract files from a 'tar' file, we use the '-x' option. The command will now look like this.
$ tar -xvf cpp.tar
This will extract the archived files into the current directory.
The same process works when you want to decompress an archive file. Again, you use the '-x' option.
$ tar -xzvf cpp.tar.gz
This will extract the files, because of the '-x' option and decompress them.
You can always list the contents of a 'tar' file, in case you forgot. You use the '-t' option to get this done. The command now looks like this.
$ tar -tvf cpp.tar
It works the very same way with a compressed archive.
$ tar -tzvf cpp.tar.gz
This shows us what files we originally archived and compressed.
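Here is a sketch of the whole cycle (create, list, extract) in a scratch directory, so you can try it without touching '/home'. The file contents and the 'restore' directory are made up, and the '-C' option, which tells 'tar' where to extract, is something I added for the example.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p cpp/src restore
echo 'int x = 1;' > cpp/src/variables1.cpp
echo 'readme' > cpp/README.md

tar -cvf cpp.tar cpp             # create a plain archive
tar -czvf cpp.tar.gz cpp         # create a gzip-compressed archive
tar -tvf cpp.tar                 # list the contents without extracting

# Extract the compressed archive into the restore directory.
tar -xzvf cpp.tar.gz -C restore

# The restored tree should match the original.
diff -r cpp restore/cpp && echo "restored copy matches"
```

In every command the 'f' comes last in the option group, right before the archive name.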
DD Command
The ‘dd’ command is another type of copy command. However, it can do a lot more than the ‘cp’ utility. It is quite powerful. The ‘dd’ command can do some basic conversions on files and write to raw devices.
The basic usage is ‘dd if=source of=target’. If you leave out ‘if’ it reads from ‘stdin’, and if you leave out ‘of’ it writes to ‘stdout’. Either of the source or target destinations can be a raw device. This lets you work with disks easily.
You can compress your archive with ‘gzip’ using a pipe to decrease the size. We can do it in the same command. Archiving will take a lot longer if you are compressing at the same time. So, just be aware of that.
Let's do a basic example to see how it works.
$ dd if=/dev/sdb of=/dev/sdc
This will just make a straight copy of one device to another, overwriting whatever was on the target. The output device has to be at least as large as the input device.
The next option is creating an image with the 'dd' command. This is very handy when backing things up. You can do this with devices or files.
$ dd if=/dev/sdb of=/backups/sdb.img
This makes a backup 'img' file of the 'sdb' device. You can't have too many backups.
We can also compress while we make a backup. This is done by piping to the 'gzip' utility.
$ dd if=/dev/sdb | gzip -c >/backups/sdb.img.gz
As you can see, this makes an 'img' file that is also compressed. The command piped the output into 'gzip' which compresses the 'img' file.
Now, we need to go over restoring files. You just reverse the process.
$ dd if=/backups/sdb.img of=/dev/sdb
This is a straight copy back to the original device.
We can do the same process to restore a compressed 'img' file. It looks like this.
$ gzip -dc /backups/sdb.img.gz | dd of=/dev/sdb
This command uses the 'gzip' utility to decompress the 'img' file and pipe it to the 'dd' command. The 'dd' command then restores the decompressed file back to the original device. There is actually a lot more you can do with the 'dd' command. It is definitely one of my favorite Linux commands.
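If you want to practice these 'dd' commands without risking a real disk, you can use an ordinary file as a stand-in for the device. The sketch below does that. The file names are hypothetical, and the '2>/dev/null' parts only hide the transfer statistics that 'dd' prints.

```shell
demo=$(mktemp -d)
cd "$demo"

# A 64 KiB file standing in for a disk such as /dev/sdb.
dd if=/dev/zero of=disk.raw bs=1024 count=64 2>/dev/null
# Write some data at the start without truncating the rest.
printf 'important data' | dd of=disk.raw conv=notrunc 2>/dev/null

# Straight image copy.
dd if=disk.raw of=disk.img 2>/dev/null

# Compressed image: dd writes to stdout, gzip compresses the stream.
dd if=disk.raw 2>/dev/null | gzip -c > disk.img.gz

# Restore from the compressed image: gzip decompresses, dd writes the result.
gzip -dc disk.img.gz | dd of=restored.raw 2>/dev/null

cmp disk.raw restored.raw && echo "restored image matches the original"
```

The same pattern applies to real devices, where a wrong 'of=' target will be overwritten without any warning.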
Conclusion
This has been a fun guide to work on. In it, we have discussed listing contents of directories, copying files, moving files, finding files and directories, compressing files, and archiving files. There is still a lot I left out but that information is in the 'man' pages if you are curious.
Thanks For Reading
If you read this far, I really appreciate it. I hope my guides help others and I also love spreading around Linux knowledge.