
Searching for Files in Linux
This is a guide on searching for files in Linux.
Locate
The locate program performs a rapid database search of pathnames and then
outputs every name that matches a given substring.
locate Documents/python
Locate will search its database of pathnames and output any that contain the
string Documents/python.
You can combine locate with other search tools like grep.
locate python | grep bin
Find
While the locate program can find a file based solely on its name, the find
program searches a given directory for files based on a variety of attributes.
We are going to spend a lot of time with find because it has a lot of
interesting features that we will see again and again when we start to cover
programming concepts.
In its simplest use, find is given one or more names of directories to search.
For example, to produce a listing of our home directory, we can do this:
find ~
On most active user accounts, this will produce a large list. Because the list
is sent to standard output, we can pipe the list into other programs.
find ~ | wc -l
The beauty of find is that it can be used to identify files that meet specific
criteria. It does this through the application of options, tests, and actions.
Let us say we want a list of directories from our search. To do this, we could
add the following test:
find ~ -type d | wc -l
Find Tests
Adding the test -type d limited the search to directories. Conversely, we could
have limited the search to regular files with this test:
find ~ -type f | wc -l
Here are the common file types we can use with find:
b block special device file
c character special device file
d directory
f regular file
l symbolic link
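To see the type tests side by side, here is a minimal sketch that builds a throwaway directory tree and counts what each test matches. The directory and file names are hypothetical, chosen only for illustration; the tree lives in a temporary directory so nothing in the home directory is touched.

```shell
# Build a small scratch tree (all names here are made up for the demo).
demo=$(mktemp -d)
mkdir -p "$demo/photos/2023" "$demo/notes"
touch "$demo/photos/a.jpg" "$demo/photos/2023/b.jpg" "$demo/notes/todo.txt"
ln -s "$demo/notes/todo.txt" "$demo/latest"

# Count directories; the starting directory itself is included.
dir_count=$(find "$demo" -type d | wc -l)
# Count regular files; the symbolic link is not counted here.
file_count=$(find "$demo" -type f | wc -l)
# Count symbolic links.
link_count=$(find "$demo" -type l | wc -l)

echo "directories: $dir_count"
echo "regular files: $file_count"
echo "symbolic links: $link_count"

# Remove the scratch tree.
rm -rf "$demo"
```

Because find does not follow symbolic links by default, the link is reported by -type l rather than by -type f.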
We can also search by file size and filename by adding some additional tests.
Let us look for all the regular files that match the wildcard pattern *.jpg and
are larger than one megabyte.
find ~ -type f -name "*.jpg" -size +1M | wc -l
In this example, we add the -name test followed by the wildcard pattern. Notice
how we enclose it in quotes to prevent pathname expansion by the shell. Next,
we add the -size test followed by the string +1M. The leading plus sign
indicates that we are looking for files larger than the specified number. A
leading minus sign would change the meaning of the string to be smaller than the
specified number. Using no sign means "match the value exactly". The trailing
letter M indicates that the unit of measurement is megabytes (units of
1,048,576 bytes).
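The combination of -name and -size can be tried safely on scratch files. The sketch below assumes that head -c is available for creating files of a known size; the file names are hypothetical.

```shell
# Create scratch files of known sizes (names are made up for the demo).
demo=$(mktemp -d)
head -c 2097152 /dev/zero > "$demo/large.jpg"   # 2 MiB, right extension
head -c 1024    /dev/zero > "$demo/small.jpg"   # 1 KiB, right extension
head -c 2097152 /dev/zero > "$demo/large.png"   # 2 MiB, wrong extension

# The quotes keep the shell from expanding *.jpg before find sees it.
matches=$(find "$demo" -type f -name "*.jpg" -size +1M)
echo "$matches"

# Remove the scratch files.
rm -rf "$demo"
```

Only large.jpg passes both tests: small.jpg fails the size test and large.png fails the name test.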
Find Operators
Even with all the tests that find provides, we might still need a better way to
describe the logical relationship between the tests. For example, what if we
needed to determine whether all the files and subdirectories in a directory had
secure permissions?
We would look for all the files with permissions that are not 0600 and the
directories with permissions that are not 0700. Fortunately, find provides a way
to combine tests using logical operators to create more complex logical
relationships.
find ~ \( -type f -not -perm 0600 \) -or \( -type d -not -perm 0700 \)
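To check how the grouped expression behaves, we can run it against a scratch tree with known permissions. This is only a sketch; the directory and file names are hypothetical, and the modes are set explicitly so the result does not depend on the umask.

```shell
# Scratch tree with one secure and one insecure directory (made-up names).
demo=$(mktemp -d)
chmod 0700 "$demo"
mkdir "$demo/private" "$demo/shared"
touch "$demo/private/key" "$demo/shared/readme"
chmod 0700 "$demo/private"
chmod 0755 "$demo/shared"
chmod 0600 "$demo/private/key"
chmod 0644 "$demo/shared/readme"

# Each parenthesis is escaped and stands alone as its own argument.
insecure=$(find "$demo" \( -type f -not -perm 0600 \) -or \( -type d -not -perm 0700 \) | sort)
echo "$insecure"

# Remove the scratch tree.
rm -rf "$demo"
```

The listing contains only the shared directory and the readme file inside it. Everything set to 0600 or 0700 is filtered out.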
Xargs
The xargs command performs an interesting function. It accepts input from
standard input and converts it into an argument list for a specified command.
find ~ -type f -name 'python' -print | xargs ls -l
Here we see the output of the find command piped into xargs, which constructs an
argument list for the ls command and then executes it.
While the number of arguments that can be placed into a command line is quite
large, it is not unlimited. It is possible to create commands that are too long
for the shell to accept. When a command line exceeds the max length supported by
the system, xargs executes the specified command with the max number of
arguments possible and then repeats this process until standard input is
exhausted. To see the max size of the command line, execute xargs with the
--show-limits option.
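The repeat-until-exhausted behavior is hard to observe with the real system limit, which is very large. As a sketch, the -n option of xargs caps the number of arguments per invocation, so we can force small batches and watch the command run several times. The input items here are placeholders.

```shell
# Feed five items to xargs but allow at most two arguments per echo.
batches=$(printf '%s\n' a b c d e | xargs -n 2 echo)
echo "$batches"
# Prints:
# a b
# c d
# e
```

Each output line comes from a separate run of echo, which is the same batching xargs applies automatically when the argument list would otherwise be too long.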
Find Options
Finally, we have the options, which are used to control the scope of a find
search. They may be included with other tests and actions when constructing find
expressions.
-depth Direct find to process a directory's files before the directory itself.
-maxdepth levels Set the max number of levels that find will descend into a
directory tree before applying tests and actions.
-mount Direct find not to traverse directories that are mounted on other file
systems.
-noleaf Direct find not to optimize its search based on the assumption that it
is searching a Unix-like file system.
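Two of these options are easy to try on a scratch tree. The sketch below uses hypothetical directory and file names; it shows -maxdepth limiting how far find descends and -depth changing the order in which entries are processed.

```shell
# Nested scratch tree with a file at three different levels (made-up names).
demo=$(mktemp -d)
mkdir -p "$demo/a/b/c"
touch "$demo/top.txt" "$demo/a/mid.txt" "$demo/a/b/c/deep.txt"

# -maxdepth 1: look only at the starting directory and its direct children.
shallow=$(find "$demo" -maxdepth 1 -type f)
echo "$shallow"

# -depth: a directory's contents are listed before the directory itself.
ordered=$(find "$demo/a/b" -depth)
echo "$ordered"

# Remove the scratch tree.
rm -rf "$demo"
```

With -maxdepth 1 only top.txt is found. With -depth the listing starts with deep.txt and ends with the starting directory, the reverse of the default order.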