Monday, April 1, 2019

Linux Find -- Your Tool For Searching For Dinosaur Bones


Imagine that you have a Swiss Army knife, but you only use the file on it.  Punching a hole in a leather belt, smoothing out a broken nail, opening a can of beans, or filleting a trout… chances are you can get the job done with it, but you’re certainly making your job harder than it needs to be.

That’s a pretty fair analogy to how I’ve used the ‘find’ command for the majority of my life.  From ‘shoulder surfing’ my colleagues, I feel that’s the norm.  Only recently (say, over the past couple of years) have I begun using some of the other find ‘blades’, which can really make your job easier.

Searching for files (or directories) by name and finding files of a particular type (e.g. file, directory) are certainly the most common uses of find.

$ find . -name "*.cpp"
$ find . -iname "*main*.cpp"
$ find . -type f
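If you'd like to try these against something harmless, here is a minimal sketch that builds a throwaway directory tree and runs the same kinds of searches. The file and directory names (main.cpp, TestMain.CPP, helper.h) are made up for illustration, and it assumes a find that supports -iname, as the GNU and BSD versions do.

```shell
# Build a small sandbox tree with hypothetical file names.
orig_dir=$PWD
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src/util"
touch "$sandbox/src/main.cpp" "$sandbox/src/TestMain.CPP" "$sandbox/src/util/helper.h"
cd "$sandbox" || exit 1

# -name is case-sensitive, so only src/main.cpp matches.
by_name=$(find . -name "*.cpp")

# -iname ignores case, so main.cpp and TestMain.CPP both match.
by_iname=$(find . -iname "*main*.cpp")

# -type d lists directories only: ., ./src and ./src/util.
dirs=$(find . -type d)

printf '%s\n' "-name:" "$by_name" "-iname:" "$by_iname" "-type d:" "$dirs"

# Clean up the sandbox.
cd "$orig_dir" && rm -rf "$sandbox"
```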

Often, you want to act on the file list, like grepping for a specific string in each of the files.  For yeeeeeeeeeears I did that by passing the results of the find command to a new command line.  For example, if I was interested in locating the main function declarations in any C++ files found under the current directory, I’d do it with either of:


$ grep -l "main" `find . -name "*.cpp"`
$ grep -l "main" $(find . -name "*.cpp")

*shrug*; so….what’s wrong with that?  Whelp…a couple things: 1) it’s more complex than it needs to be and 2) it’s not uncommon for the results of find to exceed the command line length limits.  So, why in the world did I do that for literally DECADES?  Simple…I didn’t know any better and frankly the ‘-exec’ option confused my little ol’ knowledge nugget.
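For the curious, the command line length limit mentioned above can be inspected on most systems with getconf. This is only a quick sketch; the number it prints varies from system to system.

```shell
# ARG_MAX is the maximum size, in bytes, of the arguments plus
# environment that can be handed to a newly executed program.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $arg_max bytes"
```

A large enough file list from find, pasted onto a single grep command line, can run past that limit.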
Understanding the ‘-exec’ subcommand will pay dividends almost immediately, once you get over the confusing syntax.  The subcommand takes the form ‘-exec somecommand {} \;’.  Each file that find matches is substituted for the braces.  The trailing ‘\;’ marks the end of the command handed to -exec….just get in the habit of slapping it on the end for now.
So, the equivalent of the previous commands would take the form:


$ find . -name "*.cpp" -exec grep -l "main" {} \;

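Suppose you have 2 files that satisfy this find: file1.cpp & file2.cpp.  With ‘\;’, find runs grep once per file, so notionally this results in the equivalent of ‘grep -l "main" file1.cpp’ followed by ‘grep -l "main" file2.cpp’.  Ending the command with ‘+’ instead of ‘\;’ batches the files, notionally giving a single ‘grep -l "main" file1.cpp file2.cpp’.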
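One way to see exactly which command lines find builds is to put echo in front of the command, so the would-be grep invocations are printed rather than run. A minimal sketch, using two empty, hypothetical files in a scratch directory:

```shell
# Scratch directory with two hypothetical source files.
orig_dir=$PWD
sandbox=$(mktemp -d)
touch "$sandbox/file1.cpp" "$sandbox/file2.cpp"
cd "$sandbox" || exit 1

# Terminated by \; : the command is started once per matching file,
# so echo prints two lines.
per_file=$(find . -name "*.cpp" -exec echo grep -l main {} \;)

# Terminated by + : matches are batched into as few invocations as
# possible, so echo prints a single line naming both files.
batched=$(find . -name "*.cpp" -exec echo grep -l main {} +)

printf '%s\n' "with \\; :" "$per_file" "with + :" "$batched"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$sandbox"
```

Either terminator sidesteps the command line length problem, because find itself decides how many file names go on each command line.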
Well, that doesn’t seem much simpler….why bother?  Let’s say you’re interested in searching for header and implementation files (e.g. *.h & *.cpp).  You could certainly accomplish this without the exec subcommand:


$ grep -l "main" `find . -name "*.cpp"` `find . -name "*.h"`
$ grep -l "main" $(find . -name "*.cpp") $(find . -name "*.h")

The equivalent using the exec subcommand would be:


$ find . \( -name "*.cpp" -o -name "*.h" \) -exec grep -l "main" {} \;

The first half specifies the search criteria: all files matching "*.h" OR "*.cpp".
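The escaped parentheses around the two -name tests are doing real work. Here is a small sketch of what happens with and without them, using two hypothetical files (app.cpp and app.h) that both mention main:

```shell
# Scratch directory with one implementation file and one header.
orig_dir=$PWD
sandbox=$(mktemp -d)
printf 'int main() { return 0; }\n' > "$sandbox/app.cpp"
printf 'int main();\n' > "$sandbox/app.h"
cd "$sandbox" || exit 1

# With the parentheses, -exec applies to files matching either pattern.
grouped=$(find . \( -name "*.cpp" -o -name "*.h" \) -exec grep -l "main" {} \;)

# Without them, the implicit AND binds tighter than -o, so -exec is
# attached only to the "*.h" test and app.cpp is never grepped.
ungrouped=$(find . -name "*.cpp" -o -name "*.h" -exec grep -l "main" {} \;)

printf '%s\n' "grouped:" "$grouped" "ungrouped:" "$ungrouped"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$sandbox"
```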

Last bit: consider how you’d find all files that reference ‘main’ but aren’t header or implementation files.  With the exec subcommand it’s an extremely simple change; consider how complex it would be without exec.



$ find . -not \( -name "*.cpp" -o -name "*.h" \) -exec grep -l "main" {} \;
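One wrinkle with the negated search is that directories (including ‘.’ itself) aren’t headers or implementation files either, so they get handed to grep too. A hedged variation, my own addition rather than part of the command above, adds ‘-type f’ so only regular files are searched. The file names below are made up for illustration.

```shell
# Scratch directory with a source file and two hypothetical text files.
orig_dir=$PWD
sandbox=$(mktemp -d)
mkdir "$sandbox/docs"
printf 'int main() { return 0; }\n' > "$sandbox/app.cpp"
printf 'see main for details\n' > "$sandbox/docs/notes.txt"
printf 'nothing relevant here\n' > "$sandbox/docs/todo.txt"
cd "$sandbox" || exit 1

# -type f keeps directories such as . and ./docs away from grep.
# -not is accepted by GNU and BSD find; ! is the POSIX spelling.
others=$(find . -type f -not \( -name "*.cpp" -o -name "*.h" \) -exec grep -l "main" {} \;)

printf '%s\n' "$others"

# Clean up the scratch directory.
cd "$orig_dir" && rm -rf "$sandbox"
```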

If you haven’t already done so, start using the special blades of your find command.
