Friday, March 7, 2014

Batch file processing from the command line (photo-editing, renaming, etc)

One of the convenient things about the command line is being able to easily perform batch operations on a large number of files: things that are applied systematically to each file in the same way, such as replacing all of the spaces in file names with underscores or hyphens, resizing images, or converting from .png to .jpg. Here I'm going to present some of the basic tools for performing these kinds of operations, and you can then put them together in various ways to perform any number of customized tasks. Note that I'm using Bash for the interactive shell (or interpreter) here, and the parameter expansions presented here, although useful, are not all portable to other shells.

Looping

The most common idiom you will see for iterating over a set of files is using the glob to match a set of files:

for f in /path/*

The * matches any file in /path (except hidden files, whose names begin with a dot), so it becomes the list of files in that directory. The variable f becomes each file (or rather, the full path to each file) in turn, and we can use that in the code that follows. The body of the loop begins with do and ends with done; everything between them is run once for each file. The glob can also match part of a file name, so we could operate on only the .png files in /path like this:

for img in /path/*.png
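A for line on its own is not yet a complete command. As a minimal sketch of the whole thing, here is a loop with its do and done in place. The directory is a throwaway made with mktemp and the file names are invented, purely so the example can be run anywhere:

```bash
# Sketch: a complete for loop over a glob. The directory and file names
# are hypothetical, created here only so the example is self-contained.
dir=$(mktemp -d)
touch "$dir/a.png" "$dir/b.png" "$dir/notes.txt"

count=0
for img in "$dir"/*.png
do
    # $img holds the full path to each matching file in turn
    echo "found: $img"
    count=$((count + 1))
done
echo "$count png files"

rm -r "$dir"
```

Only the two .png files are visited; notes.txt does not match the pattern and is skipped.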

In addition to the glob, you can also use other things like brace expansion. For instance, /path/*.{png,jpg,gif} will give you all of the files ending in .png, .jpg, or .gif. If you have a set of pictures with names like DSC_nnnn.JPG, but you only want to work on images 5 through 22, /path/DSC_{0005..0022}.JPG would let you do this. You can also use command substitution to loop over the output of a command by wrapping the command with $() or ``.[1]
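Here is a quick sketch of the other two ways of building the list. Unlike a glob, brace expansion does not look at the disk at all: it generates every name in the range whether or not such a file exists. The DSC_ names are the illustrative ones from above, and seq is used only as a stand-in for any command that prints a list of words:

```bash
# Brace expansion generates names without checking that the files exist.
# (Bash 4 or later keeps the zero padding in the range.)
names=$(echo DSC_{0005..0008}.JPG)
echo "$names"
# prints: DSC_0005.JPG DSC_0006.JPG DSC_0007.JPG DSC_0008.JPG

# Command substitution: loop over the words printed by a command.
total=0
for n in $(seq 1 3)
do
    total=$((total + n))
done
echo "$total"
# prints: 6
```

Keep in mind that the output of a command substitution is split on whitespace, so it is best kept for output that you know contains no spaces.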

Parameter Expansion

Bash parameter expansion provides a really handy way of manipulating file names. Here we will see how to separately get the filename, path, and extension, and substitute one character for another.

Get the base name of a file:

base=${file##*/}

Get the path to a file:

path=${file%/*}

Strip the last extension from a filename:

new=${file%.*}

Strip all extensions from a filename:

new=${file%%.*}

These work by stripping a prefix (# and ##) or suffix (% and %%) from $file. In the first example, the * comes before the / because we are stripping a forward slash and everything that comes before it, while in the latter two examples the * comes after the . because we are stripping a dot and everything that comes after it. The double forms strip the longest match, while the single forms strip the shortest match. But with the latter two examples, beware files and directories with unexpected dots! For instance:

file="/home/user/my.pics/selfie.jpg"
echo ${file%%.*}


will print "/home/user/my",

file="/home/user/web2.0essay"
echo ${file%.*}


will print "/home/user/web2". Errors like these are easier to avoid when you are typing a quick command and know what kind of files you are dealing with. If you are writing a script that might later be used in different context, you must be extra careful that it doesn't break when file and directory names don't conform to your initial expectations.
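To see the four forms side by side, and one way of sidestepping the stray-dot problem, here is a small sketch. The paths are only illustrative strings; none of the files need to exist:

```bash
# Illustrative paths only; none of these files need to exist.
file="/path/to/archive.tar.gz"
echo "${file##*/}"    # prints archive.tar.gz
echo "${file%/*}"     # prints /path/to
echo "${file%.*}"     # prints /path/to/archive.tar
echo "${file%%.*}"    # prints /path/to/archive

# Splitting off the directory first keeps dots in directory names out of play.
file="/home/user/my.pics/selfie.jpg"
dir=${file%/*}
base=${file##*/}
stem=${base%%.*}
echo "$dir/$stem"     # prints /home/user/my.pics/selfie

# Naming the extension explicitly leaves names without it untouched.
other="/home/user/web2.0essay"
kept=${other%.jpg}
echo "$kept"          # prints /home/user/web2.0essay
```

Neither trick is a cure-all. Working on the base name only protects against dots in directory names, and naming the extension only helps when you know which extension to expect.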

Replacing spaces in a filename with hyphens:

new=${file// /-}

This form actually does pattern matching. You can also require it to match at the beginning or end by replacing the second forward slash with # or %, respectively. If you leave out the last part, whatever matches the pattern will be replaced by nothing, that is, it will be deleted. In this case, you can also omit the final forward slash. So you could remove only a three character file extension with:

new=${file/%.???}

Or you could remove a two digit prefix with:

new=${file/#[0-9][0-9]}
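The variations are easier to tell apart when they are all applied to the same name. This is just a sketch with a made-up file name:

```bash
# Hypothetical file name, used only to compare the substitution forms.
file="01 beach day.jpg"

all=${file// /-}              # double slash: replace every space
first=${file/ /-}             # single slash: replace only the first space
noprefix=${file/#[0-9][0-9]}  # '#' anchors the match at the beginning
noext=${file/%.???}           # '%' anchors the match at the end

echo "$all"        # prints 01-beach-day.jpg
echo "$first"      # prints 01-beach day.jpg
echo "$noprefix"   # prints " beach day.jpg" without the quotes (note the leading space)
echo "$noext"      # prints 01 beach day
```

In these patterns a ? matches any single character and the dot is just a literal dot, unlike in a regular expression.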

Resizing Images

For command line image processing, we will be using convert from the imagemagick package. The basic command for resizing an image is:

convert picture.jpg -resize 1232x816 smallerpicture.jpg

Note that this will not necessarily make the image exactly 1232x816. It will make the image fit inside a 1232x816 box while keeping its proportions; it will not squeeze, stretch, or crop the image to fit that exact size. If you want to distort the image to fit:

convert picture.jpg -resize 1232x816\! smallerpicture.jpg

Distortion, however, is often undesirable and it may be better to crop the image. Suppose you wanted to create a series of 64x64 thumbnails without distorting the images. You could crop them with:

convert picture.jpg -resize 64x64^ -gravity center -extent 64x64 thumbnail.jpg

The ^ means to make the image fill, rather than fit into, the 64x64 box. The -extent option then crops the image to 64x64, and -gravity center centers the image when cropping, so the thumbnail comes from the center of the original image.

Now suppose that you wanted to shrink a series of larger images, but you don't want to enlarge any that are smaller. You can do that like so:

convert picture.jpg -resize 64x64\> resized.jpg
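If it helps to see what fitting inside a box means in numbers, the arithmetic can be sketched in Bash itself. The fit_box function below is a hypothetical helper written for this illustration, not part of ImageMagick, and it uses integer division, so ImageMagick's own rounding may differ by a pixel:

```bash
# Rough arithmetic only: the size a W x H image ends up with when it is
# scaled to fit inside a BW x BH box without changing its proportions.
# fit_box is a made-up helper for illustration, not an ImageMagick command.
fit_box() {
    local w=$1 h=$2 bw=$3 bh=$4
    if (( w * bh > h * bw ))
    then
        # the image is relatively wider than the box, so width is the limit
        echo "${bw}x$(( h * bw / w ))"
    else
        # otherwise height is the limit
        echo "$(( w * bh / h ))x${bh}"
    fi
}

fit_box 3000 2000 1232 816   # prints 1224x816
fit_box 2000 3000 1232 816   # prints 544x816
```

In both cases one side reaches the edge of the box and the other falls short of it, which is why the result is not necessarily exactly 1232x816.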

You will notice that all of the above examples require a file name to be supplied for the output, so if we actually want to use them on a batch of files, we will need to combine them with a loop, like this:

mkdir /path/images/thumbnails
for img in /path/images/*.{jpg,gif,png}
do basename=${img##*/}
name=${basename%.*}
convert "$img" -resize 64x64^ -gravity center -extent 64x64 "/path/images/thumbnails/${name}-thumb.jpg"
done


For each image.jpg this will produce an image-thumb.jpg in /path/images/thumbnails. Note that this will create jpg thumbnails for gif and png images, too. You may also note that I didn't bother to write out a script for this; for a simple task, these commands can easily enough be typed in on the command line, as explained in Protip #2. Of course, you could write a script if you will be doing the exact same operation frequently.
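One thing to watch for in a loop like this: when a pattern matches nothing, Bash leaves it in place as a literal string, so if the directory holds no gif files, convert is handed a file literally named *.gif and complains. The sketch below turns on Bash's nullglob option to avoid that, and only prints the commands it would run rather than calling convert. The temporary directory and the names cat.jpg and dog.png are made up for illustration:

```bash
# Sketch: dry run of the thumbnail loop with nullglob turned on.
# The directory and file names are hypothetical.
shopt -s nullglob    # patterns that match nothing expand to nothing
dir=$(mktemp -d)
touch "$dir/cat.jpg" "$dir/dog.png"    # deliberately no .gif files

planned=()
for img in "$dir"/*.{jpg,gif,png}
do
    base=${img##*/}
    name=${base%.*}
    planned+=("${name}-thumb.jpg")
    # echo shows the command instead of running it
    echo "would run: convert $img -resize 64x64^ -gravity center -extent 64x64 $dir/thumbnails/${name}-thumb.jpg"
done
echo "${#planned[@]} thumbnails planned"

shopt -u nullglob    # restore the default behaviour
rm -r "$dir"
```

A dry run like this also makes name clashes visible before any harm is done: photo.jpg and photo.png would both map to photo-thumb.jpg, and the second would overwrite the first.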

Convert Image Types

You will notice that our thumbnail example actually converted any gif or png images to jpg when generating thumbnails. Converting one image type to another is just as simple, for example if you wanted to use a series of jpg images to make an animated gif or png (similar to this). You just leave out the resize part:

for img in /path/*.jpg
do convert "$img" "${img%.*}.png"
done

Renaming Files

We saw the basics for this when we introduced parameter expansion, but I'm going to go ahead and show a full example. Suppose that we have some files with spaces in their names and find this annoying when manipulating them from the command line, so we decide to replace the spaces with underscores. But some of the files have hyphens and the resulting "_-_" just doesn't look right, so we also want to collapse the spaces surrounding a hyphen. Easy enough, we'll just use two steps:

for f in /path/*
do f1=${f// - /-}
new=${f1// /_}
mv "$f" "$new"
done

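But what if the path contains spaces? That would be a problem here, because the substitutions would rewrite the directory names as well and mv would be pointed at a directory that doesn't exist. It's not too hard to get around: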

for f in /path/*
do basename=${f##*/}
f1=${basename// - /-}
new=${f1// /_}
mv "$f" "${f%/*}/$new"
done

We simply removed the path first, and added it back at the end.
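Before running a rename over files you care about, it can be worth trying it on a scratch copy. The following sketch does that in a temporary directory whose name contains a space on purpose, and adds two precautions that are my own additions rather than part of the loop above: it skips files whose names would not change, and it refuses to overwrite an existing file. The file names are invented:

```bash
# Sketch: the rename loop tried out in a scratch directory.
# "my pics" and the file names are hypothetical.
top=$(mktemp -d)
dir="$top/my pics"                 # a path with a space, on purpose
mkdir "$dir"
touch "$dir/beach - day 1.jpg" "$dir/plain.jpg"

for f in "$dir"/*
do
    base=${f##*/}
    f1=${base// - /-}
    new=${f1// /_}
    [ "$base" = "$new" ] && continue        # nothing to change
    if [ -e "${f%/*}/$new" ]
    then
        echo "skipping $f: $new already exists" >&2
    else
        mv -- "$f" "${f%/*}/$new"
    fi
done

result=$(ls "$dir")
echo "$result"
# prints:
# beach-day_1.jpg
# plain.jpg

rm -r "$top"
```

The directory "my pics" keeps its space, because only the base name goes through the substitutions.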

And More!

You can do many more interesting things on the command line with Bash, and ImageMagick offers a huge selection of image editing features not mentioned here. If you want to learn more, you may want to check out the Bash Reference Manual and Examples of ImageMagick Usage. I also recommend GreyCat's Wiki, which has several resources on Bash, including an introductory guide, pitfalls, FAQ, and quick reference sheet.
