xargs
How do you take a list of files and do something with them in the UNIX shell? xargs is the key.
If you’ve run into xargs, it’s probably in its simplest form:
xargs rm < list.txt
Here list.txt is a list of whitespace-separated file names. The < list.txt part is shell speak for “send the file list.txt to the STDIN of the command”; it has the same effect as cat list.txt | xargs rm without the overhead of an extra command. The list looks something like this:
one.txt
two.txt
three.txt
...
nine-thousand-four-two.txt
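What the above command line does is break the list up into chunks of 5000 arguments (the default may vary from system to system: 5000 is the BSD default, while GNU xargs instead fills each command line up to a size limit) and pass each chunk to the command rm. The effect is the same as running: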
rm one.txt two.txt ... four-thousand-nine-hundred-ninety-nine.txt five-thousand.txt
rm five-thousand-one.txt ... nine-thousand-four-one.txt nine-thousand-four-two.txt
xargs will run the command as many times as it needs to consume the input in chunks of 5000; the last chunk will be however many are left over. You can change the number of arguments per chunk with -n, e.g. -n 500.
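To watch the chunking happen without deleting anything, here is a minimal sketch that swaps rm for echo and uses a tiny chunk size. The seven names and the temporary list file are made up for illustration.

```shell
# Build a throwaway list of seven hypothetical names, one per line.
list=$(mktemp)
printf '%s\n' one two three four five six seven > "$list"

# -n 3 caps each command line at three arguments; echo stands in for rm.
chunks=$(xargs -n 3 echo < "$list")
printf '%s\n' "$chunks"
# one two three
# four five six
# seven

rm -f "$list"
```

Each output line corresponds to one invocation of the command, and the last one gets whatever is left over.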
That’s handy, but not super useful. Much handier is using xargs with a pipe:
find . -user alice -print0 | xargs -0 -n 500 chgrp staff
The find command finds all of the files owned by “alice” (more on find some other time) and passes them to xargs, which changes the group ID of the files, in chunks of 500, to group “staff”. What’s with the -print0 and -0 arguments?
As I noted above, xargs expects arguments to be separated by white space, meaning that:
one.txt
two.txt
three.txt
and:
one.txt two.txt three.txt
both appear as three separate files to xargs. Of course, UNIX-like systems allow spaces in file names: given a file called My Secret Plans, xargs would see three files, My, Secret, and Plans.
The -0 argument tells xargs to instead use a NUL (“\0”) as the separator. find normally prints each file name followed by a newline; -print0 causes it to instead follow the file name with a NUL. The combination of these two arguments allows xargs to handle file names with spaces, tabs, and even newlines in them.
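A quick way to convince yourself is to count how many arguments xargs sees in each mode. This is only a sketch: it assumes a scratch directory from mktemp and a single file with the name used above, and it uses echo so nothing is modified.

```shell
# Work in a scratch directory so nothing real is touched.
dir=$(mktemp -d)
touch "$dir/My Secret Plans"

# Default splitting: the one name is broken on spaces into three arguments.
split_count=$(cd "$dir" && find . -type f | xargs -n 1 echo | wc -l | tr -d ' ')

# NUL-separated: the name arrives as a single argument.
nul_count=$(cd "$dir" && find . -type f -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "whitespace-separated: $split_count, NUL-separated: $nul_count"
# whitespace-separated: 3, NUL-separated: 1

rm -rf "$dir"
```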
That’s as far as most people get with xargs, but it has a few other tricks up its sleeve.
Before we get to them, it’s important to note that there’s a bit of a schism in xargs versions. The two main variants are the GNU version, which ships with most, if not all, Linux distros, and the BSD version, which ships with the BSDs and with OS X.
First, verbosity: the -t option will print each command line before executing it. If you’re not sure you trust the xargs command, you can take it one step further with -p. In addition to printing the command, it will prompt you before each execution:
ls | xargs -p rm
rm 1.txt 2.txt 3.txt 4.txt?...
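The -t trace goes to standard error, not standard output, which matters if you want to log it. A small sketch, with hypothetical file names and echo standing in for a real command, that captures only the trace:

```shell
# -t writes each constructed command line to standard error before running it.
# "2>&1 >/dev/null" keeps the trace and discards the command's own output.
trace=$(printf '%s\n' 1.txt 2.txt 3.txt | xargs -t echo 2>&1 >/dev/null)
printf '%s\n' "$trace"
```

The captured line should be the echo command followed by the three names; exact spacing and quoting can differ a little between xargs versions.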
Next, there are positional arguments. Most examples of xargs pass the input as the final arguments to the command. However, by using the -I option you can specify a string that will be replaced with the argument wherever it appears in the command. For example, to copy a list of directories and files to “destdir”:
xargs -I % cp -rp % destdir < list-of-things-to-copy.txt
# cp -rp file-one.txt destdir
# cp -rp some-directory destdir
This form will run one command per argument, as if you had specified -n 1. The BSD version also has the -J option, which works like -I but replaces the string with the whole chunk of arguments. The GNU version lacks this option.
xargs -J % cp -rp % destdir < list-of-things-to-copy.txt
# cp -rp file-one.txt some-directory destdir
Instead, the GNU version has a (deprecated) -i option, which is the equivalent of -I {}:
xargs -i cp -rp {} destdir < list-of-things-to-copy.txt
# cp -rp file-one.txt destdir
# cp -rp some-directory destdir
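If you want to check where the replacement string lands before copying anything, putting echo in front of the command gives you a dry run. A minimal sketch, assuming the same two hypothetical entries fed in one per line:

```shell
# echo prints the command each invocation would run instead of running it.
preview=$(printf '%s\n' file-one.txt some-directory | xargs -I % echo cp -rp % destdir)
printf '%s\n' "$preview"
# cp -rp file-one.txt destdir
# cp -rp some-directory destdir
```

Once the preview looks right, drop the echo to run the real thing.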
Finally, there’s parallel mode. xargs’s default behavior is to run one process with each chunk of data, waiting for the command to complete before continuing. -P 5 will cause it to run 5 processes at a time. Some versions of xargs allow -P 0, which means “run as many processes as possible”.
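A rough way to see the difference is to time a few sleeps. This sketch assumes an xargs that supports -P (both the GNU and BSD versions do) and uses whole-second timing from date, so treat the number as approximate.

```shell
# Four one-second sleeps; with -P 4 they run side by side rather than in turn.
start=$(date +%s)
printf '%s\n' 1 1 1 1 | xargs -n 1 -P 4 sleep
elapsed=$(( $(date +%s) - start ))
echo "elapsed: ${elapsed}s"
```

Without -P the same pipeline takes about four seconds; with -P 4 it should finish in roughly one.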
It’s important to note that while most examples pass it a list of files, xargs doesn’t care: arguments are arguments. For example, to create an account for each user in a list and put it in the developers group:
sudo xargs -n 1 useradd -G developers < list-of-users.txt
So, next time you need to process a big list of arguments, reach for xargs. You can find a little more on the history and usage of xargs on Wikipedia.