xargs executes a command once, or potentially multiple times, using its standard input as arguments to that command.
What is sometimes unclear is whether the command invoked by xargs is executed once or multiple times.
This blog post should clear things up …
Firstly, here is the xargs program usage in layman’s terms:
xargs [xargs options] [command] [arguments for command based on xargs standard input]
The standard input is typically output piped from an earlier command such as find, ls, or echo. It doesn’t have to be, though: you can invoke xargs as a standalone no-arg command, type the input, and terminate it with ctrl-d to signal end of input.
Let’s try a simple example using the xargs --verbose option, which writes the command line to standard error before xargs actually invokes it. In the screenshot below, I invoke the xargs command and then enter 1 to 5, separating each with a newline (enter), followed by a terminating ctrl-d. Because I provided the --verbose option, xargs wrote out the command that it will execute and the arguments that it is going to provide to that command.
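The same session can be reproduced without typing at the terminal. This is a minimal sketch that assumes printf stands in for the keyboard input, and it uses -t, the short form of --verbose. The exact command line echoed on standard error (for example "echo ..." versus "/bin/echo ...") can differ between xargs versions.

```shell
# Feed 1..5, one per line, as if typed at the terminal and ended with ctrl-d.
# -t (short for --verbose) writes the command line to stderr before running it.
printf '1\n2\n3\n4\n5\n' | xargs -t
# stdout: 1 2 3 4 5
```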
The output above shows that by default, if you don’t provide an explicit command for xargs to invoke, it will use /bin/echo. You can also see that xargs invoked this echo command just a single time. What may not be obvious is how xargs processed the standard input to come up with the arguments to supply to the command (in this case 'echo'). By default, xargs treats blanks (spaces and tabs) and newlines as delimiters. In the example below, I entered: 1 TAB 2 NEWLINE 3 SPACE 4 NEWLINE 5 NEWLINE ctrl-d.
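The mixed-delimiter input can be sketched as follows. The small sh -c helper is not part of the original example; it is only an illustrative way to count how many arguments xargs passed to a single invocation.

```shell
# 1 TAB 2 NEWLINE 3 SPACE 4 NEWLINE 5 NEWLINE, then end of input.
printf '1\t2\n3 4\n5\n' | xargs -t
# stdout: 1 2 3 4 5

# Illustrative helper: report the argument count seen by the invoked command.
printf '1\t2\n3 4\n5\n' | xargs sh -c 'echo "$# arguments: $*"' sh
# stdout: 5 arguments: 1 2 3 4 5
```

Tabs, spaces, and newlines all split the input, so the command receives five separate arguments in one invocation.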
So how and under what conditions does the command that xargs executes get invoked multiple times?
You can either explicitly tell xargs that a command can only operate on a specific number of arguments at a time, so that the command is reinvoked as required to work through the remaining arguments; or xargs may determine by itself that the command plus its arguments would exceed a maximum command-line length, in which case it automatically splits the arguments across multiple command invocations.
Let’s first explicitly tell xargs that a command invocation should only work on a maximum number of arguments at a time. You do this through the -n option …
The above example first shows supplying the “-n1” option, which results in the command being invoked once for each argument. Later the “-n2” option is used, which results in the command being invoked once for every two arguments. The final xargs test above shows how 1 TAB 2 NEWLINE 3 SPACE 4 NEWLINE 5 NEWLINE ctrl-d is processed with the “-n1” option. You can see that xargs starts invoking commands immediately after each line of input is processed.
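A non-interactive sketch of the -n behaviour, again assuming printf in place of typed input:

```shell
# -n1: one echo invocation per argument, so five output lines.
printf '1\n2\n3\n4\n5\n' | xargs -n1 -t

# -n2: at most two arguments per invocation.
printf '1\n2\n3\n4\n5\n' | xargs -n2 -t
# stdout:
# 1 2
# 3 4
# 5

# Mixed tab/space/newline input is still split into five arguments.
printf '1\t2\n3 4\n5\n' | xargs -n1 -t
```

The last invocation under -n2 receives only the single leftover argument.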
As mentioned earlier, xargs may itself automatically split arguments across multiple commands if a maximum command-line character length is reached. The default limit is implementation-dependent and is derived from the operating system's ARG_MAX length; GNU xargs, for example, caps its default at 128 KiB when the system limit is larger.
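To see the limits on a given machine, something like the following can be used. The --show-limits option is specific to GNU xargs, so the sketch falls back to a message on other implementations.

```shell
# System limit on the combined size of exec arguments and environment, in bytes.
getconf ARG_MAX

# GNU xargs only: report the limits xargs will actually apply (printed to stderr).
# -r stops xargs from running echo when the input is empty.
xargs --show-limits -r < /dev/null 2>&1 || echo "--show-limits is not supported by this xargs"
```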
You can explicitly tell xargs the max command-line length using the --max-chars or -s options. The length comprises the command, any initial arguments, and the terminating null at the end of each argument string, so every string costs its own length plus one. As seen below, “/bin/echo 1” will take 12 chars, and “/bin/echo 1 2” will take 14 chars (including the terminating nulls).
/bin/echo 1 2
12345678901234
Let’s try out this “-s” option…
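Here is a hedged sketch of the -s option. It names /bin/echo explicitly so that the command length is known (newer xargs versions may otherwise run plain "echo"), and it assumes /bin/echo exists on the system. By the arithmetic above, a 14-char budget should allow two single-digit arguments per invocation, but the exact grouping may vary with how an implementation accounts for the limit, so no output is shown.

```shell
# "/bin/echo" costs 9 chars + 1 null; each single-digit argument costs 1 + 1.
# All five arguments would need 20 chars, which exceeds the 14-char budget,
# so xargs has to split them across several /bin/echo invocations.
printf '1\n2\n3\n4\n5\n' | xargs -s 14 -t /bin/echo
```

Each line written to standard error by -t corresponds to one invocation.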
Here is a final example that ties everything together. It demonstrates the following:
- xargs receiving piped standard input from a prior command (e.g. tr or find).
- xargs using the --verbose option to output the command that will be invoked
- xargs being told to leverage an explicit delimiter “-d” option, e.g “\n” – newline
- xargs invoking an explicit command (the Unix “file” command)
- xargs invoking an explicit command once per argument (“-n1”)
- find command leveraging the “-print0” option to delimit search results by the null character (rather than newline), in conjunction with the “-0” (or --null) xargs option so that any search results which contain whitespace are treated correctly as a single command-argument.
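The points above can be sketched as follows. The scratch directory, file names, and the fallback to ls when the file command is not installed are all assumptions made for illustration, and -d is a GNU xargs option.

```shell
# Scratch directory with two files, one of which has a space in its name.
demo_dir=$(mktemp -d)
printf 'hello\n' > "$demo_dir/plain.txt"
printf 'hello\n' > "$demo_dir/with space.txt"

# Use 'file' as the explicit command; fall back to 'ls' if it is not installed.
tool=$(command -v file || command -v ls)

# -print0 / -0: names are NUL-delimited, so "with space.txt" stays one argument.
# -n1: one invocation per file. -t: show each command line on stderr.
find "$demo_dir" -type f -print0 | xargs -0 -n1 -t "$tool"

# Newline-delimited input from tr, with an explicit -d '\n' delimiter,
# so "a b" and "c d" are each passed as a single argument.
printf 'a b,c d' | tr ',' '\n' | xargs -d '\n' -n1 -t echo

rm -rf "$demo_dir"
```

Without -0 or -d, the default whitespace splitting would break "with space.txt" and "a b" into two arguments each.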