How to treat lines of a file as commands and execute them in parallel

The problem

I had to copy a lot of files of varying sizes (a few KB up to hundreds of GB). Copying in parallel is faster, so I needed to chunk the large files and copy the chunks in parallel. I solved this by reading the size of each file and dynamically generating the commands for all the required chunks (sketched further down), then executing those commands in parallel with a capped max concurrency using xargs. Each line in the file actually contains a few chained commands, which makes it slightly trickier. An example file:

# commands.txt
echo "starting file A chunk 1" && some-cmd A 1 && echo "successfully copied file A chunk 1" || exit 255
echo "starting file A chunk 2" && some-cmd A 2 && echo "successfully copied file A chunk 2" || exit 255
echo "starting file B chunk 1" && some-cmd B 1 && echo "successfully copied file B chunk 1" || exit 255
...

Note: I’ve got the || exit 255 on each line because an exit status of 255 makes xargs stop immediately (fail fast) instead of finishing all the remaining runs and only reporting the failure at the end.
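
As a rough sketch of the generation step, here’s roughly what emitting one line per chunk looks like. The script name, the fixed chunk size, and some-cmd are placeholders for illustration, not my actual copy tooling:

#!/usr/bin/env bash
# generate-chunk-commands.sh — illustrative sketch only; chunk size and copy command are placeholders
chunk_size=$((1024 * 1024 * 1024))   # e.g. 1 GiB per chunk
for f in "$@"; do
  size=$(stat -c %s "$f")            # GNU stat; on BSD/macOS this would be: stat -f %z "$f"
  chunks=$(( (size + chunk_size - 1) / chunk_size ))   # ceiling division: number of chunks needed
  for (( i = 1; i <= chunks; i++ )); do
    echo "echo \"starting $f chunk $i\" && some-cmd \"$f\" $i && echo \"successfully copied $f chunk $i\" || exit 255"
  done
done > commands.txt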

The easy solution

Using GNU parallel is the easiest solution; it “just works”. You can pipe the commands in, set your concurrency, and you’re done:

parallel -j8 < commands.txt
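
One caveat: the || exit 255 trick only helps xargs; by default parallel runs every job and reports any failures at the end. If you want the same fail-fast behaviour, recent versions of GNU parallel support a --halt option, something like:

parallel -j8 --halt now,fail=1 < commands.txt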

Using xargs

I didn’t have parallel installed, and although I could have installed it, I was trying to avoid it. I did have xargs, and that can handle it too; you just need to set a few parameters:

xargs -n1 --delimiter='\n' -P8 bash -c < commands.txt
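
To unpack that: --delimiter='\n' makes xargs split the input on newlines only (and disables its usual quote processing, so the quotes inside each line survive), -n1 passes one line per invocation, -P8 keeps up to 8 invocations running at once, and bash -c executes the line it receives as a shell command. Combined with the || exit 255 above, a single failed chunk makes xargs stop launching new commands instead of churning through the rest of the file.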
