sort --debug is handy
Coming soon to a coreutils near you: the sort --debug option. It’s in coreutils 8.6 and later, which will hopefully be picked up by major distros soon. The option draws “underlines” beneath the portion of each line that is used as the sort key, which is very helpful when you’re trying to figure out why sort is acting the way it is.
The snippets below show the original file and two different sorts of that file.
    woodrow@woodrow-x200:~/tmp$ cat sort_test
    1.2.3.4
    2.3.4.5
    12.3.2.4
    12.2.3.4
    12.2.3.5
    11.2.3.4
    1.12.3.4
    20.2.3.4
    21.2.3.4
    12.2.3.6
    12.2.3.5
The first sort, below, uses a numeric sort (-n), which reads the leading part of each line that looks like a decimal number (for example, 1.2 from 1.2.3.4) and uses that as the sort key. You can see this in the first underline beneath each entry. Lines whose keys compare equal are then ordered by a sort of last resort on the entire line, which you can see as the second underline beneath each entry.
    woodrow@woodrow-x200:~/tmp$ /home/woodrow/bin/sort -n --debug sort_test
    /home/woodrow/bin/sort: using simple byte comparison
    1.12.3.4
    ____
    ________
    1.2.3.4
    ___
    _______
    2.3.4.5
    ___
    _______
    11.2.3.4
    ____
    ________
    12.2.3.4
    ____
    ________
    12.2.3.5
    ____
    ________
    12.2.3.5
    ____
    ________
    12.2.3.6
    ____
    ________
    12.3.2.4
    ____
    ________
    20.2.3.4
    ____
    ________
    21.2.3.4
    ____
    ________
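To see why the first underline has the width it does, it can help to pull out the numeric prefix by hand. The following is only a rough sketch: the sed expression is my own approximation of what -n reads (leading digits plus at most one decimal fraction), not sort's actual parser, which also handles things like leading blanks and a minus sign. The temporary file and the variable names are made up for the example.

```shell
# Recreate the sample input from the post in a temporary file.
input=$(mktemp)
printf '%s\n' 1.2.3.4 2.3.4.5 12.3.2.4 12.2.3.4 12.2.3.5 11.2.3.4 \
    1.12.3.4 20.2.3.4 21.2.3.4 12.2.3.6 12.2.3.5 > "$input"

# Approximate the -n key: leading digits, optionally followed by one
# ".digits" fraction. Everything after that is dropped.
keys=$(sed -E 's/^([0-9]*(\.[0-9]+)?).*/\1/' "$input")

# One key per input line, in input order.
printf '%s\n' "$keys"

rm -f "$input"
```

Each printed key should be as long as the first underline under the matching line in the --debug output, e.g. three characters for 1.2 and four for 1.12.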
The second sort, below, also uses a numeric sort (-n), with the key shown by the underline. However, this sort is a stable sort (-s), meaning that it doesn’t perform a sort of last resort on the entire line. This preserves the input ordering of lines that have the same key, which is why this sort is “stable” in computer science parlance. You can see it in the lines with key 12.2: the two 12.2.3.5 entries stay separated by 12.2.3.6, just as they are in the input.
    woodrow@woodrow-x200:~/tmp$ /home/woodrow/bin/sort -n -s --debug sort_test
    /home/woodrow/bin/sort: using simple byte comparison
    1.12.3.4
    ____
    1.2.3.4
    ___
    2.3.4.5
    ___
    11.2.3.4
    ____
    12.2.3.4
    ____
    12.2.3.5
    ____
    12.2.3.6
    ____
    12.2.3.5
    ____
    12.3.2.4
    ____
    20.2.3.4
    ____
    21.2.3.4
    ____
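If your sort doesn’t have --debug yet, you can still see the effect of the last-resort comparison by running both sorts and looking only at the lines that share the key 12.2. This is a minimal sketch that assumes a sort which accepts -s (GNU sort does); the temporary file and variable names are made up for the example, and LC_ALL=C is set to match the “simple byte comparison” shown above.

```shell
# Recreate the sample input from the post in a temporary file.
input=$(mktemp)
printf '%s\n' 1.2.3.4 2.3.4.5 12.3.2.4 12.2.3.4 12.2.3.5 11.2.3.4 \
    1.12.3.4 20.2.3.4 21.2.3.4 12.2.3.6 12.2.3.5 > "$input"

# Numeric sort with the default last-resort comparison on the whole line.
with_last_resort=$(LC_ALL=C sort -n "$input")

# Numeric sort with -s: equal keys keep their input order.
stable=$(LC_ALL=C sort -n -s "$input")

# Keep only the lines whose numeric key is 12.2, joined onto one line each.
last_resort_group=$(printf '%s\n' "$with_last_resort" | grep '^12\.2\.' | paste -sd ' ' -)
stable_group=$(printf '%s\n' "$stable" | grep '^12\.2\.' | paste -sd ' ' -)

printf 'sort -n:    %s\n' "$last_resort_group"
printf 'sort -n -s: %s\n' "$stable_group"

rm -f "$input"
```

The first line should list the 12.2 entries in byte order, while the second should list them in the order they appear in the input file.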