Quick frequency tables in Unix
I found this here and thought it should be recorded.
If you have a single series of data (in my case, AS numbers) and you want a frequency count, how can you do that on the command line?
... pipe input here ... | sort | uniq -c | sort -r -n
This 1. sorts the incoming data, which uniq requires because it only collapses adjacent duplicate lines, 2. outputs each unique key prefixed by its frequency of occurrence, in lexicographical order of the keys, and 3. re-sorts that output numerically (-n) by frequency in descending order (-r), leaving you with something like:
$ grep -P " 33363$" rib.20101113.txt | awk '{print $2}' | sort | uniq -c | sort -r -n
1558 11686
 317 3549
 310 852
 310 3130
 162 8492
 162 8001
 162 7660
 162 7018
 162 701
 162 6762
 162 5413
 162 5056
 162 3356
 162 3257
 162 31500
 162 2905
 162 286
 162 13030
 162 1299
 156 6939
 155 812
 155 6539
 155 3561
 155 293
 155 2914
 155 2497
 155 2152
 155 1668
 155 1239
 155 1221
  76 3303
The first column is the frequency and the second column is the unique key in the source data stream.
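If you want to try the pipeline without a routing table dump to hand, here is a minimal sketch on a handful of made-up input lines. The function name freq_table and the sample values are my own, purely for illustration. The trailing awk only normalises the leading padding that uniq -c adds, which varies between implementations.

```shell
# Hypothetical helper wrapping the pipeline above: reads lines on stdin,
# prints "count key" pairs, most frequent first.
freq_table() {
    sort | uniq -c | sort -r -n
}

# Made-up sample input: six lines, three distinct keys.
# awk strips the implementation-dependent padding from uniq -c.
printf '%s\n' 701 3356 701 1299 701 3356 | freq_table | awk '{print $1, $2}'
# Prints:
# 3 701
# 2 3356
# 1 1299
```

Keys with equal counts are ordered by sort's fallback comparison of the whole line, reversed by -r, which is why the ties in the real output above appear in descending lexicographical order.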