Quick frequency tables in Unix
I found this here and thought it should be recorded.
If you have a single series of data (in my case, AS numbers) and you want a frequency count, how can you do that on the command line?
... pipe input here ... | sort | uniq -c | sort -r -n
This (1) sorts the incoming data, which uniq requires because it only collapses adjacent duplicate lines; (2) outputs each unique key prefixed by its number of occurrences, in lexicographical order of the key; and (3) re-sorts that output numerically by frequency in descending order, leaving you with something like:
$ grep -P " 33363$" rib.20101113.txt | awk '{print $2}' | sort | uniq -c | sort -r -n
1558 11686
317 3549
310 852
310 3130
162 8492
162 8001
162 7660
162 7018
162 701
162 6762
162 5413
162 5056
162 3356
162 3257
162 31500
162 2905
162 286
162 13030
162 1299
156 6939
155 812
155 6539
155 3561
155 293
155 2914
155 2497
155 2152
155 1668
155 1239
155 1221
76 3303
The first column is the frequency and the second column is the unique key from the source data stream.
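To try the pipeline without a routing table dump to hand, here is a minimal sketch that feeds it a few made-up keys. The function name freq_table and the sample values are assumptions for illustration only; they are not from the original data.

```shell
# Hypothetical helper: wraps the sort | uniq -c | sort -r -n pipeline.
freq_table() {
    sort | uniq -c | sort -r -n
}

# Made-up sample keys, one per line (not taken from the RIB file above).
result=$(printf '%s\n' 701 3356 701 1299 701 3356 | freq_table)

# Print the table: count first, then the key.
printf '%s\n' "$result"
```

With this input, 701 should come first with a count of 3, then 3356 with 2, then 1299 with 1. uniq -c pads the count with leading spaces, and the amount of padding varies between implementations, so the columns may be indented differently than in the listing above.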