Quick frequency tables in Unix

I found this here and thought it should be recorded.

If you have a single series of data (in my case, AS numbers) and you want a frequency count, how can you do that on the command line?

... pipe input here ... | sort | uniq -c | sort -r -n

This pipeline (1) sorts the incoming data so that identical keys sit on adjacent lines, as uniq requires; (2) collapses each run of identical keys into a single line prefixed by its count, still in the lexicographic key order produced by the first sort; and (3) re-sorts that output numerically by count, in descending order, leaving you with something like:


$ grep -P " 33363$" rib.20101113.txt | awk '{print $2}' | sort | uniq -c | sort -r -n

   1558 11686
    317 3549
    310 852
    310 3130
    162 8492
    162 8001
    162 7660
    162 7018
    162 701
    162 6762
    162 5413
    162 5056
    162 3356
    162 3257
    162 31500
    162 2905
    162 286
    162 13030
    162 1299
    156 6939
    155 812
    155 6539
    155 3561
    155 293
    155 2914
    155 2497
    155 2152
    155 1668
    155 1239
    155 1221
     76 3303

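The first column is the frequency and the second column is the unique key from the source data stream. Keys with the same frequency come out in reverse lexicographic order (852 before 3130, for example), because -r also reverses the whole-line comparison that sort falls back on to break ties.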
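If you want to try the pipeline without a routing table dump to hand, here is a minimal sketch on made-up input. The function name freq_table and the six sample values are my own, purely for illustration. The padding uniq -c puts in front of the count differs between implementations, so the exact spacing of the output may vary.

```shell
# Hypothetical helper: wraps the pipeline from above so it can be reused.
freq_table() {
  # sort groups identical keys, uniq -c counts each group,
  # and the final sort orders by count, highest first.
  sort | uniq -c | sort -r -n
}

# Made-up sample data, one key per line: 701 three times, 3549 twice, 852 once.
printf '%s\n' 701 3549 701 852 701 3549 | freq_table
```

This should list 701 first with a count of 3, then 3549 with 2, then 852 with 1. Wrapping the pipeline in a function is only a convenience; typing it inline works just as well.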
