Generating Average-Related Stats with Awk (BASH)
Awk can be pretty handy for quickly pulling out various statistics.
This snippet details how to pull out:
- Mean (Average)
- Median
- 95th Percentile
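For simplicity I've used $0 (the whole input line) where a column number would go, so the snippets assume one value per line.
Swap in $N to read column N instead; see the examples for more details.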
Details
- Language: BASH
Snippet
# Calculate Average (Mean)
awk 'BEGIN{t=0}{t=t+$0}END{print t/NR}'
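# 95th percentile (nearest-rank method) - input should be pre-sorted numerically (sort -n).
# int((NR*95 + 99) / 100) rounds NR*0.95 up, so the index always lands between 1 and NR
awk '{all[NR] = $0} END{print all[int((NR*95 + 99) / 100)]}'
# Median, also known as the 50th percentile. Input should be pre-sorted numerically.
# For an even number of lines this prints the lower of the two middle values
awk '{all[NR] = $0} END{print all[int((NR + 1) / 2)]}'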
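The median one-liner above picks a single element from the sorted input. A common alternative for an even number of values is to average the two middle ones. The following is a minimal sketch of that variant, not a replacement for the snippet above; the function name median is made up for illustration, and it assumes one number per line on stdin. It sorts the input itself, so pre-sorting is not needed.

```shell
# Hypothetical helper (the name "median" is an assumption, not part of the snippet).
# Reads one number per line on stdin and sorts numerically before picking the middle.
median() {
    sort -n | awk '
        { all[NR] = $0 }
        END {
            if (NR == 0) exit 1                      # no input: print nothing
            if (NR % 2)                              # odd count: the single middle value
                print all[(NR + 1) / 2]
            else                                     # even count: mean of the two middle values
                print (all[NR / 2] + all[NR / 2 + 1]) / 2
        }'
}
```

For example, printf '%s\n' 4 1 3 2 | median averages 2 and 3.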
Usage Example
# Calculate average based on the 4th column in a tab-separated file
cat file.csv | awk -F'\t' 'BEGIN{t=0}{t=t+$4}END{print t/NR}'
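# same as above, but 95th percentile (nearest rank). In bash, $'\t' is a literal tab;
# a bare \t would make sort split on the letter "t". awk needs -F'\t' too, so $4 matches sort's column 4
cat file.csv | sort -t$'\t' -k4,4n | awk -F'\t' '{all[NR] = $4} END{print all[int((NR*95 + 99) / 100)]}'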
# Calculate the median, but assume it's comma-separated this time and use column 2
cat file.csv | sort -t, -k2,2n | awk -F, '{all[NR] = $2} END{print all[int((NR + 1) / 2)]}'
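The three calculations can also be combined into a single pass over one column. This is only a sketch under a few assumptions: the function name stats and its argument order (file, column, separator) are invented for illustration, the separator defaults to a comma, and the median and 95th percentile use the nearest-rank rule (rounding the rank up), so an even-sized input yields the lower middle value.

```shell
# Hypothetical helper: stats FILE COLUMN [SEPARATOR]  (name and arguments are assumptions)
stats() {
    local file=$1 col=$2 sep=${3:-,}
    # Sort numerically on just the chosen column, then collect it in awk
    sort -t"$sep" -k"$col,$col"n "$file" | awk -F"$sep" -v c="$col" '
        { all[NR] = $c; t += $c }
        END {
            if (NR == 0) exit 1                  # empty input: avoid dividing by zero
            p50 = int((NR + 1) / 2)              # nearest rank for the median
            p95 = int((NR * 95 + 99) / 100)      # nearest rank for the 95th percentile
            printf "mean=%g median=%g p95=%g\n", t / NR, all[p50] + 0, all[p95] + 0
        }'
}
```

Called as stats file.csv 2 it reads column 2 of a comma-separated file; for a tab-separated file, pass $'\t' as the third argument.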