# Generating Average-related Stats with Awk

Published: 2020-01-21 16:04:39 +0000
Categories: BASH


### Description

Awk can be pretty handy for quickly pulling summary statistics out of a column of numbers.

This snippet details how to pull out:

• Mean (Average)
• Median
• 95th Percentile
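
For simplicity I've used `$0` (the whole line) in place of a column reference, so the snippets expect one number per line. To read a specific column, swap in its field reference such as `$4`; see the examples for more details.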
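
As a quick illustration of the difference between `$0` and a numbered field, here is a minimal sketch that computes the same mean two ways. The sample data and the variable names `whole` and `field` are made up for this example.

```shell
# One number per line: $0 (the whole line) is the value
whole=$(printf '10\n20\n30\n' | awk '{t += $0} END{print t/NR}')

# Several comma-separated fields per line: -F, sets the separator, $2 picks the column
field=$(printf 'a,10\nb,20\nc,30\n' | awk -F, '{t += $2} END{print t/NR}')

echo "$whole $field"
```

Both pipelines sum the same three values (10, 20, 30), so both report a mean of 20.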

### Snippet

``````
# Calculate Average (Mean)
awk 'BEGIN{t=0}{t=t+$0}END{print t/NR}'

# 95th percentile (nearest rank) - input should be pre-sorted.
# int((NR*95 + 99) / 100) rounds NR*0.95 up, so the index is always between 1 and NR
awk '{all[NR] = $0} END{print all[int((NR*95 + 99) / 100)]}'

# Median, also known as the 50th percentile. Input should be pre-sorted.
# With an even number of lines this prints the lower of the two middle values
awk '{all[NR] = $0} END{print all[int((NR + 1) / 2)]}'
``````
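
If all three figures are wanted at once, they can be gathered in a single pass. The following is a sketch rather than a definitive implementation: the function name `summarise` is hypothetical, it assumes one number per line on stdin, and it uses the nearest-rank rule (index = NR × p, rounded up) for both the median and the 95th percentile.

```shell
# summarise: hypothetical helper, reads one number per line from stdin
# and prints the mean, median and 95th percentile.
summarise() {
    sort -n | awk '
        { all[NR] = $0; t += $0 }      # remember every value, keep a running total
        END {
            if (NR == 0) exit          # nothing to report on empty input
            print "mean",   t / NR
            print "median", all[int((NR + 1) / 2)]          # NR * 0.50, rounded up
            print "p95",    all[int((NR * 95 + 99) / 100)]  # NR * 0.95, rounded up
        }'
}

seq 1 20 | summarise
```

For the numbers 1 to 20 this picks element 10 as the median and element 19 as the 95th percentile. Sorting inside the function means the caller does not have to remember to pre-sort.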

### Usage Example

``````
# Calculate the average of the 4th column in a tab-separated file
cat file.csv | awk -F'\t' 'BEGIN{t=0}{t=t+$4}END{print t/NR}'

# Same as above, but 95th percentile. sort needs a literal tab, hence $'\t' (bash)
cat file.csv | sort -t$'\t' -k4,4n | awk -F'\t' '{all[NR] = $4} END{print all[int((NR*95 + 99) / 100)]}'

# Calculate the median, but assume it's comma-separated this time and use column 2
cat file.csv | sort -t, -k2,2n | awk -F, '{all[NR] = $2} END{print all[int((NR + 1) / 2)]}'
``````
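
The column number and percentile can also be passed in as awk variables instead of being hard-coded. This is a sketch under a few assumptions: `percentile_of` is a made-up name, the file is tab-separated with no header row, and the percentile is a whole number from 1 to 100.

```shell
# percentile_of: hypothetical helper. Usage: percentile_of FILE COLUMN PERCENT
percentile_of() {
    # Sort numerically on the chosen column only, then pick the nearest-rank element
    sort -t"$(printf '\t')" -k"$2,$2n" "$1" |
        awk -F'\t' -v col="$2" -v p="$3" '
            { all[NR] = $col }
            END { if (NR) print all[int((NR * p + 99) / 100)] }'
}

# Build a small tab-separated sample file with the numbers in column 4
tmp=$(mktemp)
printf 'a\tb\tc\t%s\n' 30 10 50 20 40 > "$tmp"

percentile_of "$tmp" 4 50   # median of column 4
percentile_of "$tmp" 4 95   # 95th percentile of column 4

rm -f "$tmp"
```

With the five sample values, the median call prints 30 and the 95th percentile call prints 50. Using `printf '\t'` to produce the tab keeps the `sort` call portable to shells without `$'\t'` quoting.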

### Keywords

awk, stats, percentile, mean, median