PG Casts

Mean, Median, and Mode

Episode #34 hosted by Mary Lee

Hey everyone, today we're going to look at how to find the mean, median, and mode from column data in Postgres.

Postgres Version 11

Transcript

Hey everyone, today we're going to look at how to find the mean, median, and mode from column data in Postgres.

Mean

To get the mean, or average, value for a column, we can use the avg aggregate function. This function expects the name of the column we're trying to get the average of

select avg(age) from people;

Mode

To get the mode, or the most commonly occurring value in our column, we can use the ordered set aggregate function mode(). This aggregate function requires an ordered group to be specified in order for it to work properly. In our case, we're looking for the most common age in our group of data, so we're going to pass "within group" and then "order by age".

select mode() within group (order by age) from people;

One thing to note with the mode aggregate function is that if there are multiple equally occurring results, only the first is returned.

Median

To get the median value from our data set, we won't be using an aggregate function. Instead, we'll be making use of the “offset” clause. Since we're going to be using offset, it's important that we order our data by the age column, so that we're getting the true median value.

The offset clause enables us to find the median because it can accept subqueries as part of its logic. This means that we can use offset with a subquery to find the total count of entries in our table, and divide that in half to get midway through our data set.

Finally, we'll limit our response to just one, since we only care about the median value.

select age
from people
order by age
offset (select count(*) from people)/2
limit 1;

Thanks for watching!

Setup

create table people (id serial primary key, name varchar, age integer);

insert into people (name, age)
values ('Sam', 23), ('Sue', 34), ('Fred', 45), ('Emily', 56), ('Sally', 21), ('John', 34);