Ranges in Postgres
Episode #16 hosted by Jack Christensen
In this episode we will examine range types in PostgreSQL and how they can simplify queries and improve data integrity.
Postgres Version 9.5
A range type represents a range of values, for example 1 through 5 or January 10 through January 12. In a database without range types it is necessary to represent a range as two columns such as start_date
and end_date
. While this can work, it also can be difficult to work with. PostgreSQL range types simplify this dramatically. Let's look at some of the features we get with ranges.
First, let's use a range constructor function to build a range.
select int4range(1,5);
-- int4range
-- -----------
-- [1,5)
-- (1 row)
To find if a value is included in a range use the @>
inclusion operator.
select int4range(1,5) @> 3;
-- ?column?
-- ----------
-- t
-- (1 row)
select int4range(1,5) @> 6;
-- ?column?
-- ----------
-- f
-- (1 row)
We can also check if ranges overlap.
select int4range(1,5) && int4range(3,7);
-- ?column?
-- ----------
-- t
-- (1 row)
select int4range(1,5) && int4range(6,7);
-- ?column?
-- ----------
-- f
-- (1 row)
But what about the bounds of the range? Are they inclusive or exclusive?
select int4range(1,5) @> 1;
-- ?column?
-- ----------
-- t
-- (1 row)
The lower bound is inclusive.
select int4range(1,5) @> 5;
-- ?column?
-- ----------
-- f
-- (1 row)
The upper bound is exclusive.
This can be changed by passing a third argument to the range constructor. This third argument is a two-character string with the first character standing for the lower bound type and the second character standing for the upper bound type. Square brackets ([
and ]
) mean inclusive and parenthesis ((
and )
) mean exclusive.
select int4range(1,5, '[]') @> 5;
-- ?column?
-- ----------
-- t
-- (1 row)
The default of inclusive lower and exclusive upper bounds is typically what you want.
This works nicely when testing for overlaps as it lets ranges just touch without overlap.
select int4range(1,5) && int4range(5,7);
-- ?column?
-- ----------
-- f
-- (1 row)
Ranges can also be made unbounded by using null as one of the edges.
select int4range(3,null) @> 42;
-- ?column?
-- ----------
-- t
-- (1 row)
Using a range instead of two separate columns also provides additional data integrity because it prevents ranges where the lower and upper boundaries are reversed.
select int4range(5,3);
-- ERROR: range lower bound must be less than or equal to range upper bound
In addition, PostgreSQL supports constraints that can prevent overlapping ranges from being stored. Suppose we are building a database to handle hotel reservations. A simple reservations
table would have the room and date range.
create table reservations(
id serial primary key,
room varchar not null,
dates daterange not null,
exclude using gist (room with =, dates with &&)
);
Here's how to read the exlusion constraint. The first word exclude
means exclude any row from matching any other row. using gist
means that the underlying index type is GiST which stands for Generalized Search Tree. The expressions in the parenthesis are the match conditions. First check the room column with the equal operator (=
). Then check for dates with the overlap operator (&&
).
Whoops. PostgreSQL is giving us an error.
-- ERROR: data type character varying has no default operator class for access method "gist"
-- HINT: You must specify an operator class for the index or define a default operator class for the data type.
The problem is the GiST index does not define the =
operator for varchar
. We need to install the btree_gist
extension.
create extension btree_gist;
create table reservations(
id serial primary key,
room varchar not null,
dates daterange not null,
exclude using gist (room with =, dates with &&)
);
This extension defines this and many other operators for GiST indexes.
Now let's try this out. First, we will insert a row for room 101.
insert into reservations(room, dates) values ('101', daterange('2016-11-01', '2016-11-10'));
We can insert a row with overlapping dates for a different room.
insert into reservations(room, dates) values ('101', daterange('2016-11-01', '2016-11-10'));
But we can't insert a row with overlapping dates for the same room.
insert into reservations(room, dates) values ('101', daterange('2016-11-07', '2016-11-15'));
As a bonus, not only does the exclusion constraint improve data integrity, the underlying index also can dramatically improve performance when searching with the overlap operator.
Thanks for watching.