Generating Fake Email Addresses
Episode #4 hosted by Josh Branchaud
In this episode, we will see how we can quickly generate a bunch of fake emails with nothing more than a fancy Postgres statement.
Postgres Version 9.5
Sometimes you just need a bunch of fake data. Perhaps you want to compare the relative performance of various queries or maybe you need a big table to try out that new Postgres feature.
In this episode, we will see how we can quickly generate a bunch of fake emails with nothing more than a fancy Postgres statement.
We will work with a users
table that has an email address field.
create table users (
id serial primary key,
email varchar not null unique
);
If we just need 2 or 3 records, then writing a couple insert statements should suffice. However, what if we need a lot of records, like 10,000 records?
To start, we need 10,000 of something. The generate_series()
function will help us with that.
select generate_series(1,10000);
That gives us 10,000 integers. Now we need a way to turn those integers into emails. We can do that with some string concatenation.
select 'person' || num || '@example.com'
from generate_series(1,10000) as num;
Great. We have 10,000 unique, fake emails. The next step is to insert them into our users
table. For this we can use an insert
statement with a select
clause.
insert into users (email)
select 'person' || num || '@example.com'
from generate_series(1,10000) as num;
We can run the table
command to see that everything was inserted as expected.
table users;
We can even take this all a bit further by adding some variation. How about some random email host names? We can do this with a little more string concatenation and a case statement nested in a subquery.
select
'person' || num || '@' ||
(case (random() * 2)::integer
when 0 then 'gmail'
when 1 then 'hotmail'
when 2 then 'yahoo'
end) || '.com'
from generate_series(1,10000) as num;
As each row from the generate_series
is processed, we will get a new random number from 0 to 2. The case statement then produces one of the three host names based on that random number. This of course gets concatenated into the rest of the email string.
I leave the task of actually inserting these fancier emails into our table as an exercise for the viewer.
That's it for this episode. Thanks for watching.