Counting Weekends between Two Dates in PostgreSQL
I found myself the problem of counting the occurrence of specific "days of the week" between two dates; for, of course, generating features for a predictive analysis task. For example, the number of Fridays and Saturdays between 2019-01-01 and 2019-01-15. And thankfully, good old PostgreSQL came to the rescue!
Introducing generate_series
SELECT * FROM generate_series(1, 5);
generate_series1
2
3
4
5
(5 rows)
And, if we want to spice it up by setting the interval, we can do that too! Let's make a list of all the ODD numbers between 1 and 10 (inclusive).
SELECT * FROM generate_series(1, 10, 2);
generate_series1
3
5
7
9
(5 rows)
generate_series but with Dates
Let's try just plugging in Date Strings.
SELECT * FROM generate_series('2019-01-01', '2019-01-15');
[42725] ERROR: function generate_series(unknown, unknown) is not unique Hint: Could not choose a best candidate function. You might need to add explicit type casts.
Clearly, something is off. Ah, we didn't cast those values as Dates. Of course. Let's do it.
SELECT * FROM generate_series('2019-01-01'::date, '2019-01-15'::date);
[42883] ERROR: function generate_series(date, date) does not exist Hint: No function matches the given name and argument types. You might need to add explicit type casts. Position: 15
Okay that did identify our value as a date, but it still isn't working. The thing is, we need to provide the interval explicitly. Let's do it!
SELECT *
FROM generate_series('2019-01-01'::date, '2019-01-15'::date, '1 day'::interval);
generate_series
2019-01-01 00:00:00+06
...
2019-01-15 00:00:00+06
(15 rows)
Alright! That worked like a charm! Of course, let's make it a bit prettier.
SELECT s::date
FROM generate_series('2019-01-01', '2019-01-15', '1 day'::interval) s;
s
2019-01-01
...
2019-01-15
(15 rows)
Sweet! We can now generate the ranges required.
The Week of Day
I'll let the query and the results talk in this bit.
SELECT
s::date,
extract(DOW from s),
to_char(s, 'day')
FROM generate_series('2019-01-01', '2019-01-15', '1 day'::interval) s;
s | date_part | to_char
------------+-----------+-----------
2019-01-01 | 2 | tuesday
2019-01-02 | 3 | wednesday
2019-01-03 | 4 | thursday
2019-01-04 | 5 | friday
2019-01-05 | 6 | saturday
2019-01-06 | 0 | sunday
...
2019-01-14 | 1 | monday
2019-01-15 | 2 | tuesday
(15 rows)
Starting to make sense now? We can get a numeric value for the day of the week by extracting DOW
from the interval. And getting the string representation is just as easy as calling to_char
. Let's put a weekend constraint now. Check the mapping of days of the week on the table posted above. That's how we got 5
and 6
; representing Friday
and Saturday
.
SELECT s::date,
extract(DOW from s),
to_char(s, 'day')
FROM generate_series('2019-01-01', '2019-01-15', '1 day'::interval) s
WHERE extract(DOW from s) in (5, 6);
s | date_part | to_char
------------+-----------+-----------
2019-01-04 | 5 | friday
2019-01-05 | 6 | saturday
2019-01-11 | 5 | friday
2019-01-12 | 6 | saturday
(4 rows)
That's super cool! Now, let's COUNT
!
SELECT COUNT(*)
FROM generate_series('2019-01-01', '2019-01-15', '1 day'::interval) s
WHERE extract(DOW from s) in (5, 6);
Voila!
With a Table
SELECT id,
SUM(CASE WHEN extract(dow from s) IN (5, 6) THEN 1 ELSE 0 END)
FROM tbl
JOIN generate_series(
tbl.starts_on,
tbl.ends_on,
'1 day'::interval) s ON true
GROUP BY id;
So this was my clumsy implementation, that I came up with. I am pretty sure there are better ways to pull this off. Why don't you leave your thoughts and solutions in the comments? And please don't forget to share if you learned anything new! Thanks for dropping by.