1 This day is in which week of its month?
A user wanted a variable to indicate in which week of its month a daily date fell. That question is a challenge both to imagine different definitions of weeks within months and to produce Stata code for each interpretation. It serves as a reminder that whatever date and time problems have been solved, there are still plenty more. See, for example, Cox (2010, 2012a,b, 2018b, 2019) for some previous notes in this territory.
Even if you lack interest in this specific question, it serves as an example for showing problem-solving skills in using Stata.
In what follows, we go no further than a standard Western calendar in which the months of each year are January to December. Much more on calendars can be found in standard references such as Blackburn and Holford-Strevens (1999) and Reingold and Dershowitz (2018). Unlike years, days, and months, weeks have no physical (meaning, astronomical) basis, but they do have many different associations and implications, ranging from mythological and religious to economic and cultural (for example, Henkin [2021]).
Surprisingly or not, Stata’s built-in week functions are unlikely to be part of the answer. Stata’s definition of a week is idiosyncratic. Week 1 always starts on 1 January, week 2 always starts on 8 January, and so on with 7-day weeks, until week 52 ends on 31 December and is 8 or 9 days long depending on whether the year is a leap year. There is no week 53 in this scheme. Stata’s definition has one distinct advantage, which is that weeks always nest inside years without ever spanning two years. Otherwise, this definition seems rarely used outside Stata, and the rest of this tip is based on the assumption that you are using some different definition of a week.
We will work with a sandbox dataset with daily dates for the first three months of 2021. It is mental arithmetic to check that 2021 was not a leap year so that 90 = 31 + 28 + 31 observations will give us the right dataset size for those months.
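One way to build such a sandbox is sketched below. The variable name date is an assumption made for illustration, and the sketch anchors on mdy(12, 31, 2020), the day before the first date wanted, so that adding the observation number _n yields 1 January to 31 March 2021.

```stata
* Sandbox: one observation per day, 1 January to 31 March 2021
clear
set obs 90
* mdy(12, 31, 2020) is the day before the first date wanted; _n runs 1 to 90
generate date = mdy(12, 31, 2020) + _n
format date %td
* Look at the first and last dates as a check
list date in 1
list date in 90
```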
For more on
2 Daily and monthly dates introduced
Let me first recount some basics to help readers quite new to handling dates like these. If you are already broadly familiar with dates, you can skip and skim to the next section.
In Stata, dates are held as integers. Stata holds daily and monthly dates as integers, with the origin 0 as the first possible date in 1960. We have already used
It is 22280 on a scale in which 1 January 1960 is 0. Stata really does not expect mental arithmetic from you to work out in reverse that 22280 means 31 December 2020, which leads to the next point.
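As a small sketch of that point, display can show the integer behind a daily date. Both calls use mdy(), which takes month, day, and year in that order.

```stata
* The stored value behind 31 December 2020
display mdy(12, 31, 2020)
* The origin of the scale: 1 January 1960 is stored as 0
display mdy(1, 1, 1960)
```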
Date display formats show dates conventionally. It is the job of date display formats to show you what a date means conventionally. A format has nothing to do with what is stored (which here remains integers, like 22280) but, as said, only with what is displayed when you ask for it. Again,
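A minimal illustration, assuming only the daily date format %td: display accepts a format before the expression, so the same integer can be shown as a conventional date without anything stored being changed.

```stata
* Show the integer 22280 as a daily date
display %td 22280
* The same, starting from the date function
display %td mdy(12, 31, 2020)
```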
Special date functions are key. I used the function
shows that, regardless of whatever display format is applied elsewhere, the monthly date for January 2021 is to Stata an integer, 732. Recall the principle that monthly dates too are stored as integers counting from 0, which is the first possible date in 1960, here January 1960.
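The monthly case can be sketched in the same way. The function ym() takes a year and a month and returns a monthly date, and %tm is the corresponding display format.

```stata
* The stored value behind January 2021
display ym(2021, 1)
* The same value shown as a monthly date
display %tm ym(2021, 1)
* The origin of the monthly scale: January 1960 is stored as 0
display ym(1960, 1)
```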
Changing the date format does not change the date. It is often misunderstood, so let’s underline that changing the date display format is never a way to convert from one kind of date to another (Cox 2012c). Short of setting up your own calculation, you need a conversion function such as mofd(), which converts a daily date to a monthly date.
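The contrast can be sketched on the sandbox dates. The variable names date and mdate are assumptions made for illustration. The conversion is done by mofd(), and only then is a monthly display format applied to the new variable.

```stata
* Sandbox: daily dates for January to March 2021
clear
set obs 90
generate date = mdy(12, 31, 2020) + _n
format date %td
* Convert daily dates to monthly dates with a function
generate mdate = mofd(date)
format mdate %tm
list date mdate in 1
* By contrast, a monthly format applied to a daily date converts nothing:
* the daily integer is just read as a count of months since January 1960,
* so this shows a month far in the future, not January 2021
display %tm mdy(1, 1, 2021)
```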
We can now focus on the title problem.
3 Week 1 is days 1 to 7, and so on
The simplest definitions of each week of the month are that week 1 is days 1 to 7, week 2 days 8 to 14, and so on. On these definitions, there are usually 5 weeks in each month, and when that is true, the last week is 1, 2, or 3 days long depending on whether the month is 29, 30, or 31 days long. The exceptions are when the month is February in a nonleap year, so the number of days in the month is then 28, and there are 4 complete weeks.
Extracting day of month from a daily date is a common and fundamental need, so you should expect there to be a function provided to yield this directly. It is day().
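A one-line sketch of this definition follows, assuming the sandbox variable date and a new variable name week1 chosen for illustration. It combines day() with ceil(), which rounds up to the nearest integer, so that days 1 to 7 map to 1, days 8 to 14 map to 2, and so on.

```stata
* Sandbox: daily dates for January to March 2021
clear
set obs 90
generate date = mdy(12, 31, 2020) + _n
format date %td
* Day of month, divided by 7, rounded up
generate week1 = ceil(day(date) / 7)
```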
That is it. From the inside outward, we read off
You might want to take that more slowly. You could put the result of
Then, you could divide and round up:
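A sketch of the slower route, with dom as a hypothetical name for the day-of-month variable:

```stata
* Sandbox: daily dates for January to March 2021
clear
set obs 90
generate date = mdy(12, 31, 2020) + _n
format date %td
* Step 1: day of month in its own variable
generate dom = day(date)
* Step 2: divide by 7 and round up
generate week1 = ceil(dom / 7)
```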
Indeed, the last operation could also be split into two steps. You should certainly rewrite this way if you need a separate variable holding day of the month. It is also a good idea if the code is thereby clearer to you or anyone else needing to understand it.
It is prudent to check results. A programmer can rarely be too careful, but looking at a table is easy.
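One such table, sketched under the same assumed variable names (date, week1, and a monthly date mdate), cross-tabulates week of month against month:

```stata
* Sandbox: daily dates for January to March 2021
clear
set obs 90
generate date = mdy(12, 31, 2020) + _n
format date %td
generate week1 = ceil(day(date) / 7)
* Monthly date to serve as the column variable
generate mdate = mofd(date)
format mdate %tm
* Each cell counts the days falling in that week of that month
tabulate week1 mdate
```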
That is as it should be. There are 4 complete weeks and 3 days left over for January and March (31 days), and there are 4 complete weeks for February (28 days in a nonleap year). An even more careful check would be to see whether the values 1 to 5 occur when they should and whether a leap year example works too.
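That more careful check could be sketched with a second sandbox covering a leap year. The choice of 2020 and the use of assert are my assumptions for illustration; assert stops with an error if its condition is ever false.

```stata
* Second sandbox: every day of 2020, a leap year with 366 days
clear
set obs 366
generate date = mdy(12, 31, 2019) + _n
format date %td
generate week1 = ceil(day(date) / 7)
* Only the values 1 to 5 should ever occur
assert inrange(week1, 1, 5)
* Week 5 should occur only for days 29 to 31
assert (week1 == 5) == (day(date) >= 29)
* 29 February 2020 should be the only February day in week 5
count if week1 == 5 & month(date) == 2
assert r(N) == 1
```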
4 A week starts on a Sunday, and so on
Another class of definitions might include a week starting on Sunday and ending on Saturday or starting on Monday and ending on Sunday. Do not rule out a weird-seeming definition that some group uses somewhere. My own university for many years had a convention that teaching weeks ran Thursday to Wednesday in the first term of the academic year and Monday to Friday in the other terms. The offset was to accommodate welcomes and whatnot at the beginning of each academic year. Admitting different start days may seem dismaying: do we need code for seven cases? Fortunately, one trick covers all.
Stata has a day-of-week function, dow(), which returns 0 for Sunday, 1 for Monday, and so on to 6 for Saturday.
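A sketch of the trick for weeks starting on Sunday follows. The names mdate and week2 are assumptions made for illustration. Within each month, with observations sorted by date, sum() keeps a running count of how many times the condition dow(date) == 0 has been true so far, so the counter steps up by 1 on every Sunday.

```stata
* Sandbox: daily dates for January to March 2021
clear
set obs 90
generate date = mdy(12, 31, 2020) + _n
format date %td
* Monthly date to define the groups
generate mdate = mofd(date)
format mdate %tm
* Within each month, in date order, count the Sundays seen so far
bysort mdate (date): generate week2 = sum(dow(date) == 0)
```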
That is the essence. Ensuring that observations are in the right order, we use
If you are fuzzy about what
This is perhaps the trickiest point of the entire tip, so let us follow through what happens. At the start of each month,
How does that pan out in the sandbox dataset? Again, a cross-tabulation is simple.
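Such a cross-tabulation might be sketched as follows, again with the assumed names date, mdate, and week2:

```stata
* Sandbox: daily dates for January to March 2021
clear
set obs 90
generate date = mdy(12, 31, 2020) + _n
format date %td
generate mdate = mofd(date)
format mdate %tm
bysort mdate (date): generate week2 = sum(dow(date) == 0)
* Each cell counts the days falling in that week of that month
tabulate week2 mdate
```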
In 2021, the first Sundays in each month were on 3 January, 7 February, and 7 March. Correspondingly, each month started with an incomplete week, which is coded 0, zero values being recorded because no Sundays had been observed so far. As it happens, August 2021 started on a Sunday, so no week 0 would be defined then.
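Calendar facts like these are easy to confirm with dow(), as in this small sketch. A result of 0 means that the date was a Sunday.

```stata
* First Sundays of the sandbox months
display dow(mdy(1, 3, 2021))
display dow(mdy(2, 7, 2021))
display dow(mdy(3, 7, 2021))
* 1 August 2021, the start of a month that began on a Sunday
display dow(mdy(8, 1, 2021))
```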
Now twists on the idea should seem easy.
Starting on Monday or any other day of the week just means testing for a different result from
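For example, weeks starting on Monday could be sketched by testing for 1 rather than 0. The variable name week2m is hypothetical.

```stata
* Sandbox: daily dates for January to March 2021
clear
set obs 90
generate date = mdy(12, 31, 2020) + _n
format date %td
generate mdate = mofd(date)
format mdate %tm
* dow() returns 1 for Monday, so count the Mondays seen so far in each month
bysort mdate (date): generate week2m = sum(dow(date) == 1)
```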
Wanting the numbering of weeks to start at 1 just means adding 1 in calculating
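One way to do that is sketched below; it is an illustration under assumed names (week2 for the counter that may start at 0, week3 for the renumbered version) and not necessarily the only way to write it. Under by:, week2[1] is the counter's value on the first day of each month, and the expression (week2[1] == 0) evaluates to 1 when that is true and 0 otherwise.

```stata
* Sandbox: daily dates for January to March 2021
clear
set obs 90
generate date = mdy(12, 31, 2020) + _n
format date %td
generate mdate = mofd(date)
format mdate %tm
bysort mdate (date): generate week2 = sum(dow(date) == 0)
* Add 1 throughout a month only if that month's counter starts at 0
by mdate: generate week3 = week2 + (week2[1] == 0)
```

Generating a new variable, rather than replacing week2 in place, keeps the reference value week2[1] fixed while the calculation runs through each month.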
Notice the conditioning there. We want to bump up the counter if and only if the counter starts at 0.
5 Small morals
Given the extraordinary range of calendars and calendar practices across countries and subject areas, there could be yet other definitions. Yet we stop there and close with some small but standard morals to be drawn from the tale.
Dates are complicated. The entire rigmarole of dates (to say nothing of times too, but those are not an issue here) is a complicated mess. From your early education onward, you have been exposed repeatedly to the rules you usually need to know, but handling dates in software like Stata poses a new set of challenges. Some software handles dates through one or more distinct data or variable types. Stata’s decision that dates and times are just integers simplifies much, but it still implies details about formats and functions that may need mastering.
Functions are helpful. The key to success here is knowing your functions, whether they are date functions that are what you need, such as
Sandboxes are useful. Often, I witness people who have read large and complicated datasets into memory and are struggling to find the syntax they need. While that is the immediate problem, it can be a good idea to back up and create a small sandbox dataset, as we did here. You can get results quickly and compare easily with whatever you know to be the right answer. Other neglected ideas include using
Supplemental Material
Supplemental Material, sj-zip-1-stj-10.1177_1536867X221083928 for Stata tip 145: Numbering weeks within months by Nicholas J. Cox in The Stata Journal
