Research

Autoregression as a means of assessing the strength of seasonality in a time series

1 Department of Family and Community Medicine, University of Toronto, 256 McCaul Street, 2nd Floor, Toronto, ON, Canada M5T 2W5
2 Primary Care Research Unit, Sunnybrook and Women's College Health Sciences Centre, 2075 Bayview Avenue, #E-349, Toronto, ON, Canada M4N 3M5
3 Department of Public Health Sciences, University of Toronto, McMurrich Building, 12 Queen's Park Crescent W., Toronto, ON, Canada M5S 1A8
4 Institute for Clinical Evaluative Sciences, 2075 Bayview Avenue, Toronto, ON, Canada M4N 3M5
5 Health Policy Management and Evaluation, University of Toronto, McMurrich Building, 2nd Floor, 12 Queen's Park Crescent West, Toronto, ON, Canada M5S 1A8
6 Faculty of Pharmacy, University of Toronto, 19 Russell Street, Toronto, ON, Canada M5S 2S2
Population Health Metrics 2003, 1:10 doi:10.1186/1478-7954-1-10
The electronic version of this article is the complete one and can be found online at: http://www.pophealthmetrics.com/content/1/1/10
© 2003 Moineddin et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.

Abstract

Background
The study of the seasonal variation of disease is receiving increasing attention from health researchers. Available statistical tests for seasonality typically indicate the presence or absence of statistically significant seasonality but do not provide a meaningful measure of its strength.

Methods
We propose the coefficient of determination of the autoregressive regression model fitted to the data ($R^2_{\mathrm{Autoreg}}$) as a measure for quantifying the strength of the seasonality. The performance of the proposed statistic is assessed through a simulation study and using two data sets known to demonstrate statistically significant seasonality: atrial fibrillation and asthma hospitalizations in Ontario, Canada.

Results
The simulation results showed the power of the $R^2_{\mathrm{Autoreg}}$ […]

Conclusions
For the purposes of health services research, evidence of the statistical presence of seasonality is insufficient to determine the etiologic, clinical and policy relevance of findings. Measurement of the strength of the seasonal effect, as can be determined using the $R^2_{\mathrm{Autoreg}}$ […]

Background
Seasonality is an important component of disease manifestation. The presence of predictable seasonality is a clue to the possible etiology of disease, be it from microbial, environmental or social factors. Understanding seasonality is also essential for setting rational policy, particularly with respect to planning for seasonal demands for health services.

Several statistical methods are available for studying seasonality, ranging from simple graphical techniques to more advanced statistical methods. Additionally, autocorrelation functions can be examined to assess the regularity of periodicity or seasonality. Several statistical tests have been introduced for studying the cyclical variation of time series data. For example, Edwards [1] developed a statistical test that locates weights corresponding to the number of observed cases for each month at 12 equally spaced points on a circle.
The test is said to be significant if the calculated centre of mass deviates significantly from the circle's centre. Jones et al [2] developed a test for determining whether incidence data for two or more groups have the same seasonal pattern. Further, Marrero [3] compared the performance of several tests for seasonality by simulation; the results can be used as a guideline for selecting an appropriate test for a given data set based on the size of the data set and the shape of the sinusoidal curve. To apply any of these tests, however, observations must be aggregated into 12 monthly data points.

Several alternative tests, which do not require aggregated data, have also been developed. These include Fisher's Kappa (FK), which tests whether the largest periodogram ordinate is statistically different from the mean of the periodogram ordinates; Bartlett's Kolmogorov-Smirnov (BKS) test, which compares the normalized cumulative periodogram with the cumulative distribution function of a uniform (0, 1) random variable; and the X-11 procedure as used by the census bureaus in the United States and Canada [2,4-11]. These tests use the frequency and time domains to detect seasonality. Each test indicates the presence or absence of statistically significant seasonality, but none provides a sense of the magnitude of seasonality or of how much variance in the data is explained by seasonal occurrence. This is particularly important in health care, as the presence of statistically significant seasonality may not translate into either etiologic or policy relevance.
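The two periodogram-based statistics described above can be illustrated with a minimal sketch. It assumes the periodogram is evaluated at the Fourier frequencies k = 1, ..., m with m = (n - 1) // 2, and it compares the normalized cumulative periodogram with j / (m - 1), which is only one of several conventions used in statistical software. The function names are hypothetical, and the sketch returns the test statistics only, not their p-values.

```python
import math


def periodogram(series):
    """Periodogram ordinates at Fourier frequencies k = 1..m, m = (n - 1) // 2.

    The scaling constant (2 / n) is an assumption; it cancels in both
    statistics computed below.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [y - mean for y in series]
    m = (n - 1) // 2
    ordinates = []
    for k in range(1, m + 1):
        w = 2.0 * math.pi * k / n
        a = sum(y * math.cos(w * t) for t, y in enumerate(centered))
        b = sum(y * math.sin(w * t) for t, y in enumerate(centered))
        ordinates.append((a * a + b * b) * 2.0 / n)
    return ordinates


def fishers_kappa(ordinates):
    """Largest periodogram ordinate divided by the mean ordinate."""
    return max(ordinates) / (sum(ordinates) / len(ordinates))


def bartlett_ks(ordinates):
    """Largest absolute gap between the normalized cumulative periodogram
    and the uniform (0, 1) distribution function.

    Assumed convention: the first m - 1 cumulative sums are compared
    with j / (m - 1).
    """
    total = sum(ordinates)
    m = len(ordinates)
    running = 0.0
    gap = 0.0
    for j, value in enumerate(ordinates[:-1], start=1):
        running += value
        gap = max(gap, abs(running / total - j / (m - 1)))
    return gap


# Ten years of a purely sinusoidal monthly series (illustrative data only).
demo_series = [10.0 + 3.0 * math.cos(2.0 * math.pi * t / 12.0) for t in range(120)]
demo_ordinates = periodogram(demo_series)
demo_kappa = fishers_kappa(demo_ordinates)
demo_bks = bartlett_ks(demo_ordinates)
```

Both statistics grow as the variance concentrates at a single frequency, but neither reports what share of the total variation is seasonal, which is the gap the text identifies.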
In an effort to address the shortcomings of existing statistical methods, we propose the application of autoregressive regression models as a means of assessing how accurately a new observation can be predicted by stable seasonal variation (seasonal factors that are constant over time), and we use this to quantify the strength of the seasonality within a set of serially correlated observations. In classical regression analysis the coefficient of determination, $R^2$, is a standard statistical tool for estimating the proportion of the total variation of the dependent variable that can be explained by the explanatory variables. A crucial assumption in standard regression is that observations are independent of one another. However, time series observations can be serially autocorrelated, and this correlation must be taken into account. Autoregressive regression models are a natural generalization of standard regression models for analyzing correlated data. For monthly data, one can use dummy variables for months as the only predictor in a regression model and then, after correcting for the autocorrelation, calculate the coefficient of determination, $R^2_{\mathrm{Autoreg}}$. The coefficient of determination, which lies between 0 and 1, can be used as a measure of the strength of the stable seasonality because it measures how well the next value can be predicted using month as the only predictor. When […]

The purpose of this paper is to evaluate the utility of the R-squared autoregression in explaining variance when assessing stable seasonality. To this end, we examined the performance of the R-squared autoregression through a simulation study and using two data sets known to demonstrate statistically significant weak and strong seasonality: monthly hospitalizations for atrial fibrillation and asthma.
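The idea of using month as the only predictor can be sketched with the ordinary coefficient of determination. With month dummies as the sole regressors, the least-squares fitted value for each observation is simply the mean of its calendar month, so no matrix algebra is needed. The sketch below deliberately omits the correction for autocorrelation described above, so it illustrates the uncorrected $R^2$ rather than the autoregressive statistic proposed in this paper; the function name and the example data are hypothetical.

```python
def seasonal_r_squared(series, period=12):
    """Ordinary R^2 of a regression on seasonal dummy variables.

    The fitted value for each observation is the mean of all observations
    sharing its position in the cycle. No autocorrelation correction is
    applied, so this is NOT the autoregressive R^2 of the paper.
    """
    n = len(series)
    if n < period:
        raise ValueError("need at least one full cycle of observations")
    grand_mean = sum(series) / n
    sums = [0.0] * period
    counts = [0] * period
    for t, y in enumerate(series):
        sums[t % period] += y
        counts[t % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    total_ss = sum((y - grand_mean) ** 2 for y in series)
    if total_ss == 0.0:
        raise ValueError("series is constant; R^2 is undefined")
    error_ss = sum((y - means[t % period]) ** 2 for t, y in enumerate(series))
    return 1.0 - error_ss / total_ss


# Two years of illustrative data: a fixed monthly pattern (0..11)
# shifted up by 1 in the first year and down by 1 in the second.
pattern = list(range(12))
demo_series = [p + 1.0 for p in pattern] + [p - 1.0 for p in pattern]
demo_r2 = seasonal_r_squared(demo_series)
```

A value near 1 indicates that the calendar month alone reproduces the series closely; a value near 0 indicates that knowing the month adds little.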
Methods

Statistical methods
The autoregressive linear regression model for monthly observations is defined as:

$$Y_t = X_t \beta + \varepsilon_t$$
$$\varepsilon_t = -\varphi_1 \varepsilon_{t-1} - \varphi_2 \varepsilon_{t-2} - \dots - \varphi_p \varepsilon_{t-p} + e_t$$
$$e_t \sim N(0, \sigma^2)$$

where $Y_t$ is the observed time series, $X_t$ is the design matrix (a k × 12 matrix of 0 and 1), $\beta = (\mu, \beta_1, \dots, \beta_{11})'$ is the vector of parameters, and $\varepsilon_t$ is the error term, which follows an autoregressive model of order p. We also assume that $e_t$ is normally and independently distributed with mean zero and variance $\sigma^2$. […]

Simulation
In order to assess the performance of the proposed $R^2_{\mathrm{Autoreg}}$ […] We simulated 1000 replications of monthly observations over 10 years from the following model:
and calculated the $R^2_{\mathrm{Autoreg}}$ […]

Table 1. Simulation results

When $\alpha$ and $\varphi_{12}$ are zero, […] To investigate the linkages between the […]

In order to apply the autoregression procedure to actual data, it is important to eliminate nonstationarity in the mean and variance of the observations. For the data used in this study, the Dickey-Fuller unit root test [14] was used to test the stationarity of the series and to determine the order of differencing required for a nonstationary series. The SAS procedure AUTOREG was used for calculating […]

Data Sources
The data were derived from two retrospective, population-based, cross-sectional time series studies assessing temporal patterns in all discharge separations for asthma (from April 1, 1988 to March 31, 2000) and atrial fibrillation (from April 1, 1988 to March 31, 2001) for the population of Ontario. Approximately 14 million residents of Ontario, Canada, eligible for universal health care coverage during this time were included in the analysis. The database used was the Canadian Institute for Health Information (CIHI) Discharge Abstract Database, which records discharges from all Ontario acute care hospitals. All records with a most responsible discharge diagnosis of atrial fibrillation (ICD-9 code: 427.3) or asthma (ICD-9 code: 493) were selected. The numerator consisted of the total number of discharge separations for each month. Denominators were constructed from annual census data provided by Statistics Canada for each age group for residents of Ontario. Monthly population estimates were created through linear interpolation.

Results

Atrial fibrillation
Overall, there were 90,199 (45,477 female and 44,472 male) discharge separations for all ages. Figure 1 shows the monthly rates of admission per 100,000 population. There is a conspicuous upward trend in admissions over the first four years (Figure 1). Visual inspection does not support conspicuous seasonality.
However, after applying first-order differencing, the Dickey-Fuller test confirmed the stationarity of the differenced series. After differencing, the data do not show strong evidence of statistically significant seasonality. Bartlett's Kolmogorov-Smirnov (BKS) statistics for both genders combined, females, and males are 0.327, 0.308, and 0.315, with p-values all less than 0.0001. The Fisher's Kappa (FK) test statistics are 7.17 (0.01 < p-value < 0.05), 5.78 (not significant), and 7.98 (0.01 < p-value < 0.05). The calculated […]

Asthma
In total, there were 206,561 (104,283 female and 102,278 male) asthma hospital discharges for all ages. Figure 2 shows the monthly rates of asthma per 100,000 population. Visual inspection of Figure 2 shows a clear seasonal pattern, with an autumn peak and a summer trough, occurring every year over the 12-year period. The Dickey-Fuller unit root test confirms that the series is stationary. The results of the seasonality tests applied to the rates demonstrate statistically significant seasonality. The Fisher's Kappa (FK) test statistics for both genders combined, females, and males are 21.22, 22.05, and 20.52, with p-values all less than 0.01. The Bartlett Kolmogorov-Smirnov (BKS) test statistics are 0.512, 0.518, and 0.489, with p-values all less than 0.0001. The calculated […]
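The data preparation steps described in the Data Sources and Results sections (linear interpolation of annual census counts to monthly denominators, conversion to rates per 100,000 population, and first-order differencing) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used in the study: each annual count is assumed to apply to the first month of its year, the function names are hypothetical, and the numbers are invented for demonstration.

```python
def monthly_population(annual_counts):
    """Linearly interpolate between consecutive annual counts.

    Assumption: each annual count is anchored at the first month of its
    year, so every pair of consecutive counts yields 12 monthly values.
    """
    monthly = []
    for start, end in zip(annual_counts, annual_counts[1:]):
        for month in range(12):
            monthly.append(start + (end - start) * month / 12.0)
    return monthly


def rates_per_100k(discharges, population):
    """Monthly discharge separations per 100,000 population."""
    return [100000.0 * d / p for d, p in zip(discharges, population)]


def first_difference(series):
    """First-order differencing: y[t] - y[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


# Invented example: population grows from 1,200 to 2,400 over one year.
demo_population = monthly_population([1200.0, 2400.0])
demo_rates = rates_per_100k([6.0] * 12, demo_population)
demo_differenced = first_difference([1.0, 4.0, 9.0, 16.0])
```

Differencing removes a linear trend in the mean, which is why a series with a steady upward drift can become stationary after one round of differencing.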
Discussion
The results of this study show that the $R^2_{\mathrm{Autoreg}}$ […] When the technique was applied to the two data sets, it corroborated the visual evidence that asthma is more conspicuously seasonal than atrial fibrillation. The seasonality of asthma has been conclusively demonstrated in several studies and is likely a key to understanding the etiology of exacerbations of asthma [15-17]. The strength and consistency of the effect are likely of relevance to health policy and planning. The seasonality of atrial fibrillation has been reported, but outcomes were reported as relative risks [18]. The analysis provided here indicates that the seasonality of admissions for atrial fibrillation is unlikely to be of policy or clinical significance, as its magnitude is quite small.

One important question that remains to be answered is how changes over time in the magnitude of the seasonal factors affect the […]

Conclusions
The proposed autoregression method is a statistical technique well suited to the study of seasonality in health data. Although monthly data were used for this analysis, the method can easily be applied to weekly or seasonal data. The approach allows researchers to quantify and compare the strength of the seasonality for different genders and age groups. The coefficient of determination is easy to calculate and interpret. Finally, it is well known to health care researchers and is frequently used as a measure of goodness of fit.

For the purposes of health services research and population health measurement, evidence of the statistical presence of stable seasonality is insufficient to determine the etiologic, clinical and policy relevance of findings. Measurement of the strength of the seasonal effect is also required in order to provide a robust sense of seasonality. We believe that this autoregression technique, in concert with statistical testing, graphical representation and measures of the absolute magnitude of the seasonal effect, is an important component of this robust approach.
Competing interests
None declared.

Authors' contributions
RM and RU initiated the idea for the study. MM and EC contributed critical intellectual input to the project. RM wrote the initial draft, and all authors contributed to the subsequent drafts. All authors have read and approved the final draft.

Acknowledgements
The authors would like to thank the editor and reviewers for their very helpful suggestions, which significantly improved the paper. We would also like to thank Shari Gruman for her expert assistance in formatting the manuscript. RU is supported by a New Investigator Award from the Canadian Institutes of Health Research and a Research Scholar Award from the Department of Family and Community Medicine, University of Toronto. This research was supported by an operating grant from the Canadian Institutes of Health Research.
[…], i = 1, 2, ..., 12 denote the monthly average. The monthly averages, […], are […] and […]. In practice, […]s can be used as seasonal factor estimates.
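Figure 1. Monthly rates of admission for atrial fibrillation per 100,000 population.
Figure 2. Monthly rates of asthma hospital discharges per 100,000 population.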