<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1478-7954-1-10</ui>
   <ji>1478-7954</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Autoregression as a means of assessing the strength of seasonality in a time series</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Moineddin</snm>
               <fnm>Rahim</fnm>
               <insr iid="I1"/>
               <insr iid="I3"/>
               <email>rahim.moineddin@utoronto.ca</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Upshur</snm>
               <mi>EG</mi>
               <fnm>Ross</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <insr iid="I3"/>
               <insr iid="I4"/>
               <email>rupshur@idirect.com</email>
            </au>
            <au id="A3">
               <snm>Crighton</snm>
               <fnm>Eric</fnm>
               <insr iid="I2"/>
               <email>eric.crighton@sw.on</email>
            </au>
            <au id="A4">
               <snm>Mamdani</snm>
               <fnm>Muhammad</fnm>
               <insr iid="I4"/>
               <insr iid="I5"/>
               <insr iid="I6"/>
               <email>muhammad.mamdani@ices.on.ca</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Family and Community Medicine, University of Toronto, 256 McCaul Street, 2nd Floor, Toronto, ON, Canada M5T 2W5</p>
            </ins>
            <ins id="I2">
               <p>Primary Care Research Unit, Sunnybrook and Women's College Health Sciences Centre, 2075 Bayview Avenue, #E-349, Toronto, ON, Canada M4N 3M5</p>
            </ins>
            <ins id="I3">
               <p>Department of Public Health Sciences, University of Toronto, McMurrich Building, 12 Queen's Park Crescent W., Toronto, ON Canada, M5S 1A8</p>
            </ins>
            <ins id="I4">
               <p>Institute for Clinical Evaluative Sciences, 2075 Bayview Avenue, Toronto, ON Canada M4N 3M5</p>
            </ins>
            <ins id="I5">
               <p>Health Policy Management and Evaluation, University of Toronto, McMurrich Building, 2nd Floor, 12 Queen's Park Crescent West, Toronto, ON, Canada M5S 1A8</p>
            </ins>
            <ins id="I6">
               <p>Faculty of Pharmacy, University of Toronto, 19 Russell Street, Toronto, ON, Canada M5S 2S2</p>
            </ins>
         </insg>
         <source>Population Health Metrics</source>
         <issn>1478-7954</issn>
         <pubdate>2003</pubdate>
         <volume>1</volume>
         <issue>1</issue>
         <fpage>10</fpage>
         <url>http://www.pophealthmetrics.com/content/1/1/10</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/1478-7954-1-10</pubid>
               <pubid idtype="pmpid">14675482</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>27</day>
               <month>8</month>
               <year>2003</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>15</day>
               <month>12</month>
               <year>2003</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>15</day>
               <month>12</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>Moineddin et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The study of the seasonal variation of disease is receiving increasing attention from health researchers. Available statistical tests for seasonality typically indicate the presence or absence of statistically significant seasonality but do not provide a meaningful measure of its strength.</p>
            </sec>
            <sec>
               <st>
                  <p>Methods</p>
               </st>
               <p>We propose the coefficient of determination of the autoregressive regression model fitted to the data (<graphic file="1478-7954-1-10-i1.gif"/>) as a measure for quantifying the strength of the seasonality. The performance of the proposed statistic is assessed through a simulation study and using two data sets known to demonstrate statistically significant seasonality: atrial fibrillation and asthma hospitalizations in Ontario, Canada.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>The simulation results showed the power of the <graphic file="1478-7954-1-10-i1.gif"/> in adequately quantifying the strength of the seasonality of the simulated observations for all models. In the atrial fibrillation and asthma datasets, while the statistical tests such as Bartlett's Kolmogorov-Smirnov (BKS) and Fisher's Kappa support statistical evidence of seasonality for both, the <graphic file="1478-7954-1-10-i1.gif"/> quantifies the strength of that seasonality. Corroborating the visual evidence that asthma is more conspicuously seasonal than atrial fibrillation, the calculated <graphic file="1478-7954-1-10-i1.gif"/> for atrial fibrillation indicates a weak to moderate seasonality (<graphic file="1478-7954-1-10-i1.gif"/> = 0.44, 0.28 and 0.45 for both genders, males and females respectively), whereas for asthma, it indicates a strong seasonality (<graphic file="1478-7954-1-10-i1.gif"/> = 0.82, 0.78 and 0.82 for both genders, male and female respectively).</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>For the purposes of health services research, evidence of the statistical presence of seasonality is insufficient to determine the etiologic, clinical and policy relevance of findings. Measurement of the strength of the seasonal effect, as can be determined using the <graphic file="1478-7954-1-10-i1.gif"/> technique, is also important in order to provide a robust sense of seasonality.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Seasonality is an important component of disease manifestation. The presence of predictable seasonality is a clue to the possible etiology of disease, be it from microbial, environmental or social factors. Understanding seasonality is also essential for setting rational policy, particularly with respect to the planning for seasonal demands for health services.</p>
         <p>For studying seasonality, several statistical methods are available ranging from simple graphical techniques to more advanced statistical methods. Additionally, autocorrelation functions can be examined to assess regularity of periodicity or seasonality. Several statistical tests have been introduced for studying the cyclical variation of time series data. For example, Edwards <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> developed a statistical test that locates weights corresponding to the number of observed cases for each month at 12 equally spaced points on a circle. The test is said to be significant if the calculated centre of the mass significantly deviates from the circle's centre. Jones et al <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> developed a test for determining whether incidence data for two or more groups have the same seasonal pattern. Further, Marrero <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> compared the performance of several tests for seasonality by simulation, which can be used as a guideline for selecting appropriate tests for a given data set based on the size of the data set and the shape of the sinusoidal curve. To apply any of these tests, however, observations must be aggregated into 12 monthly data points.</p>
         <p>Several alternative tests, which do not require aggregated data, have also been developed. These include Fisher's Kappa (FK), which tests whether the largest periodogram ordinate is statistically different from the mean of the periodogram ordinates; Bartlett's Kolmogorov-Smirnov (BKS) test, which statistically compares the normalized cumulative periodogram with the cumulative distribution function of a uniform (0, 1) random variable; and the X-11 procedure as used by the census bureaus in the United States and Canada <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. These tests utilize the frequency and time domains to detect seasonality. Each test indicates whether seasonality is statistically significant; however, none provides a sense of the magnitude of seasonality or of how much variance in the data is explained by seasonal occurrence. This is particularly important in health care, as the presence of statistically significant seasonality may not translate into either etiologic or policy relevance.</p>
         <p>In an effort to address the shortcomings in existing statistical methods, we propose applying autoregressive regression models to assess how accurately a new observation can be predicted from stable seasonal variation (that is, seasonal factors that are constant over time), and using this accuracy to quantify the strength of the seasonality within a set of serially correlated observations. In classical regression analysis the coefficient of determination, <it>R</it><sup>2</sup>, is a standard statistical tool for estimating the proportion of the total variation of the dependent variable that can be explained by the explanatory variables. A crucial assumption in standard regression is that observations are independent of one another. However, time series observations can be serially autocorrelated and this correlation must be taken into account.</p>
         <p>Autoregressive regression models are a natural generalization of standard regression models for analyzing correlated data. For monthly data, one can enter month into a regression model as a single categorical predictor, coded as dummy variables, and then, after correcting for the autocorrelation, calculate the coefficient of determination, <graphic file="1478-7954-1-10-i1.gif"/>. When the time series is stationary and the trend is eliminated, the statistical significance of the dummy variables (months) indicates seasonality. The relationship between the stable seasonal factors and the estimates of the regression equation parameters is as follows. Suppose there are <it>k </it>years of monthly observations (<it>n </it>= 12 <it>k</it>) from which the trend has been removed and which have been centred (mean removed). Let <graphic file="1478-7954-1-10-i2.gif"/>, <it>i </it>= 1,2,...,12 denote the monthly averages. The monthly averages, <graphic file="1478-7954-1-10-i2.gif"/><it>s</it>, can be interpreted as crude estimates of stable seasonal factors; therefore, the range of the parameter estimates is a good estimate of the magnitude of seasonal variation. For estimating <graphic file="1478-7954-1-10-i1.gif"/> one defines 11 dummy variables, <it>m</it><sub><it>i </it></sub>= 1 if month equals <it>i </it>and <it>m</it><sub><it>i </it></sub>= 0 otherwise, and then regresses the <it>y</it><sub><it>t</it></sub><it>s </it>on the <it>m</it><sub><it>i</it></sub>s. It is not difficult to show that the ordinary least squares estimates of the parameters of the regression equation <graphic file="1478-7954-1-10-i3.gif"/> are <graphic file="1478-7954-1-10-i4.gif"/> and <graphic file="1478-7954-1-10-i5.gif"/>. In practice <graphic file="1478-7954-1-10-i1.gif"/> and the parameters &#946;<sub><it>i</it></sub><it>s </it>are estimated simultaneously; therefore, the estimated parameters <graphic file="1478-7954-1-10-i6.gif"/><it>s </it>can be used as seasonal factor estimates.</p>
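         <p>The link between monthly averages and a regression on month can be illustrated with a minimal Python sketch. It assumes a detrended, centred monthly series that starts in month 1, and it deliberately ignores the correction for autocorrelated errors, so it returns the ordinary <it>R</it><sup>2</sup> rather than the autoregression-adjusted statistic proposed here. The function names and the two-year series are hypothetical and serve only as an illustration.</p>
         <p><![CDATA[
```python
from statistics import mean


def monthly_means(y):
    """Average the observations that fall in each calendar month.

    Assumes y[0] belongs to month 1 and len(y) is a multiple of 12.
    """
    return [mean(y[i::12]) for i in range(12)]


def ordinary_r_squared(y):
    """Share of variance explained by month alone (no AR correction)."""
    means = monthly_means(y)
    grand = mean(y)
    sst = sum((v - grand) ** 2 for v in y)
    sse = sum((v - means[t % 12]) ** 2 for t, v in enumerate(y))
    return 1.0 - sse / sst


def seasonal_magnitude(y):
    """Peak-to-trough difference of the monthly means."""
    means = monthly_means(y)
    return max(means) - min(means)


# Made-up two-year series: a fixed monthly pattern plus a small disturbance
# whose sign flips between the two years for every month.
pattern = [3, 2, 1, 0, -1, -2, -3, -2, -1, 0, 1, 2]
series = [
    pattern[t % 12] + (0.5 if (t + t // 12) % 2 == 0 else -0.5)
    for t in range(24)
]
print(round(ordinary_r_squared(series), 3), seasonal_magnitude(series))
```
]]></p>
         <p>The two helper functions return different quantities: the first measures how much of the variation month accounts for, the second the size of the swing between the highest and lowest monthly averages.</p>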
         <p>The coefficient of determination, which lies between 0 and 1, can be used as a measure for the strength of the stable seasonality because it measures how well the next value can be predicted using month as the only predictor. When <graphic file="1478-7954-1-10-i1.gif"/> is zero, there is no seasonality. When <graphic file="1478-7954-1-10-i1.gif"/> is equal to 1, observations can be perfectly predicted for each month, which means that the variable month explains 100% of the variation in the data. In other words, there is a perfect seasonality. In practice we may characterize the strength of the seasonality based on different ranges of values for <graphic file="1478-7954-1-10-i1.gif"/>. Similar to other measures of correlation or goodness of fit, we can interpret <graphic file="1478-7954-1-10-i1.gif"/> as follows: values ranging from 0 to less than 0.4 may be characterized as non-existent to weak seasonality, 0.4 to less than 0.7 represent moderate to strong seasonality, and values ranging from 0.7 to 1 represent strong to perfect seasonality. The coefficient of determination, <graphic file="1478-7954-1-10-i1.gif"/>, does not quantify the magnitude of the seasonal effect (the difference between peak and trough, which can be estimated by the difference between maximum and minimum parameter estimates of the regression equation) but rather it quantifies its strength (i.e., how well new observations can be predicted when month is the only predictor).</p>
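         <p>The interpretive bands described above can be written as a small lookup. The following Python sketch is illustrative only: the function name is hypothetical, and the cut-points of 0.4 and 0.7 and the labels are taken directly from the preceding paragraph.</p>
         <p><![CDATA[
```python
def seasonality_strength(r2):
    """Map a coefficient of determination onto the bands given in the text."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("the coefficient of determination lies between 0 and 1")
    if r2 < 0.4:
        return "non-existent to weak"
    if r2 < 0.7:
        return "moderate to strong"
    return "strong to perfect"


for value in (0.28, 0.44, 0.82):
    print(value, seasonality_strength(value))
```
]]></p>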
         <p>The purpose of this paper is to evaluate the utility of the R-squared from an autoregression model, as a measure of explained variance, for assessing stable seasonality. To this end, we examined the performance of the autoregression R-squared through a simulation study and using two data sets known to demonstrate statistically significant weak and strong seasonality: monthly hospitalizations for atrial fibrillation and asthma.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Statistical methods</p>
            </st>
            <p>The autoregressive linear regression model for monthly observations is defined as:</p>
            <p><it>Y</it><sub><it>t </it></sub>= <it>X</it><sub><it>t </it></sub>&#946; + &#949;<sub><it>t</it></sub></p>
            <p>&#949;<sub><it>t </it></sub>= -&#966;<sub>1</sub>&#949;<sub><it>t</it>-1 </sub>- &#966;<sub>2</sub>&#949;<sub><it>t</it>-2 </sub>- &#8230; - &#966;<sub><it>p </it></sub>&#949;<sub><it>t-p </it></sub>+ <it>e</it><sub><it>t</it></sub></p>
            <p><it>e</it><sub><it>t </it></sub>~ <it>N </it>(0, <graphic file="1478-7954-1-10-i7.gif"/>)</p>
            <p>where <it>Y</it><sub><it>t </it></sub>is the observed time series, <it>X</it><sub><it>t </it></sub>is the design matrix (an <it>n </it>&#215; 12 matrix of 0s and 1s, with one row for each of the <it>n </it>= 12<it>k </it>monthly observations), &#946; = (&#956;, &#946;<sub>1</sub>, &#8230;, &#946;<sub>11</sub>)' is the vector of parameters, and &#949;<sub><it>t </it></sub>is the error term, which follows an autoregressive model of order <it>p</it>. We also assume that <it>e</it><sub><it>t </it></sub>is normally and independently distributed with mean zero and variance <graphic file="1478-7954-1-10-i7.gif"/><abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. Standard statistical packages (e.g. SAS) can be used for estimating the parameters and the coefficient of determination, <graphic file="1478-7954-1-10-i1.gif"/>, after adjusting for correlated error terms.</p>
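            <p>To make the structure of this model concrete, the following Python sketch generates data from it. It is not the estimation routine and does not reproduce any SAS procedure. The sign convention for the autoregressive coefficients follows the equations above; the choice of month 12 as the reference level for the dummy variables, the function names, and the default arguments are assumptions made for illustration.</p>
            <p><![CDATA[
```python
import random


def design_row(month):
    """Intercept plus 11 month dummies (month 12 as reference is an assumption)."""
    return [1] + [1 if month == i else 0 for i in range(1, 12)]


def ar_errors(phi, n, sigma=1.0, seed=0):
    """Generate n errors with eps_t = -phi_1*eps_{t-1} - ... - phi_p*eps_{t-p} + e_t."""
    rng = random.Random(seed)
    eps = []
    for t in range(n):
        # Weighted sum of the available past errors (fewer than p at the start).
        past = sum(
            phi[j] * eps[t - 1 - j] for j in range(len(phi)) if t - 1 - j >= 0
        )
        eps.append(-past + rng.gauss(0.0, sigma))
    return eps


def simulate(beta, phi, years, sigma=1.0, seed=0):
    """Simulate Y_t = X_t beta + eps_t for the given number of years of monthly data."""
    n = 12 * years
    eps = ar_errors(phi, n, sigma=sigma, seed=seed)
    rows = [design_row(t % 12 + 1) for t in range(n)]
    return [sum(x * b for x, b in zip(row, beta)) + e for row, e in zip(rows, eps)]


# Hypothetical parameters: overall level 10 and eleven month effects.
beta = [10.0] + [float(i) for i in range(1, 12)]
y = simulate(beta, phi=[0.5], years=10)
print(len(y))
```
]]></p>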
         </sec>
         <sec>
            <st>
               <p>Simulation</p>
            </st>
            <p>In order to assess the performance of the proposed <graphic file="1478-7954-1-10-i1.gif"/> for measuring the strength of the seasonality of a time series, a simulation study was conducted. Following this, the proposed technique was applied to two real data sets. SAS software, version 8.2 (SAS Institute Inc., Cary, North Carolina), was used for simulating monthly observations and calculating <graphic file="1478-7954-1-10-i1.gif"/>.</p>
            <p>We simulated 1000 replications of monthly observations over 10 years from the following model:</p>
            <p>
               <graphic file="1478-7954-1-10-i8.gif"/>
            </p>
            <p>and calculated the <graphic file="1478-7954-1-10-i1.gif"/> for each replication. By changing the coefficients &#945;, &#966;<sub>1</sub>, &#966;<sub>2</sub>, and &#966;<sub>12</sub>, this model generates observations with a cyclical trend component of period 12, observations from a seasonal ARMA model with a seasonal period of 12, or a combination of both. In this way we can generate anything from pure white noise to highly correlated data with seasonal patterns. For example, a model with &#966;<sub>1 </sub>= &#966;<sub>2 </sub>= &#966;<sub>12 </sub>= 0 generates a series of observations that is a mixture of white noise and a cyclical trend. The size of &#945; controls the contribution of the seasonal trend in the generated observations. Parameters &#966;<sub>1 </sub>and &#966;<sub>2 </sub>control the correlation structure of the simulated data. When &#966;<sub>1 </sub>= 0.9, for example, highly correlated observations are generated. When parameter &#966;<sub>12 </sub>is nonzero, the model generates observations with a stochastic seasonal component. Similarly, nonzero &#966;<sub>12 </sub>combined with nonzero &#966;<sub>1 </sub>or &#966;<sub>2 </sub>generates a series with a stochastic seasonal component that is correlated with the non-seasonal components. When all parameters are nonzero, the generated observations have a complex structure that depends on a cyclical trend, a stochastic seasonal component, and its correlation structure. In our simulation, we set the order of the error terms in the autoregression model to 2 for all simulated observations. The mean and standard deviation of the 1000 calculated <graphic file="1478-7954-1-10-i1.gif"/> values are given in Table <tblr tid="T1">1</tblr>.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Simulation results</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c cspan="3" ca="center">
                        <p>
                           <b>Parameters</b>
                        </p>
                     </c>
                     <c cspan="4" ca="center">
                        <p>
                           <graphic file="1478-7954-1-10-i1.gif"/>
                           <b>(Std)</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>&#966;<sub>1</sub></p>
                     </c>
                     <c ca="left">
                        <p>&#966;<sub>2</sub></p>
                     </c>
                     <c ca="left">
                        <p>&#966;<sub>12</sub></p>
                     </c>
                     <c ca="center">
                        <p>&#945; = 0</p>
                     </c>
                     <c ca="center">
                        <p>&#945; = 1</p>
                     </c>
                     <c ca="center">
                        <p>&#945; = 2</p>
                     </c>
                     <c ca="center">
                        <p>&#945; = 4</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.099 (0.041)</p>
                     </c>
                     <c ca="center">
                        <p>0.400 (0.070)</p>
                     </c>
                     <c ca="center">
                        <p>0.703 (0.051)</p>
                     </c>
                     <c ca="center">
                        <p>0.902 (0.020)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0.5</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.098 (0.040)</p>
                     </c>
                     <c ca="center">
                        <p>0.409 (0.070)</p>
                     </c>
                     <c ca="center">
                        <p>0.712 (0.049)</p>
                     </c>
                     <c ca="center">
                        <p>0.906 (0.019)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>-0.9</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.103 (0.042)</p>
                     </c>
                     <c ca="center">
                        <p>0.396 (0.069)</p>
                     </c>
                     <c ca="center">
                        <p>0.698 (0.050)</p>
                     </c>
                     <c ca="center">
                        <p>0.899 (0.020)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0.5</p>
                     </c>
                     <c ca="left">
                        <p>-0.8</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0.098 (0.039)</p>
                     </c>
                     <c ca="center">
                        <p>0.393 (0.065)</p>
                     </c>
                     <c ca="center">
                        <p>0.699 (0.044)</p>
                     </c>
                     <c ca="center">
                        <p>0.899 (0.017)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>0.5</p>
                     </c>
                     <c ca="center">
                        <p>0.256 (0.088)</p>
                     </c>
                     <c ca="center">
                        <p>0.703 (0.064)</p>
                     </c>
                     <c ca="center">
                        <p>0.896 (0.026)</p>
                     </c>
                     <c ca="center">
                        <p>0.971 (0.007)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>0.7</p>
                     </c>
                     <c ca="center">
                        <p>0.410 (0.114)</p>
                     </c>
                     <c ca="center">
                        <p>0.847 (0.043)</p>
                     </c>
                     <c ca="center">
                        <p>0.953 (0.014)</p>
                     </c>
                     <c ca="center">
                        <p>0.988 (0.004)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>0.9</p>
                     </c>
                     <c ca="center">
                        <p>0.716 (0.101)</p>
                     </c>
                     <c ca="center">
                        <p>0.974 (0.009)</p>
                     </c>
                     <c ca="center">
                        <p>0.993 (0.002)</p>
                     </c>
                     <c ca="center">
                        <p>0.998 (0.001)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0.3</p>
                     </c>
                     <c ca="left">
                        <p>-0.2</p>
                     </c>
                     <c ca="left">
                        <p>0.3</p>
                     </c>
                     <c ca="center">
                        <p>0.171 (0.066)</p>
                     </c>
                     <c ca="center">
                        <p>0.602 (0.072)</p>
                     </c>
                     <c ca="center">
                        <p>0.847 (0.034)</p>
                     </c>
                     <c ca="center">
                        <p>0.957 (0.011)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0.3</p>
                     </c>
                     <c ca="left">
                        <p>-0.2</p>
                     </c>
                     <c ca="left">
                        <p>0.5</p>
                     </c>
                     <c ca="center">
                        <p>0.260 (0.095)</p>
                     </c>
                     <c ca="center">
                        <p>0.764 (0.060)</p>
                     </c>
                     <c ca="center">
                        <p>0.924 (0.021)</p>
                     </c>
                     <c ca="center">
                        <p>0.980 (0.006)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0.3</p>
                     </c>
                     <c ca="left">
                        <p>-0.2</p>
                     </c>
                     <c ca="left">
                        <p>0.7</p>
                     </c>
                     <c ca="center">
                        <p>0.470 (0.149)</p>
                     </c>
                     <c ca="center">
                        <p>0.934 (0.024)</p>
                     </c>
                     <c ca="center">
                        <p>0.982 (0.006)</p>
                     </c>
                     <c ca="center">
                        <p>0.995 (0.002)</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>When &#945; and &#966;<sub>12 </sub>are zero, <graphic file="1478-7954-1-10-i1.gif"/> indicates no significant seasonality even for highly correlated data (e.g. &#966;<sub>1 </sub>= -0.9). The calculated <graphic file="1478-7954-1-10-i1.gif"/> increases as &#945; increases regardless of the sign and magnitude of the other parameters. When &#945; is zero (there is no cyclical trend), the <graphic file="1478-7954-1-10-i1.gif"/> increases in all cases as &#966;<sub>12 </sub>increases. This demonstrates the ability and usefulness of <graphic file="1478-7954-1-10-i1.gif"/> in quantifying the strength of the seasonality in time series data.</p>
            <p>To investigate the relationship between the <graphic file="1478-7954-1-10-i1.gif"/> and <it>p</it>, the order of the autoregression model, we repeated the simulation experiments with <it>p </it>= 1, 2, 4, 8, and 13. An additional simulation experiment was conducted in which the stepwise autoregression method was used to select the order of the autoregressive error model, with the maximum possible autoregressive order set equal to 13. The stepwise autoregression method involves fitting a high order model and then sequentially removing parameters until all remaining autoregressive parameters are statistically significant <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. For the fixed orders, <it>p</it>, simulation results (not shown here) showed that <graphic file="1478-7954-1-10-i1.gif"/> is robust to over-fitting the data and that <it>p </it>does not significantly affect the estimated <graphic file="1478-7954-1-10-i1.gif"/>. Results from the stepwise experiments showed that, relative to the fixed orders, <it>p</it>, there were no significant differences in the <graphic file="1478-7954-1-10-i1.gif"/>. Results were, however, slightly more conservative using the stepwise method.</p>
            <p>In order to apply the autoregression procedure to actual data it is important to eliminate the nonstationarity in the mean and variance of the observations. For the data used in the study, the Dickey-Fuller unit root test <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> was used to test the stationarity of the series and to determine the order of differencing required for the nonstationary series. The SAS procedure AUTOREG was used for calculating <graphic file="1478-7954-1-10-i1.gif"/>. In all cases <it>p</it>, the order of the autoregressive model for the error terms, was selected using the stepwise autoregression method. The maximum possible order was set equal to 13.</p>
         </sec>
         <sec>
            <st>
               <p>Data Sources</p>
            </st>
            <p>The data were derived from two retrospective, population-based, cross-sectional time series studies assessing temporal patterns in all discharge separations for asthma (from April 1, 1988 to March 31, 2000) and atrial fibrillation (from April 1, 1988 to March 31, 2001) for the population of Ontario. Approximately 14 million residents of Ontario, Canada eligible for universal health care coverage during this time were included for analysis. The database used was the Canadian Institute for Health Information (CIHI) Discharge Abstract Database which records discharges from all Ontario acute care hospitals. All records with a most responsible discharge diagnosis of atrial fibrillation (ICD-9 code: 427.3) and asthma (ICD-9 code: 493) were selected. The numerator consisted of the total number of discharge separations for each month. Denominators were constructed from annual census data provided by Statistics Canada for each age group for residents of Ontario. Monthly population estimates were created through linear interpolation.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Atrial fibrillation</p>
            </st>
            <p>Overall, there were 90,199 (45,477 female and 44,472 male) discharge separations for all ages. Figure <figr fid="F1">1</figr> shows the monthly rates of admission per 100,000 population. There is a conspicuous upward trend in admissions over the first four years (Figure <figr fid="F1">1</figr>). Visual inspection does not support conspicuous seasonality.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Atrial fibrillation hospitalizations per 100,000 population</p>
               </caption>
               <text>
                  <p>Atrial fibrillation hospitalizations per 100,000 population</p>
               </text>
               <graphic file="1478-7954-1-10-1"/>
            </fig>
            <p>However, after applying first order differencing, the Dickey-Fuller test confirmed the stationarity of the differenced series. After differencing, the data do not show strong evidence of statistically significant seasonality. The Bartlett's Kolmogorov-Smirnov (BKS) statistics for both genders combined, females, and males are 0.327, 0.308, and 0.315, respectively, with p-values all less than 0.0001. The Fisher's Kappa (FK) test statistics are 7.17 (0.01 &lt; p-value &lt; 0.05), 5.78 (not significant), and 7.98 (0.01 &lt; p-value &lt; 0.05). The calculated <graphic file="1478-7954-1-10-i1.gif"/> values for both genders combined, females, and males are 0.44, 0.28, and 0.45, respectively. The differences between the maximum and minimum monthly parameter estimates are 2.79, 1.87, and 3.85 for both genders combined, females, and males, respectively, which can be interpreted directly as the difference in hospitalizations per 100,000 population between peak and trough months in one year. The small values for the amplitude of the seasonal factors and the low values of the <graphic file="1478-7954-1-10-i1.gif"/> indicate weak to non-existent seasonality.</p>
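            <p>The peak-to-trough difference reported above can be approximated from raw monthly rates, as in the following Python sketch. It uses simple month means in place of the month parameter estimates from the autoregressive error model, so it is an approximation under that assumption; the function name and toy data are hypothetical.</p>

```python
from statistics import mean


def peak_to_trough(values, months):
    """Return (peak_month, trough_month, difference) based on month means."""
    by_month = {}
    for value, month in zip(values, months):
        by_month.setdefault(month, []).append(value)
    month_means = {m: mean(v) for m, v in by_month.items()}
    peak = max(month_means, key=month_means.get)
    trough = min(month_means, key=month_means.get)
    return peak, trough, month_means[peak] - month_means[trough]


# Hypothetical rates per 100,000: one 12-month pattern repeated for two years
pattern = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0, 1.0, 2.0, 6.0, 8.0, 7.0, 6.0]
rates = pattern * 2
months = [t % 12 + 1 for t in range(len(rates))]

print(peak_to_trough(rates, months))   # (10, 6, 8.0)
```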
         </sec>
         <sec>
            <st>
               <p>Asthma</p>
            </st>
            <p>In total, there were 206,561 (104,283 female and 102,278 male) asthma hospital discharges for all ages. Figure <figr fid="F2">2</figr> shows the monthly rates of asthma per 100,000 population. Visual inspection of Figure <figr fid="F2">2</figr> shows a clear seasonal pattern, with an autumn peak and a summer trough, occurring every year over the 12 year period. The Dickey-Fuller unit root test confirms that the series is stationary. The seasonality tests applied to the rates demonstrate statistically significant seasonality. The Fisher's Kappa (FK) test statistics for both genders combined, females, and males are 21.22, 22.05, and 20.52, respectively, with p-values all less than 0.01. The Bartlett Kolmogorov-Smirnov (BKS) test statistics are 0.512, 0.518, and 0.489, with p-values all less than 0.0001. The calculated <graphic file="1478-7954-1-10-i1.gif"/> values are 0.82, 0.78, and 0.82 for both genders combined, females, and males, respectively. The differences between the maximum and minimum monthly parameter estimates are 9.8, 9.1, and 11.6 for both genders combined, females, and males, respectively, which translate directly into the difference in hospitalizations per 100,000 population between the peak and trough months in one year. The large values for the amplitude of the seasonal factors and the high <graphic file="1478-7954-1-10-i1.gif"/> values provide clear evidence of strong seasonality.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Asthma hospitalizations per 100,000 population</p>
               </caption>
               <text>
                  <p>Asthma hospitalizations per 100,000 population</p>
               </text>
               <graphic file="1478-7954-1-10-2"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>The results of this study show that the <graphic file="1478-7954-1-10-i1.gif"/> can be useful in distinguishing weak from strong stable seasonal effects in both simulated and actual data sets. While statistical tests such as BKS and Fisher's Kappa provide statistical evidence of seasonality in the data, the <graphic file="1478-7954-1-10-i1.gif"/> allows quantification of the strength of that stable seasonality, as demonstrated by the simulation results. Regardless of the values of the parameters &#966;<sub>1 </sub>and &#966;<sub>2</sub>, when the parameters &#945; and &#966;<sub>12 </sub>were zero, the <graphic file="1478-7954-1-10-i1.gif"/> was small, and when one or both of those parameters increased, <graphic file="1478-7954-1-10-i1.gif"/> increased proportionally. This is important because any proposed statistic for measuring the strength of seasonality must be invariant to the correlation structure. The simulation results showed the power of the <graphic file="1478-7954-1-10-i1.gif"/> in adequately quantifying the strength of the seasonality of the simulated observations for all models. The magnitude of <graphic file="1478-7954-1-10-i1.gif"/> shows how well the next value can be predicted by using month as the only predictor. In other words, it shows the contribution of seasonality to the total variation of the data.</p>
         <p>When the technique was applied to the two data sets, it corroborated the visual evidence that asthma is more conspicuously seasonal than atrial fibrillation. The seasonality of asthma has been conclusively demonstrated in several studies and is likely a key to understanding the etiology of exacerbations of asthma <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. The strength and consistency of the effect is likely of relevance to health policy and planning. The seasonality of atrial fibrillation has been reported, but outcomes were reported as relative risks <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. The analysis provided here indicates that the seasonality of admissions for atrial fibrillation is not likely of policy or clinical significance as the magnitude is quite small.</p>
         <p>One important question that remains to be answered is how changes in the magnitude of the seasonal factors over time affect the <graphic file="1478-7954-1-10-i1.gif"/>. For non-stable seasonal variation, a suitable transformation such as the log may be required to transform a non-stable seasonal variation into a stable one. The value of <graphic file="1478-7954-1-10-i1.gif"/> may change if the sampling period changes (e.g. monthly data converted to weekly data). By including year and month as predictors we can adjust for moving seasonality; however, further research is required.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>The proposed autoregression method is a statistical technique well suited to the study of seasonality in health data. Although monthly data were used for this analysis, the method can easily be applied to weekly or seasonal data. The approach allows researchers to quantify and compare the strength of seasonality for different genders and age groups. The coefficient of determination is easy to calculate and interpret. Finally, it is well known to health care researchers and is frequently used as a measure of goodness of fit. For the purposes of health services research and population health measurement, evidence of the statistical presence of stable seasonality is insufficient to determine the etiologic, clinical and policy relevance of findings. Measurement of the strength of the seasonal effect is also required in order to provide a robust sense of seasonality. We believe that this autoregression technique, in concert with statistical testing, graphical representation and measures of the absolute magnitude of the seasonal effect, is an important component of this robust approach.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>None declared.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>RM and RU initiated the idea for the study. MM and EC contributed critical intellectual input to the project. RM wrote the initial draft, and all authors contributed to the subsequent drafts. All authors have read and approved the final draft.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The authors would like to thank the editor and reviewers for their very helpful suggestions that significantly improved the paper. Also we would like to thank Shari Gruman for her expert assistance in formatting the manuscript. RU is supported by a New Investigator Award from the Canadian Institutes of Health Research and a Research Scholar Award from the Department of Family and Community Medicine, University of Toronto. This research was supported by an operating grant from the Canadian Institutes of Health Research.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>The recognition and estimation of cyclic trends</p>
            </title>
            <aug>
               <au>
                  <snm>Edwards</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Ann Hum Genet</source>
            <pubdate>1961</pubdate>
            <volume>25</volume>
            <fpage>83</fpage>
            <lpage>86</lpage>
            <xrefbib>
               <pubid idtype="pmpid">13725808</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Seasonality comparisons among groups using incidence data</p>
            </title>
            <aug>
               <au>
                  <snm>Jones</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Ford</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Hamman</snm>
                  <fnm>RF</fnm>
               </au>
            </aug>
            <source>Biometrics</source>
            <pubdate>1988</pubdate>
            <volume>44</volume>
            <fpage>1131</fpage>
            <lpage>1144</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3233250</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>The performance of several statistical tests for seasonality in monthly data</p>
            </title>
            <aug>
               <au>
                  <snm>Marrero</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>J Statist Comput Simulation</source>
            <pubdate>1983</pubdate>
            <volume>17</volume>
            <fpage>275</fpage>
            <lpage>296</lpage>
         </bibl>
         <bibl id="B4">
            <aug>
               <au>
                  <snm>Fuller</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Introduction to Statistical Time Series</source>
            <publisher>New York: John Wiley</publisher>
            <pubdate>1976</pubdate>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Testing for seasonality</p>
            </title>
            <aug>
               <au>
                  <snm>Franses</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Econ Lett</source>
            <pubdate>1992</pubdate>
            <volume>38</volume>
            <fpage>259</fpage>
            <lpage>262</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/0165-1765(92)90067-9</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>L'analyse de la variation saisonniere quand l'amplitude et la taille sont faibles</p>
            </title>
            <aug>
               <au>
                  <snm>Marrero</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Canad J Statist</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <fpage>857</fpage>
            <lpage>882</lpage>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Statistical testing for seasonality in data with multiple peaks and troughs</p>
            </title>
            <aug>
               <au>
                  <snm>Marrero</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Biom J</source>
            <pubdate>1984</pubdate>
            <volume>26</volume>
            <fpage>591</fpage>
            <lpage>608</lpage>
         </bibl>
         <bibl id="B8">
            <title>
               <p>The power of a nonparametric test for seasonality</p>
            </title>
            <aug>
               <au>
                  <snm>Marrero</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Biom J</source>
            <pubdate>1988</pubdate>
            <volume>30</volume>
            <fpage>495</fpage>
            <lpage>502</lpage>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Identification of aperiodic seasonality in non-Gaussian time series</p>
            </title>
            <aug>
               <au>
                  <snm>Normolle</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Biometrics</source>
            <pubdate>1994</pubdate>
            <volume>50</volume>
            <fpage>798</fpage>
            <lpage>812</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7981399</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Statistical analysis of the seasonal variation in demographic data</p>
            </title>
            <aug>
               <au>
                  <snm>Fellman</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Eriksson</snm>
                  <fnm>AW</fnm>
               </au>
            </aug>
            <source>Hum Biol</source>
            <pubdate>2000</pubdate>
            <volume>72</volume>
            <fpage>851</fpage>
            <lpage>876</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11126729</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Testing in unobserved components models</p>
            </title>
            <aug>
               <au>
                  <snm>Harvey</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>J Forecasting</source>
            <pubdate>2001</pubdate>
            <volume>20</volume>
            <fpage>1</fpage>
            <lpage>19</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1002/1099-131X(200101)20:1&lt;1::AID-FOR764>3.0.CO;2-3</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <aug>
               <au>
                  <snm>Hamilton</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Time Series Analysis</source>
            <publisher>Princeton, NJ: Princeton University Press</publisher>
            <pubdate>1994</pubdate>
            <volume>Chapter 4</volume>
         </bibl>
         <bibl id="B13">
            <source>SAS/ETS User's Guide</source>
            <publisher>Cary, NC: SAS Institute Inc.</publisher>
            <pubdate>1999</pubdate>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Testing for unit roots in seasonal time series</p>
            </title>
            <aug>
               <au>
                  <snm>Dickey</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hasza</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Fuller</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>J Am Stat Assoc</source>
            <pubdate>1984</pubdate>
            <volume>79</volume>
            <fpage>355</fpage>
            <lpage>367</lpage>
         </bibl>
         <bibl id="B15">
            <title>
               <p>A population based time series analysis of asthma hospitalisations in Ontario, Canada: 1988 to 2000</p>
            </title>
            <aug>
               <au>
                  <snm>Crighton</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Mamdani</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Upshur</snm>
                  <fnm>RE</fnm>
               </au>
            </aug>
            <source>BMC Health Serv Res</source>
            <pubdate>2001</pubdate>
            <volume>1</volume>
            <fpage>7</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">57008</pubid>
                  <pubid idtype="pmpid" link="fulltext">11580873</pubid>
                  <pubid idtype="doi">10.1186/1472-6963-1-7</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Seasonal variation in asthma hospitalizations and death rates in New Zealand</p>
            </title>
            <aug>
               <au>
                  <snm>Kimbell-Dunn</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pearce</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Beasley</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Respirology</source>
            <pubdate>2000</pubdate>
            <volume>5</volume>
            <fpage>241</fpage>
            <lpage>246</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1440-1843.2000.00255.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">11022986</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Seasonal variation in childhood asthma hospitalisations in Finland, 1972&#8211;1992</p>
            </title>
            <aug>
               <au>
                  <snm>Harju</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Keistinen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tuuponen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kivela</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Eur J Pediatr</source>
            <pubdate>1997</pubdate>
            <volume>156</volume>
            <fpage>436</fpage>
            <lpage>439</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s004310050632</pubid>
                  <pubid idtype="pmpid" link="fulltext">9208236</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Seasonal variation in hospital discharge diagnosis of atrial fibrillation: a population-based study</p>
            </title>
            <aug>
               <au>
                  <snm>Frost</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Johnsen</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Pedersen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Husted</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Engholm</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sorensen</snm>
                  <fnm>HT</fnm>
               </au>
               <au>
                  <snm>Rothman</snm>
                  <fnm>KJ</fnm>
               </au>
            </aug>
            <source>Epidemiology</source>
            <pubdate>2002</pubdate>
            <volume>13</volume>
            <fpage>211</fpage>
            <lpage>215</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1097/00001648-200203000-00017</pubid>
                  <pubid idtype="pmpid" link="fulltext">11880763</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>

