# Circular Statistics

The mean of eleven o’clock and one o’clock is not six o’clock, so why does your calculator keep telling you that it is?  Quantities that are cyclical, or circular, in nature need to be approached differently from quantities on a noncyclical scale.  These cyclical quantities are commonly encountered as directions (e.g., degrees or radians) and times (e.g., months and time of day).  Some quick trigonometry can reduce such quantities to Cartesian vector components that allow statistics such as the mean to be calculated.

For example, consider some unit vectors (i.e., length 1 unit) in three different directions: 60, 120, and 230 degrees.

The first step is to take the sine and cosine of each direction and average these.  If this is shown graphically, it becomes obvious why this works.  With directions measured clockwise from straight up, as on a compass, the sine and cosine of each direction are the ‘X’ and ‘Y’ components of the corresponding unit vector.  Taking the mean of each of these yields a mean X and mean Y, or the set of coordinates corresponding to the mean of the vectors.  In this case, the mean sine or X is 0.322, and the mean cosine or Y is -0.214.
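As a minimal sketch of this step, the Python below averages the sines and cosines of the three example directions.  The variable names are my own choices for illustration, and the directions are assumed to be in degrees.

```python
import math

directions_deg = [60, 120, 230]  # the three example directions from the text

# Sine gives the X component and cosine the Y component of each unit vector
xs = [math.sin(math.radians(d)) for d in directions_deg]
ys = [math.cos(math.radians(d)) for d in directions_deg]

mean_x = sum(xs) / len(xs)  # mean sine
mean_y = sum(ys) / len(ys)  # mean cosine

print(round(mean_x, 3), round(mean_y, 3))  # 0.322 -0.214
```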

The length of this mean vector can be calculated by the Pythagorean theorem, as sqrt(X^2 + Y^2), or 0.387.  Determining the direction is a two-step process.  First, we take the arctangent of (mean sine/mean cosine), ignoring the sign of the ratio so that the result lies between 0 and 90 degrees; here that is about 56.4 degrees.  Second, we work out which quadrant of the space the mean vector lies in by examining the signs of the mean sine and mean cosine.  If the sine is positive, the mean vector lies to the right of the origin, and to the left if it is negative.  If the cosine is positive, the mean vector lies above the origin, and below it if it is negative.  Therefore, these rules are applied to determine the final angle:

Sin + and cos +: angle is as found by arctangent

Sin + and cos -: angle is 180 – arctangent

Sin – and cos –: angle is 180 + arctangent

Sin – and cos +: angle is 360 – arctangent
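The four rules can be sketched in Python as follows.  The function name `mean_direction_deg` is hypothetical, and the sketch assumes the arctangent is taken on absolute values so that it always falls between 0 and 90 degrees.  It uses `math.atan2` on the absolute values rather than a plain arctangent of the ratio, purely so that a mean cosine of exactly zero does not cause a division by zero.  If both means are zero, the direction is undefined and the function simply returns 0.

```python
import math

def mean_direction_deg(mean_sin, mean_cos):
    """Apply the four quadrant rules to a mean sine and mean cosine."""
    # Reference angle between 0 and 90 degrees, from the absolute values
    ref = math.degrees(math.atan2(abs(mean_sin), abs(mean_cos)))
    if mean_sin >= 0 and mean_cos >= 0:
        return ref            # sin +, cos +
    if mean_sin >= 0 and mean_cos < 0:
        return 180 - ref      # sin +, cos -
    if mean_sin < 0 and mean_cos < 0:
        return 180 + ref      # sin -, cos -
    return 360 - ref          # sin -, cos +

# Mean sine and mean cosine from the worked example
length = math.hypot(0.322, -0.214)
direction = mean_direction_deg(0.322, -0.214)
print(round(length, 3))  # 0.387
```

For the worked example, the sine is positive and the cosine is negative, so the second rule applies and the mean direction comes out at roughly 124 degrees.  The same answer should come from `math.degrees(math.atan2(mean_sin, mean_cos)) % 360`, which folds the quadrant logic into a single call.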

The same procedure can be used for vectors that have different lengths or magnitudes.  Using this procedure with other types of data such as months or times of the day simply requires that they be coded appropriately as vectors.  For example, snowfall on January 15th could be coded such that the direction was (15/365)*360 degrees, or about 14.8 degrees, and the length or magnitude was the snowfall in mm.
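To make the coding concrete, here is a small Python sketch that turns day-of-year snowfall records into vectors and averages them.  The observations are invented purely for illustration, the helper `day_to_degrees` is a hypothetical name, and a 365-day year is assumed.  `math.atan2` with a modulo of 360 stands in for the quadrant rules.

```python
import math

def day_to_degrees(day_of_year, days_in_year=365):
    """Code a day of the year as a direction in degrees."""
    return day_of_year / days_in_year * 360

# Hypothetical observations: (day of year, snowfall in mm)
observations = [(15, 120.0), (45, 80.0), (350, 100.0)]

xs, ys = [], []
for day, snowfall_mm in observations:
    theta = math.radians(day_to_degrees(day))
    xs.append(snowfall_mm * math.sin(theta))  # X component, scaled by magnitude
    ys.append(snowfall_mm * math.cos(theta))  # Y component, scaled by magnitude

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

mean_length = math.hypot(mean_x, mean_y)
mean_angle = math.degrees(math.atan2(mean_x, mean_y)) % 360
mean_day = mean_angle / 360 * 365  # convert the direction back to a day of year

print(round(mean_day), round(mean_length, 1))
```

Note that the late-December record (day 350) pulls the mean back toward the start of the year instead of dragging it toward midsummer, which is what a plain arithmetic mean of the day numbers would do.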

So far, this has been about radial mathematics, not radial statistics.  What if you wanted to statistically test a null hypothesis that a given pattern was random?  One way to approach such a test is through a permutation test.  A null distribution is first created by permuting the data and calculating the result.  This is then repeated many times, commonly 9,999 times.  A histogram of the results of such repeated permutations is then the null distribution.  For example, you could permute the data 9,999 times, each time randomly matching up magnitudes and directions, and build the null distribution from the resulting mean vector lengths.

The length of the original mean vector is then compared to the null distribution, and so becomes the 10,000th value.  The proportion of values, including your original value, that are greater than or equal to your original value is your p-value, or the probability of seeing data as skewed as yours in the absence of a real trend.  In other words, it is the risk you run of rejecting your null hypothesis and saying something is going on when in fact nothing is going on.

Under the null of randomness, the magnitude of the mean vector will be close to zero, because the values are distributed in all directions.  In our example, note that we have unit vectors in three of the four quadrants and a fairly even spread of data, so the mean vector (length 0.387) is much shorter than the unit vectors it was built from; its magnitude is closer to zero than to one.
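A permutation test along these lines might be sketched in Python as below.  The function names, the fixed random seed, and the monthly data are all assumptions made for illustration; this is one plausible way to set it up, not a definitive implementation.

```python
import math
import random

def mean_vector_length(directions_deg, magnitudes):
    """Length of the mean vector for paired directions and magnitudes."""
    n = len(directions_deg)
    pairs = list(zip(directions_deg, magnitudes))
    x = sum(m * math.sin(math.radians(d)) for d, m in pairs) / n
    y = sum(m * math.cos(math.radians(d)) for d, m in pairs) / n
    return math.hypot(x, y)

def permutation_p_value(directions_deg, magnitudes, n_perm=9999, seed=0):
    """Proportion of mean vector lengths >= the observed one."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    observed = mean_vector_length(directions_deg, magnitudes)
    shuffled = list(magnitudes)
    count = 1  # the observed value is itself one member of the distribution
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly re-match magnitudes and directions
        if mean_vector_length(directions_deg, shuffled) >= observed:
            count += 1
    return count / (n_perm + 1)

# Hypothetical monthly data: mid-month directions and snowfall in mm
month_dirs = [15 + 30 * i for i in range(12)]
snowfall = [90, 70, 40, 10, 0, 0, 0, 0, 5, 30, 60, 95]

p = permutation_p_value(month_dirs, snowfall)
print(p)
```

One design point worth noting: shuffling magnitudes among directions is only informative when the magnitudes differ.  With all-unit vectors, such as the three-direction example, every permutation gives the same mean vector, so this particular scheme would always return a p-value of 1.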

A good reference for radial statistics that I drew upon heavily while learning the background to these techniques is:

Fisher, N.I., 1995, Statistical Analysis of Circular Data. Cambridge University Press.