STA 410/2102, Assignment 1, Spring 2004

For uniformly distributed data, the tests should produce p-values that are uniformly distributed over (0,1). The gap test procedure does seem to produce uniformly distributed p-values for all sample sizes tested, apart from the random variation that results from simulating only 1000 data sets. However, the sum test procedure clearly does not produce uniformly distributed p-values for data sets of size n=2. In fact, we can see from the histogram that it produces NO p-values below 0.1. This is explained by the fact that when n=2, the test statistic, S, cannot be greater than 4, and S=4 corresponds to a p-value of 0.135, so no smaller p-value can occur. When n is increased, the distribution of p-values produced by the sum test becomes closer to uniform, as expected from the Central Limit Theorem. It is fairly close to uniform when n=10.

For data that is not uniformly distributed, we hope that the distribution of p-values will be concentrated near zero.

The gap test procedure shows this behaviour for both non-uniform distributions that were tried, but the concentration near zero is not very strong for n=2 and n=4. For n=10 and n=20, the distribution is clearly concentrated near zero when the data is shifted by 0.5. However, for n=10, the gap test produces p-values that are only slightly non-uniform when the data is scaled by 1.5, so the gap test is not very powerful in this situation. When n=20 and the data is scaled by 1.5, the gap test produces a distribution of p-values that is more clearly concentrated near zero, but the test is still not very powerful.

The sum test procedure doesn't work at all when n=2, and is also not very good when n=4. When n=10 or n=20, the sum test procedure produces p-values concentrated near zero when the data is shifted by 0.5, and seems to be more powerful than the gap test procedure. However, the sum test procedure seems to produce uniformly distributed p-values when the data is scaled by 1.5.
In other words, the sum test procedure has no power to detect this sort of departure from uniformity, which the gap test procedure can detect.

The overall recommendation is to use the sum test procedure for data sets of size at least 10 when the sort of non-uniformity expected is similar to that produced by shifting, but to use the gap test procedure if the sample size is small or if other sorts of non-uniformity are of interest.
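
The kind of simulation discussed above (1000 data sets for each sample size, one p-value per data set, then a look at how the p-values are distributed) can be sketched roughly as follows. This is only an illustration under stated assumptions: the definitions of the gap test and the sum test are not given here, so the sketch uses a hypothetical stand-in, max_test_pvalue, which rejects uniformity when the largest observation is large. The function names, the seed, the ten-bin histogram, and the form of the shift (adding 0.5 to each data point) are likewise assumptions made for the example.

```python
import random


def max_test_pvalue(data):
    """Hypothetical stand-in test (NOT the gap test or the sum test).

    Under Uniform(0,1), P(max >= m) = 1 - m**n for 0 <= m <= 1, so this
    is an exact p-value for a test that rejects when the maximum is large.
    """
    n = len(data)
    m = max(data)
    if m >= 1.0:
        return 0.0  # a maximum this large cannot occur under uniformity
    return 1.0 - max(m, 0.0) ** n


def simulate_pvalues(pvalue_fn, n, transform=lambda x: x, n_sets=1000, seed=1):
    """Simulate n_sets data sets of size n and return one p-value per set.

    `transform` is applied to each Uniform(0,1) draw; the identity gives
    data from the null hypothesis, anything else gives an alternative.
    """
    rng = random.Random(seed)
    pvals = []
    for _ in range(n_sets):
        data = [transform(rng.random()) for _ in range(n)]
        pvals.append(pvalue_fn(data))
    return pvals


def histogram(pvals, n_bins=10):
    """Count p-values in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for p in pvals:
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        counts[k] += 1
    return counts


if __name__ == "__main__":
    for n in (2, 4, 10, 20):
        null_counts = histogram(simulate_pvalues(max_test_pvalue, n))
        # Assumed form of the shift: add 0.5 to every data point.
        shifted_counts = histogram(
            simulate_pvalues(max_test_pvalue, n, lambda x: x + 0.5))
        print("n =", n)
        print("  uniform data:", null_counts)
        print("  shifted data:", shifted_counts)
```

With uniform data, roughly equal counts in every bin indicate a valid test. With non-uniform data, the more the counts pile up in the first bin, the more powerful the test. Replacing max_test_pvalue with an implementation of the gap test or the sum test would give the comparison described above.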