STA 437/1005, Assignment #2, Question #2.

This discussion refers to plots and R output on the course web page.

The pairwise plots of differences show no clear indications of non-normality or of outliers. Due to the very small sample size, we could hope to detect non-normality only if it were extreme, so a fair degree of non-normality is still possible, but it makes sense to proceed on the assumption that the differences are normally distributed. The only candidate outlier is one observation with a weight difference of over 20 kilograms. Common sense says that such a difference can arise from a different diet, even if the genes are the same, so there is no reason to think that this observation is erroneous. I have retained it in the data.

The T-squared test of the null hypothesis that the mean difference is zero for all five variables produces a p-value of 0.54036, so there is no evidence against the null hypothesis. The researcher's idea that there might be a difference has therefore not been confirmed.

The 90% confidence intervals from the T-squared test all contain zero (consistent with the p-value from the hypothesis test). The univariate 90% confidence intervals based on the t statistic also all contain zero, except for the difference in area, where the interval is (3.757, 134.965). However, when the Bonferroni correction is applied in order to produce simultaneous 90% confidence intervals, all intervals, including that for area, contain zero. Unless the researcher had some reason to think (even before looking at the data) that the area variable was particularly likely to show a non-zero difference in mean, the univariate confidence interval for the difference in area should be discounted, since even if the null hypothesis is true, it is fairly likely that at least one of five 90% confidence intervals would not contain zero.

The sample size for this study is very small, so only quite large differences in mean between the first-born and second-born twin would be detectable.
If the researcher has good reason to expect a difference (one within the bounds of the confidence intervals obtained from this study), they should try to obtain more data, enough that differences of moderate size would be detectable.
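
As a rough check on the remark above that at least one of five 90% confidence intervals is fairly likely to exclude zero even when the null hypothesis is true, the arithmetic can be sketched in Python. This is only indicative: the first calculation assumes the five intervals behave independently, which need not hold for measurements on the same twins, and the function names are made up for illustration rather than taken from the course's R output.

```python
# Illustrative arithmetic only. Independence of the five intervals is an
# assumption made here for simplicity; it is not claimed by the analysis.

def prob_at_least_one_miss(k, level):
    """Chance that at least one of k independent intervals, each with the
    given coverage level, fails to contain the true value."""
    return 1.0 - level ** k


def union_bound_miss(k, level):
    """Upper bound on the same chance that holds without independence."""
    return min(1.0, k * (1.0 - level))


def bonferroni_individual_level(k, simultaneous_level):
    """Coverage level each of k intervals needs so that, by the Bonferroni
    inequality, all k hold together with at least the simultaneous level."""
    return 1.0 - (1.0 - simultaneous_level) / k


k = 5  # five variables, as in this question
print(round(prob_at_least_one_miss(k, 0.90), 4))       # 0.4095
print(round(union_bound_miss(k, 0.90), 4))             # 0.5
print(round(bonferroni_individual_level(k, 0.90), 4))  # 0.98
```

Under the independence assumption, roughly a 41% chance of at least one interval missing zero is far from negligible, which is why a single univariate interval excluding zero carries little weight here. The last figure shows that simultaneous 90% Bonferroni intervals for five variables are each built at the 98% level, so they are wider than the univariate 90% intervals.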