QUESTION 1
The AIS collects a variety of data on its athletes. In one publicly available data set, measurements were made on 202 athletes of various body and blood characteristics. The sum of skin folds (ssf), a measure of body fat, was recorded. Interest is now in the relationship between ssf and percentage body fat (pbfat).
(a) What is the correlation between ssf and pbfat?
(b) Explain what your value of the correlation means in one or two sentences.
(c) Write down the equation of the regression line relating ssf (a fairly simple measure to take) to pbfat (a much more complex measure to take).
(d) Explain what your value of the intercept means in one or two sentences.
(e) Explain what your value of the slope means in one or two sentences.

QUESTION 2 10 marks
(a) What is the predicted pbfat for an ssf of 70?
(b) Carry out a hypothesis test to check whether there is evidence in the sample of a non-zero slope for the line relating ssf to pbfat. Your answer should include null and alternative hypotheses, test statistic, p-value and conclusion.
(c) Write down a 95% confidence interval for the slope of the line relating ssf to pbfat in all athletes.
(d) Explain what the confidence interval in (c) means in one or two sentences.
(e) Produce a scatter plot of residuals versus predicted values. Use the plot to comment on whether the regression inference conditions of constant variance and no strong outliers have been met by this data set.

QUESTION 3 10 marks
Coaches would now like to know whether there is evidence of a difference in average haematocrit levels (a blood marker, denoted hc) between male and female athletes, on the basis of the data collected.
(a) Write down the null and alternative hypotheses for the researchers.
(b) Why should this be an independent-samples test and not a paired-samples test? Answer in one or two sentences.
(c) Use R to find the value of the test statistic, and the p-value, for this test.
(d) At the 5% level, is the null hypothesis rejected or not? Explain your answer in one or two sentences.
(e) Write a conclusion to the test for the researchers.

QUESTION 4 10 marks
The relationship between sport and gender is also of interest. Use R to carry out a chi-squared test for the researchers. Your answer should include null and alternative hypotheses, a test statistic, p-value, decision and conclusion that can be reported to the researchers.

QUESTION 5
Sports managers are interested in whether ssf depends on the sport the athletes play.
(a) Produce a well-labelled boxplot of ssf by sport.
(b) Calculate the mean and the standard deviation of the tennis players' ssf. Compare the location and spread of the ten sports in a short paragraph.
(c) Check whether the condition of equal standard deviations between the ten groups has been met. Your answer should include some calculations.
(d) Use an ANOVA to test the hypothesis of no difference in mean ssf between the ten sports. Your answer should include null and alternative hypotheses, test statistic, p-value and conclusion. Use α = 0.05.
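
The questions above map onto a handful of base R calls. What follows is only a rough sketch of that workflow, not a model answer. It assumes a data frame called ais with columns named ssf, pbfat, hc, sex and sport; the question sheet does not say how the data are supplied, so those names are assumptions. So that the script runs on its own, the data are simulated placeholders with made-up sport labels. They are not the AIS measurements, and none of the numbers they produce should be reported.

```r
# Stand-in data: simulated placeholders, NOT the real AIS measurements.
# Column names and sport labels are assumptions for illustration only.
set.seed(1)
n <- 200
sports <- c("Tennis", "SportB", "SportC", "SportD")  # hypothetical labels
ais <- data.frame(
  sex   = factor(rep(c("f", "m"), each = n / 2)),
  sport = factor(sample(sports, n, replace = TRUE)),
  ssf   = runif(n, 30, 200)
)
ais$pbfat <- 1 + 0.15 * ais$ssf + rnorm(n, sd = 1.5)
ais$hc    <- ifelse(ais$sex == "m", 45, 41) + rnorm(n, sd = 3)

# Question 1: correlation, and regression of pbfat on ssf
r   <- cor(ais$ssf, ais$pbfat)
fit <- lm(pbfat ~ ssf, data = ais)
coef(fit)                                  # intercept and slope

# Question 2: prediction, slope test, confidence interval, residual plot
pred70 <- predict(fit, newdata = data.frame(ssf = 70))
summary(fit)$coefficients                  # the "ssf" row holds t and p for the slope
ci <- confint(fit, "ssf", level = 0.95)
plot(fitted(fit), resid(fit),
     xlab = "Predicted pbfat", ylab = "Residual",
     main = "Residuals versus predicted values")
abline(h = 0, lty = 2)

# Question 3: independent-samples t-test of hc by sex
# (t.test uses the Welch unequal-variance form by default)
tt <- t.test(hc ~ sex, data = ais)

# Question 4: chi-squared test of association between sport and sex
tab <- table(ais$sport, ais$sex)
chi <- chisq.test(tab)

# Question 5: boxplot, group summaries, spread check, one-way ANOVA
boxplot(ssf ~ sport, data = ais,
        xlab = "Sport", ylab = "Sum of skin folds (ssf)",
        main = "ssf by sport")
means <- tapply(ais$ssf, ais$sport, mean)
sds   <- tapply(ais$ssf, ais$sport, sd)
means["Tennis"]; sds["Tennis"]
sd_ratio <- max(sds) / min(sds)            # one common rule of thumb: below about 2
aov_fit <- aov(ssf ~ sport, data = ais)
summary(aov_fit)                           # F statistic and p-value
```

With the real data, the simulation block would be replaced by whatever step loads the course data set, and the formulas would use its actual column names.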