1.7. Concepts Review

Here we review some conceptual points from the lecture.

Please try to answer each question yourself before clicking to reveal the answer.

You can discuss these points with your tutor at the computer-based tutorial session (these sessions are for discussing concepts as well as for developing Python skills).

  1. The fertility rate (mean number of children per adult woman) varies in Western Europe between a low of 1.3 (Italy and Spain) and a high of 1.9 (Ireland).

    For each woman, the number of children is a whole number, such as 0, 1, or 2.

    Does it make sense to measure a mean number of children per adult woman when the mean is not a whole number? Why or why not?

  2. Give an example of a variable for which the mode applies, but not the mean or the median.
  3. A unimodal data set has mean = 50, median = 30 and mode = 10. Sketch and describe the data distribution.
  4. The interquartile range (IQR) is a robust measure of spread. Why is the IQR robust? What is meant by robust, and why is this a desirable quality?
  5. An architect is designing a multi-story carpark. She needs to calculate the expected weight of the cars on each floor. 200 cars can park on each floor. Which measure of average car weight is more useful, the mean or the median?
  6. The same architect needs to decide on the length of parking bays. She decides to choose one length for ‘standard’ bays that will fit most cars, and another length for ‘extra long’ bays that will fit almost any car. What statistics does she need?
  7. Give an example of a variable having a distribution that you expect to be:
    a) Approximately symmetric
    b) Skewed to the right
    c) Skewed to the left
    d) Bimodal
    e) Skewed to the right, with a mode and median of 0 but a mean greater than 0
    In each case, justify your choice (one short sentence).
  8. A teacher summarizes grades on the midterm exam as follows:
    • Min 26
    • Q1 67
    • Median 80
    • Q3 87
    • Max 100
    • Mean 76
    • Mode 100
    • Standard deviation 76
    • IQR 20
    She incorrectly recorded one of the numbers (which has an impossible value). Which one do you think it is? Why?
  9. An exam is graded on the scale 1-100 and the mean score is 76. Which value is the most plausible for the standard deviation: -20, 0, 10, or 50? Why?
  10. A researcher measures weight (in stone) and height (in inches) for men. She calculates the correlation and covariance. She then decides to convert her data to metric units, kilograms and centimetres. One kilogram is 0.157 stone and one centimetre is 0.39 inches. What will happen to the correlation and covariance?
  11. What are the assumptions of Pearson's r?

    [Figure "rViolations": illustrations of data that violate the assumptions of Pearson's r]

  12. Which features of each illustration above violate the assumptions of Pearson's r?
  13. What is heteroscedasticity?
  14. Explain why heteroscedasticity is common in real datasets.
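
Once you have attempted questions 4 and 10, one way to check your reasoning is to try the ideas out on simulated data. The sketch below is only an illustration: the heights, weights and scores are made up (they are not from the lecture), and the helper functions `covariance`, `correlation` and `iqr` are hypothetical names written here using the Python standard library. The only numbers taken from the questions are the conversion factors quoted in question 10.

```python
import random
from statistics import mean, median, quantiles, stdev


def covariance(x, y):
    """Sample covariance of two equal-length lists (n - 1 denominator)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Pearson's r: the covariance divided by both standard deviations."""
    return covariance(x, y) / (stdev(x) * stdev(y))


def iqr(x):
    """Interquartile range: Q3 minus Q1."""
    q1, _, q3 = quantiles(x, n=4)
    return q3 - q1


random.seed(0)  # fixed seed so the made-up data are reproducible

# --- Question 10: changing units -------------------------------------
# Made-up heights (inches) and weights (stone); an assumption for illustration.
height_in = [random.gauss(70, 3) for _ in range(200)]
weight_st = [0.2 * h + random.gauss(-1.5, 1) for h in height_in]

# Conversion factors as quoted in question 10.
STONE_PER_KG = 0.157
INCH_PER_CM = 0.39
height_cm = [h / INCH_PER_CM for h in height_in]
weight_kg = [w / STONE_PER_KG for w in weight_st]

cov_imperial = covariance(weight_st, height_in)
cov_metric = covariance(weight_kg, height_cm)
r_imperial = correlation(weight_st, height_in)
r_metric = correlation(weight_kg, height_cm)

print(f"covariance:  imperial {cov_imperial:.3f}, metric {cov_metric:.3f}")
print(f"correlation: imperial {r_imperial:.3f}, metric {r_metric:.3f}")

# --- Question 4: robustness -------------------------------------------
# Made-up scores, then the same scores plus one wildly wrong entry.
scores = [random.gauss(50, 10) for _ in range(99)]
with_outlier = scores + [10_000]

for label, data in [("original", scores), ("with outlier", with_outlier)]:
    print(
        f"{label:>12}: mean {mean(data):8.1f}, median {median(data):6.1f}, "
        f"sd {stdev(data):8.1f}, IQR {iqr(data):6.1f}"
    )
```

Compare the printed values with your own answers: look at which statistics change when the units change, and which statistics barely move when a single extreme value is added.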