# Testing for Normal Distribution

Testing for Normal Distribution & Homogeneity

When conducting bivariate or multivariate tests, we use Q-Q plots to check whether our data is normally distributed. We first produce Q-Q plots to assess normality and then run a Levene’s Test for homogeneity of variance. Suppose we want to test the hypothesis: ‘There will be a gender difference in Life Satisfaction in Great Britain’.

To produce Q-Q plots for the above example, we enter the following commands in RStudio.

qqnorm(opintfd$`Life Sat`[opintfd$Gender == "Male"], main = "Male")

qqnorm(opintfd$`Life Sat`[opintfd$Gender == "Female"], main = "Female")

You need to analyse a Q-Q plot for each group (category) of the independent variable to determine whether the data is normal. In this case we only needed two commands because Gender is a dichotomous variable; if we were looking at a variable with more than two groups, each group would need its own command to produce the respective Q-Q plot. The figures below show the plots for our data.
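For a variable with more than two groups, a loop can save writing one command per group. The sketch below is illustrative only: it uses simulated data, and the `survey` data frame and its `Region` and `LifeSat` columns are hypothetical names, not part of the `opintfd` dataset.

```r
# Simulated stand-in data; names and values are assumptions for illustration.
set.seed(1)
survey <- data.frame(
  Region  = factor(rep(c("North", "Midlands", "South"), each = 50)),
  LifeSat = rnorm(150, mean = 7, sd = 1.5)
)

groups <- levels(survey$Region)

# Draw one Q-Q plot per group, side by side in a single figure.
old_par <- par(mfrow = c(1, length(groups)))
for (g in groups) {
  qqnorm(survey$LifeSat[survey$Region == g], main = g)
}
par(old_par)  # restore the previous plot layout
```

The same pattern should work on real data by swapping in the actual data frame and column names.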

Data is deemed approximately normally distributed if the dots roughly follow a straight line (they don’t have to sit exactly on the line).

Data would be deemed not normal if the dots formed an approximate S-shape.
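Judging whether the dots follow a line is easier with a reference line drawn on the plot. A minimal sketch using base R's `qqline()`, which draws a line through the first and third quartiles; the scores here are simulated, not taken from the survey data.

```r
# Simulated, approximately normal scores (an assumption for illustration).
set.seed(42)
life_sat <- rnorm(200, mean = 7, sd = 1.5)

# qqnorm() draws the plot and invisibly returns the plotted coordinates.
qq <- qqnorm(life_sat, main = "Simulated scores")

# Add a reference line through the first and third quartiles.
qqline(life_sat, col = "red")
```

With real data, the same `qqline()` call can follow each `qqnorm()` command, using the same subset of values.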

Reviewing our Q-Q plots above, we can see that our data is approximately normal because the dots follow a line. For both Males and Females there is a slight tail, or skew, to the left, but it is not pronounced.

Interpreting Q-Q plots gets easier with time, and sometimes it comes down to your own judgement. If you clean your data properly prior to assumption testing, the removal of outliers should make it easier to establish normality.

Below are two examples of skewed Q-Q plots for reference. These would be deemed not normal and would violate the parametric assumption of normality.
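To see what skew looks like on a Q-Q plot, similar figures can be generated from deliberately skewed simulated data. This is a sketch under assumptions: the exponential distribution is chosen purely for illustration and has no connection to the survey data.

```r
set.seed(7)
right_skewed <- rexp(200, rate = 1)       # long tail to the right
left_skewed  <- 10 - rexp(200, rate = 1)  # mirror image: long tail to the left

# Plot both side by side, each with a reference line.
old_par <- par(mfrow = c(1, 2))
qqnorm(right_skewed, main = "Right-skewed (simulated)")
qqline(right_skewed)
qqnorm(left_skewed, main = "Left-skewed (simulated)")
qqline(left_skewed)
par(old_par)  # restore the previous plot layout
```

In both panels the dots curve away from the reference line instead of running along it.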

Interpreting Q-Q Plots

Review the outputs below and present your findings for the following assumption tests.

OPTIONAL

Parametric Assumption Testing

Now test the normality parametric assumption for the hypothesis ‘Life Satisfaction will differ according to ethnicity in Great Britain’.

OPTIONAL