*Reviewed and revised 26 August 2015*

**SUMMARY**

- Data can be described by different distributions, which has implications for how the data is analysed using statistical methods
- Data transformation can be performed to allow different types of statistical tests to be used
- Important distributions:
  - Normal distribution
  - Standard normal distribution
  - Binomial distribution
  - Poisson distribution

**NORMAL DISTRIBUTION**

- an observation that is normally distributed within a population has a norm with random independent factors causing variation from that norm
- most values cluster around the norm with fewer & fewer values towards the tails
- extreme values do exist
- variation is random, so there is equal spread of values above & below the norm
- mean = median = mode
- normal distribution can be plotted to illustrate the frequency of observations or the probability of an observation arising
- the curve is bell shaped, symmetrical and theoretically of infinite size with tails that never reach the x axis
- the mean and sd (standard deviation) of a sample are likely to be close to the mean and sd of the population from which it was sampled
- the smaller the sample the less likely it will have ‘normal’ geometry -> less likely the mean and standard deviation will match those of the population
- to determine whether a sample is normally distributed you can plot the data and ‘eye-ball’ the pattern
- also you can calculate the mean & sd of the observed data & from these derive the frequencies of values that would be expected in a normal distribution with these parameters
- comparing the observed frequencies with the expected frequencies will tell you whether the sample is ‘normal’
- population mean = the average value in a real population
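
The observed-versus-expected comparison described above can be sketched in a few lines of Python. This is only an illustration: the sample values, the function name `expected_and_observed` and the choice of bins one sd wide are all assumptions, not part of the original notes.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample (e.g. 20 systolic blood pressure readings in mmHg)
sample = [118, 122, 125, 109, 131, 120, 127, 115, 123, 119,
          124, 121, 128, 113, 126, 117, 130, 122, 120, 116]

m = mean(sample)
s = stdev(sample)  # sample standard deviation
fitted = NormalDist(mu=m, sigma=s)  # normal curve with the sample's mean and sd


def expected_and_observed(data, dist, edges):
    """For each bin [lo, hi) return (lo, hi, observed count, expected count)."""
    n = len(data)
    rows = []
    for lo, hi in zip(edges, edges[1:]):
        observed = sum(lo <= x < hi for x in data)
        expected = n * (dist.cdf(hi) - dist.cdf(lo))
        rows.append((lo, hi, observed, expected))
    return rows


# Bins one sd wide, from 3 sd below the mean to 3 sd above it
edges = [m + k * s for k in (-3, -2, -1, 0, 1, 2, 3)]
rows = expected_and_observed(sample, fitted, edges)

for lo, hi, obs, exp in rows:
    print(f"{lo:6.1f} to {hi:6.1f}: observed {obs:2d}, expected {exp:5.2f}")
```

If the observed counts track the expected counts closely in every bin, the sample is plausibly normal. With a sample this small, some mismatch is expected even when the underlying population is normal.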

**STANDARD NORMAL DISTRIBUTION**

- a transformation of the points on a normal distribution into multiples of the standard deviation or standard error from the population mean
- these multiples are termed z values & the conversion is sometimes referred to as the z transformation
- the resulting distribution has a mean of 0 and an sd of 1
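
As a rough sketch of the z transformation, the snippet below converts a raw value to a z value and looks up the corresponding proportion on the standard normal curve. The population mean of 120 and sd of 10 are hypothetical figures chosen for illustration, as is the helper name `z_value`.

```python
from statistics import NormalDist


def z_value(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma


# Standard normal distribution: mean 0, sd 1
standard_normal = NormalDist(mu=0, sigma=1)

# Hypothetical population with mean 120 and sd 10
z = z_value(130, mu=120, sigma=10)

# Proportion of the population expected to fall below this value
p_below = standard_normal.cdf(z)

# Proportion expected to lie within 1.96 sd either side of the mean
p_within_196 = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

print(z, round(p_below, 4), round(p_within_196, 4))
```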

**DATA TRANSFORMATION**

- allows you to use parametric tests on data that has a skewed distribution by first converting it to a near normal distribution
- common transformations are the square root or the logarithm of the data
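
To illustrate the effect of these transformations, the sketch below measures skewness before and after taking square roots and logs. The data are hypothetical right-skewed values (imagine lengths of stay in days), and `skewness` is a hand-written helper using the simple moment-based coefficient. It is not a formal test of normality.

```python
import math
from statistics import mean

# Hypothetical right-skewed data (e.g. length of stay in days)
raw = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 10, 16, 30]


def skewness(values):
    """Moment-based skewness: 0 for a symmetrical sample, > 0 for a long right tail."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rooted = [math.sqrt(v) for v in raw]
logged = [math.log(v) for v in raw]  # natural log; all values must be > 0

skew_raw = skewness(raw)
skew_sqrt = skewness(rooted)
skew_log = skewness(logged)

print(f"raw: {skew_raw:.2f}, square root: {skew_sqrt:.2f}, log: {skew_log:.2f}")
```

For these values the log transformation pulls in the long right tail more strongly than the square root, which is why it tends to be preferred for heavily skewed data. Logs cannot be taken of zero or negative values.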

**BINOMIAL DISTRIBUTION**

- describes the probability of different proportions of a binary outcome arising in a fixed number of observations

- eg. the probabilities of different proportions of heads arising during sets of coin tosses
- the most likely proportion (the norm) = the population proportion (π)
- eg. heads in tossed coins = 0.5
- as sample size increases, it becomes more likely that the proportion of a particular observation within the sample will be the same as or similar to the population proportion (π)
- eg. the more times you throw the coin the more likely the proportion of heads will be close to 0.5
- the larger the sample -> the closer the binomial distribution is to the normal distribution
- the probabilities of all possible outcomes must sum to 1.0
- the probability of a specific proportion arising in a sample is calculated using the binomial formula
- the inputs are (1) the proportion you are seeking, (2) the population proportion, (3) the sample size
- as the binomial distribution can be approximated to a normal distribution, hypothesis tests such as the normal approximation test can be carried out to determine the probability of a particular proportion arising in a binomial distribution
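
A minimal sketch of the binomial formula and its normal approximation follows. The function name `binomial_probability`, the coin-toss numbers and the use of a continuity correction (59.5 rather than 60) are illustrative choices, not something specified in the notes.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_probability(k, n, pi):
    """Probability of exactly k 'successes' in n observations,
    where pi is the population proportion."""
    return comb(n, k) * pi ** k * (1 - pi) ** (n - k)


# Probability of exactly 5 heads (a proportion of 0.5) in 10 tosses of a fair coin
p_five_heads = binomial_probability(5, 10, 0.5)

# The probabilities of all possible outcomes (0 to 10 heads) sum to 1.0
total = sum(binomial_probability(k, 10, 0.5) for k in range(11))

# Larger sample: probability of 60 or more heads in 100 tosses
n, pi = 100, 0.5
exact_tail = sum(binomial_probability(k, n, pi) for k in range(60, n + 1))

# Normal approximation: mean n*pi, sd sqrt(n*pi*(1-pi)), with continuity correction
approx = NormalDist(mu=n * pi, sigma=sqrt(n * pi * (1 - pi)))
approx_tail = 1 - approx.cdf(59.5)

print(f"P(5 heads in 10) = {p_five_heads:.4f}")
print(f"P(>=60 heads in 100): exact {exact_tail:.4f}, normal approx {approx_tail:.4f}")
```

With 100 tosses the exact and approximate tail probabilities agree closely. With small samples, or a population proportion near 0 or 1, the approximation is less reliable.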

**POISSON DISTRIBUTION**

- describes probability of a number of events occurring in a fixed time period or in a region of space
- events must be random and independent of each other
- the probability is calculated from an exponential formula and depends on prior knowledge of one parameter only -> the mean number of occurrences per unit time period (or unit region of space)
- eg. if the number of adverse incidents occurring in OT over a 2 year period is known, what is the probability of 5 incidents happening in one day?
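
The OT example can be worked through with the Poisson formula, P(k) = λ^k e^(-λ) / k!, where λ is the mean number of events per unit time. The figure of 73 incidents over 2 years is invented purely for illustration, and the sketch assumes 730 days in the period and incidents that are random and independent.

```python
from math import exp, factorial


def poisson_probability(k, lam):
    """Probability of exactly k events when the mean number of events is lam."""
    return lam ** k * exp(-lam) / factorial(k)


# Hypothetical figures: 73 adverse incidents recorded over 2 years (taken as 730 days)
incidents = 73
days = 730
lam_per_day = incidents / days  # mean number of incidents per day

p_five = poisson_probability(5, lam_per_day)  # exactly 5 incidents in one day
p_none = poisson_probability(0, lam_per_day)  # an incident-free day

# The probabilities of all possible counts sum to 1 (terms beyond 20 are negligible here)
total = sum(poisson_probability(k, lam_per_day) for k in range(21))

print(f"mean per day = {lam_per_day:.2f}")
print(f"P(5 in one day) = {p_five:.2e}, P(0 in one day) = {p_none:.4f}")
```

Under these assumed figures, five incidents in a single day would be extremely improbable by chance alone, so such a cluster would suggest the incidents were not independent.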
