A lot of students have difficulty understanding the concept of Standard Error (SE). Many believe it is the same as standard deviation. I will attempt to clarify the concept of standard error through this post.
In order to understand standard error, we must first know a little bit about sampling.
Let us assume that we wish to know the average height of adult men in India. Ideally, we should examine each adult Indian man, and then compute the average height from the data thus collected. As you can imagine, such an exercise will be very expensive in terms of both time and money.
How does one get around the problem of resource constraints, then? By taking a sample. Sampling is the procedure of obtaining a sample from a population, and provides us with an estimate of the true population value.
In simple terms, instead of examining the entire population, we examine a small portion of it (called a sample). From the sample, we obtain a value: the sample mean. In our example, let us assume we examined only 1000 adult men from different states of India. The average (mean) height of these 1000 men will constitute the sample mean.
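To make the idea concrete, here is a minimal sketch in Python that draws a sample of 1000 from a simulated population and compares the two means. The population is synthetic: the mean of 165 cm and standard deviation of 7 cm are made-up values chosen purely for illustration, not real figures for India.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Synthetic "population" of heights in cm. The mean (165) and SD (7) are
# hypothetical values for illustration only.
population = [random.gauss(165, 7) for _ in range(100_000)]

# Examine only 1000 individuals instead of the whole population.
sample = random.sample(population, 1000)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.2f} cm")
print(f"Sample mean:     {sample_mean:.2f} cm")
```

The two values will usually be close but not identical. That small gap is exactly what the rest of this post is about.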
Now you must be wondering how taking a sample will help us ascertain the actual population value. That is where inferential statistics comes in. Inferential statistics deals with the drawing of inferences about the population from a sample. It is able to do so because of what is called the Central Limit Theorem, which tells us that the means of reasonably large random samples follow an approximately normal distribution centred on the population mean, for practically any shape of the population distribution.
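The Central Limit Theorem can be seen in a small simulation. The sketch below is an illustration under assumptions of my own: it uses a strongly skewed (exponential) population rather than heights, a sample size of 50, and a small helper called skewness that I have written for this example. A skewness near zero indicates a symmetric, bell-like shape.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable


def skewness(values):
    """Population skewness: roughly 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


# Individual observations from a heavily right-skewed population.
draws = [random.expovariate(1.0) for _ in range(20_000)]

# Means of 2000 samples, each containing 50 observations.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(50)])
    for _ in range(2000)
]

print(f"Skewness of individual values: {skewness(draws):.2f}")
print(f"Skewness of sample means:      {skewness(sample_means):.2f}")
```

Although the individual values are far from normally distributed, the sample means are much more symmetric, which is what allows us to use the normal distribution when reasoning about them.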
Since we are dealing with a sample and a larger population (from which the sample was derived), we need to distinguish between the values of the population and the sample. Statisticians do this by assigning different symbols to the population values (which we are trying to estimate), and the sample values (which we have obtained from the sample).
Usually, the population mean is represented by the Greek letter mu (μ), and the population standard deviation by the Greek letter sigma (σ). The sample mean is usually represented by a small x bar (x̄), and the sample standard deviation by s.
Suppose that we took not one, but several equal-sized samples from the same population (in our example, the population is adult men in India). Each sample would have its own sample mean.
The sample means would then be designated as (x bar)1, (x bar)2, (x bar)3, and so on. If the sample means alone were taken for calculation, we could compute the mean of the sample means as well. This mean of sample means would be termed ‘Mean of means’.
Similarly, we would be able to compute the standard deviation of the sample means. The standard deviation of the sample means is what is called ‘Standard Error’.
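The idea of repeated sampling can be sketched in Python as follows. This is only an illustration: the population is simulated with a made-up mean of 165 cm and standard deviation of 7 cm, and the choices of 2000 samples of 100 men each are arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic population of heights in cm (hypothetical mean 165, SD 7).
population = [random.gauss(165, 7) for _ in range(100_000)]

sample_size = 100   # men per sample (arbitrary choice)
num_samples = 2000  # number of equal-sized samples (arbitrary choice)

# One sample mean per sample: (x bar)1, (x bar)2, (x bar)3, and so on.
sample_means = [
    statistics.mean(random.sample(population, sample_size))
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)    # the 'Mean of means'
standard_error = statistics.stdev(sample_means)  # SD of the sample means

print(f"Population mean:  {statistics.mean(population):.2f} cm")
print(f"Mean of means:    {mean_of_means:.2f} cm")
print(f"Population SD:    {statistics.pstdev(population):.2f} cm")
print(f"Standard error:   {standard_error:.2f} cm")
```

The mean of means lands very close to the population mean, while the standard deviation of the sample means is far smaller than the standard deviation of the individual heights.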
It is important because it has a relationship with the population standard deviation: the standard error is smaller than the population standard deviation by a factor equal to the square root of the sample size (Standard Error = Standard Deviation / √sample size). In practice, the population standard deviation is usually unknown, so the sample standard deviation is used in its place.
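A quick worked example of the formula, as a minimal sketch. The standard deviation of 7 cm is a hypothetical value, and the function name standard_error is my own.

```python
import math


def standard_error(standard_deviation, sample_size):
    """Standard Error = Standard Deviation / square root of sample size."""
    return standard_deviation / math.sqrt(sample_size)


population_sd = 7.0  # hypothetical SD of heights, in cm

for n in (25, 100, 400, 1000):
    print(f"n = {n:>4}: SE = {standard_error(population_sd, n):.3f} cm")
```

Notice that quadrupling the sample size only halves the standard error (1.4 cm at n = 25, 0.7 cm at n = 100, 0.35 cm at n = 400), because the sample size enters the formula through its square root.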
This standard error helps us compute confidence intervals, which give a range of values that is likely to contain the population value. For example, a 95% confidence interval for the mean is approximately the sample mean plus or minus 1.96 standard errors.
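Here is a minimal sketch of a 95% confidence interval computed from a single sample. The sample is simulated (hypothetical mean 165 cm, SD 7 cm), and the interval uses the normal approximation, which is reasonable for a sample as large as 1000.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Synthetic sample of 1000 heights in cm (hypothetical mean 165, SD 7).
sample = [random.gauss(165, 7) for _ in range(1000)]

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)      # stands in for the unknown sigma
standard_error = sample_sd / math.sqrt(n)

# Multiplier for a 95% interval under the normal distribution (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error

print(f"Sample mean:    {sample_mean:.2f} cm")
print(f"Standard error: {standard_error:.3f} cm")
print(f"95% CI:         {lower:.2f} cm to {upper:.2f} cm")
```

A smaller standard error gives a narrower interval, which is why larger samples lead to more precise estimates.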
Thus, not only are we able to obtain an estimate of the population value, but we can also gauge how close that estimate is likely to be to the true value.
One line summary:
The standard error is the standard deviation of sample means.
Link to a lecture by Dr. Marcello Pagano (Harvard) on sampling distributions: