One of the most common concerns among researchers is how to calculate the sample size for a study.
Often, researchers conduct a study among a convenient number of subjects without calculating the required sample size in advance.
Usually, the main consideration is cost: more subjects mean higher costs. Faced with this constraint, many researchers first conduct the study with the available resources, then run a test of significance to check whether they managed to ‘capture’ something.
Often, such a study yields no statistically significant finding.
Let’s consider an example to understand why not calculating the sample size in advance can have such undesirable consequences:
Take the two greatest batsmen of our generation: Sachin Tendulkar and Brian Lara. For years, many have argued over which of them is the greater. Now, we have decided to conduct a scientific experiment to settle the matter once and for all. We will compare the best years of both batsmen, Tendulkar’s best year and Lara’s best year, in One Day Internationals (ODIs) and Test matches (Tests).
The decision to compare only one year each is not based on any statistical calculation, but rather on personal preference. We are hoping that comparing one year will help settle the matter.
The Null Hypothesis (H0): Both batsmen have the same batting average.
The Alternative Hypothesis (Ha): The batting averages of the batsmen are different.
Remember, we will compare only one year each from their illustrious careers. Do you think we are likely to detect a significant difference between them?
I don’t think so. There is very little separating the two greats, so we are unlikely to detect a difference by merely comparing one year. To really settle the matter, we need to compare their performances against each opponent at each venue, in every format of the game. If we then find that one has a significantly higher batting average, it will be convincing.
Now suppose Tendulkar were being compared with Brad Pitt instead. Considering that Brad Pitt is not an accomplished cricketer, would it take much effort to determine who was better? Probably not.
What is the difference between the two scenarios?
The first scenario deals with equals. Since they are so evenly matched, one would require many more observations to detect a difference between them (assuming such a difference truly exists). If we haphazardly took some statistics from a given match or year, we would risk wrongly concluding that there is no difference between them.
The second scenario deals with non-equals. We expect the contest to be one-sided because the two men have vastly different cricketing abilities. Therefore, only a few observations would be required to detect a significant difference between them.
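To put rough numbers on the two scenarios, here is a minimal sketch using the standard normal-approximation formula for comparing two means. The effect sizes (0.1 for the closely matched pair, 2.0 for the one-sided contest) are made-up illustrative values, not real cricket statistics, and the function name is my own.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate observations needed per group for a two-sided,
    two-sample comparison of means (normal approximation).

    effect_size is the difference in means divided by the common
    standard deviation (a standardized difference).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # quantile corresponding to the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect sizes, chosen only to illustrate the contrast.
closely_matched = sample_size_per_group(0.1)  # e.g. Tendulkar vs Lara
one_sided = sample_size_per_group(2.0)        # e.g. Tendulkar vs Brad Pitt

print(closely_matched, one_sided)
```

Under these assumptions, the evenly matched comparison needs well over a thousand observations per batsman, while the lopsided one needs only a handful.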
To put it another way, let me ask you a few questions:
How many matches between Brad Pitt and Tendulkar would you watch before deciding on the better batsman?
Would you be satisfied with watching the same number of matches between Tendulkar and Lara to decide the better batsman?
If your answer to the second question was “No”, you have just understood the concept of ‘power’.
A small study may not be able to detect any difference simply because it does not have the power to detect that difference (assuming such a difference truly exists). By not performing sample size calculations beforehand, we are, in effect, increasing the chance of a Type II error: missing a difference that is really there.
No amount of statistical wizardry applied to the data afterwards can compensate for a lack of power.
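The same idea can be turned around: fix the sample size and ask how much power the study has. The sketch below uses a normal approximation that ignores the negligible chance of a significant result in the wrong direction. The sample size of 20 per group and both effect sizes are hypothetical values picked for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    effect_size is the standardized difference between the groups.
    This is an approximation: it counts only rejections in the
    direction of the true difference.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect_size * sqrt(n_per_group / 2) - z_alpha)


# Hypothetical study with 20 observations per group.
power_small_difference = approximate_power(0.1, 20)  # evenly matched batsmen
power_large_difference = approximate_power(2.0, 20)  # one-sided contest

print(f"small difference: {power_small_difference:.2f}")
print(f"large difference: {power_large_difference:.2f}")
```

With these made-up inputs, the study would almost certainly detect the large difference but would have very little chance of detecting the small one, even if it truly exists.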
Thus, we end up in a wholly unpleasant situation: we’ve just conducted a study at some expense, but not found anything statistically significant because the sample size was inadequate, and due to that fatal flaw, cannot hope to publish the study findings. All this effort in vain.