What are research hypotheses and why do we need them?
As you’ll know, the broad starting process for developing new studies in quantitative research is to:
- Review the literature on a topic,
- Identify a gap or rationale for running a new study, and
- Develop a testable hypothesis based on the new study idea (some tips are below!)
Hypotheses must be testable to allow us to scientifically assess whether a phenomenon has occurred in a given sample. (If the hypothesis can’t actually be “disproven” in this way through statistical testing, it’s usually not considered to be a scientific hypothesis [for more information on this, see the “falsifiability” debate by Popper.]) There are two main types of hypotheses, as follows:
- Alternative hypotheses (hypotheses which say there will be an effect)
- Null hypotheses (hypotheses which say there will not be an effect)
In many ways, alternative and null hypotheses are just the opposite of each other. For example, saying “there will be a relationship between depression and anxiety in a sample of undergraduate students” supposes there will be an effect (i.e., is an alternative hypothesis). Instead, saying “there will not be a relationship between depression and anxiety in a sample of undergraduate students” supposes there will be no effect (i.e., is a null hypothesis).
Where do p values come in?
In testing our hypothesis with statistical tests, we use p values to make an inference about whether our alternative hypothesis is supported or unsupported. We rely on making inferences in psychology because we’re never going to be able to recruit the whole population into a study to really understand what is actually true; instead, we recruit a sub-sample of the population into our psychology studies. The difference between what we find in a sub-sample and what is true in the population is treated as “error” (e.g., sampling error).
- If we do not find an effect (e.g., at p > .05), then we retain the null hypothesis, as this assumes no effect.
- If we do find an effect (e.g., at p < .05), then we reject the null hypothesis, because our data would be unlikely if its assumption of no effect were true.
- We never claim to have proven the alternative hypothesis; we only say it is supported, because there’s always some chance that the null hypothesis is true and our result reflects sampling error (as above).
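The decision rule in the bullets above can be written out as a tiny function. This is only a sketch: the name `decide` and the wording of the returned strings are made up for illustration, and the .05 cut-off is the conventional default rather than a fixed rule.

```python
def decide(p_value, alpha=0.05):
    """Return the conventional decision about the null hypothesis.

    `alpha` is the cut-off (assumed here to be the usual .05).
    """
    if not 0 <= p_value <= 1:
        raise ValueError("a p value must lie between 0 and 1")
    if p_value < alpha:
        # Data this extreme would be unlikely if there were no effect
        return "reject the null hypothesis"
    # Not enough evidence against "no effect"
    return "retain the null hypothesis"


print(decide(0.03))  # reject the null hypothesis
print(decide(0.20))  # retain the null hypothesis
```

Note that neither outcome says the alternative hypothesis has been proven; the function only ever makes a statement about the null.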
Using p values and statistical significance
Next, to go into the p value in slightly more statistical terms and break it down a bit more… The ‘p’ in ‘p value’ stands for probability: specifically, the probability of getting a result at least as extreme as the one in our sub-sample if the null hypothesis were true (i.e., if there were really no effect and only chance/sampling error were at work). We usually turn the probability value into a % to make it slightly more interpretable. So in psychology, we generally use an arbitrary cut-off value of .05 (or 5%). If our finding is p less than .05 (p < .05), then we have a “statistically significant” result. In other words, a finding like ours would turn up less than 5% of the time through chance alone if there were no real effect.
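One way to get a feel for what the .05 cut-off does is to simulate lots of studies in which the null hypothesis is true by construction and count how many still come out “statistically significant”. The sketch below uses a one-sample z test purely because it can be written with Python’s standard library; that choice of test, the function names, and the study sizes are all assumptions for illustration.

```python
import random
from statistics import NormalDist


def z_test_p_value(sample, null_mean, known_sd):
    """Two-sided p value for a one-sample z test (population SD assumed known)."""
    standard_error = known_sd / len(sample) ** 0.5
    sample_mean = sum(sample) / len(sample)
    z = (sample_mean - null_mean) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_studies=5000, n_per_study=30, alpha=0.05, seed=7):
    """Share of simulated studies with p < alpha when the null is true."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_studies):
        # The null is true by construction: scores really have mean 0, SD 1
        sample = [rng.gauss(0, 1) for _ in range(n_per_study)]
        if z_test_p_value(sample, null_mean=0, known_sd=1) < alpha:
            significant += 1
    return significant / n_studies


rate = false_positive_rate()
print(f"Share of 'significant' studies with no real effect: {rate:.3f}")
```

The printed share should land close to .05: with a 5% cut-off, roughly 1 in 20 studies of a non-existent effect will still look significant through sampling error alone.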
Consider an example. Let’s say that I explore the correlation between depression scores and anxiety scores in a sample of 500 undergraduate students. The result is p = .03, meaning it is statistically significant (the value is less than .05). Turning this into a % means that, if the null hypothesis were true (no real relationship in the population), a correlation at least this strong would turn up in only about 3% of sub-samples. Note that this is not the same as saying there is a 3% chance that the null hypothesis is true, or a 97% chance that the finding is real; the p value only tells us how surprising our data would be if there were no effect.
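To make the “if there were no real relationship” idea concrete, here is a sketch of a permutation test for a correlation. Shuffling the anxiety scores breaks any real link with depression, so the shuffled datasets show what chance alone produces; the p value is the share of shuffles with a correlation at least as strong as the observed one. This is a swapped-in teaching technique rather than the test a statistics package would normally report, and the data, sample size, and function names are all made up for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(x, y, n_shuffles=2000, seed=1):
    """Two-sided p value: share of shuffles with |r| >= the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled_y = list(y)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled_y)  # break any real x-y relationship
        if abs(pearson_r(x, shuffled_y)) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly 0
    return (at_least_as_extreme + 1) / (n_shuffles + 1)


# Made-up scores for 100 hypothetical students, built to be related
data_rng = random.Random(42)
depression = [data_rng.gauss(10, 3) for _ in range(100)]
anxiety = [0.5 * d + data_rng.gauss(0, 3) for d in depression]

r = pearson_r(depression, anxiety)
p = permutation_p_value(depression, anxiety)
print(f"r = {r:.2f}, p = {p:.4f}")
```

Because the made-up scores were built with a genuine relationship, very few shuffled datasets match the observed correlation, so the p value comes out small.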
Tips for generating your own hypotheses
- Be specific on what your independent and dependent variables are (including any covariates)
- Be clear whether you are looking for a difference, a relationship, or a prediction between your variables
- Aim for one hypothesis per statistical test you are planning to run
- Some templates are below:
| Statistical test | Example written hypothesis |
| --- | --- |
| Correlation | “There will be a relationship between Variable X and Variable Y.” |
| Regression | “Variable X will predict Variable Y.” |
| t-test | “There will be a difference between the two levels of Variable X on Variable Y.” |
| One-way ANOVA | “There will be a difference between the three+ levels of Variable X on Variable Y.” |
| 2*2 ANOVA | “There will be a difference between the levels of Variable X on Variable Y. There will be a difference between the levels of Variable Z on Variable Y. There will be an interaction between Variable X and Variable Z on Variable Y.” |
I hope this helps for now!
Thanks for reading,
Rosie
