When we want to study a population, the most common situation is that we do not know its theoretical model. We can, however, observe the population, take a sample, and describe it. The question then is whether the information obtained from a small part of the population lets us infer the behavior of the whole. Answering it is the job of inferential statistics.
Thus, inferential statistics in psychology makes it possible to validate or refute the conjectures suggested by descriptive statistics: both to validate a possible model for the population and to estimate the parameters of that model.
In this way, we could say that inferential statistics is the part of statistics that deals with generalizing to a population the results obtained in a sample. To do so, it relies on probability distributions and provides an error measure, which we can interpret as a measure of confidence, associated with the results.
The goal of inferential statistics is to build models and make predictions about the phenomena under study, taking into account that the observations are random. Its use focuses on finding patterns in the data, on the one hand, and on drawing inferences about the population studied, on the other.
These inferences can take several forms:
- Yes/no answers (hypothesis testing).
- Estimates of numerical characteristics (estimation).
- Forecasts of future observations.
- Descriptions of association (correlation).
- Modeling relationships between variables (regression analysis).
Characteristics of inferential statistics
Extrapolation and generalization
Inferential statistics deals with extrapolating data from a sample to a population, that is, with making generalizations about the population. It works by collecting data on a sample of the population (usually because collecting data from the entire population would be too costly). The problem is that error appears in the step from the sample to the population.
Thus, inferential statistics reaches conclusions that we can trust only up to a certain point with respect to the population the sample belongs to. These conclusions come with a margin of error. This margin depends on several factors, such as the size of the sample relative to the population and the variability, within the population, of the variables studied.
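How this margin shrinks as the sample grows, and widens as the variable becomes more variable, can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the usual normal approximation for the mean, it ignores the correction for finite populations, and the function name `margin_of_error`, the standard deviations, and the sample sizes are all hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(std_dev, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a mean."""
    # Critical value of the standard normal (about 1.96 for 95% confidence)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * std_dev / math.sqrt(n)


# Hypothetical standard deviations and sample sizes, for illustration only
for std_dev in (5.0, 15.0):
    for n in (50, 200, 500):
        margin = margin_of_error(std_dev, n)
        print(f"sd={std_dev:>4}  n={n:>3}  margin=+/-{margin:.2f}")
```

For a fixed standard deviation the margin falls as the sample size rises, and for a fixed sample size it grows with the standard deviation.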
Validity and realism in observations
Inferential statistics is widely regarded as a valid and realistic basis for exchanging findings among researchers, because its conclusions are reported together with their associated uncertainty.
Parts of inferential statistics
As introduced above, inferential statistics works by estimating parameters and testing hypotheses.
The estimation of parameters
Parameter estimation consists of finding the most plausible values of a population parameter (for example, the mean). Because we do not know the population as a whole, the value cannot be pinned down more precisely than an interval (the confidence interval).
This interval is accompanied by the confidence level, usually read as the probability that the interval contains the parameter, or by its complement (the probability of error). In addition, one value within the confidence interval is taken as the optimal estimate, that is, the best possible estimate.
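One way to see what the confidence level means is to simulate many samples and count how often the resulting interval contains the true parameter. The sketch below assumes a hypothetical test score that is normally distributed with mean 100 and standard deviation 15, and builds each interval with the normal approximation. These numbers and names are illustrative choices, not values from a real study.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(3)  # fixed seed so the run is reproducible

TRUE_MEAN, TRUE_SD = 100.0, 15.0     # hypothetical population parameters
Z_95 = NormalDist().inv_cdf(0.975)   # critical value for a 95% interval


def interval_contains_true_mean(sample_size=100):
    """Draw one sample and check whether its 95% interval covers TRUE_MEAN."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(sample_size)]
    estimate = statistics.fmean(sample)
    error = Z_95 * statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate - error <= TRUE_MEAN <= estimate + error


repetitions = 2000
hits = sum(interval_contains_true_mean() for _ in range(repetitions))
coverage = hits / repetitions
print(f"intervals containing the true mean: {coverage:.1%}")
```

Under these assumptions the share of intervals that contain the true mean should come out close to the nominal 95%.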
Let’s say we want to estimate the population mean of a variable such as body mass. We draw a sample from the population, and the value observed in the sample will be similar to that of the population. Moreover, the larger the sample drawn from the population, the more confident we can be that the value obtained is close to the population value.
Thus, if we draw a sample of 500 people from a population of 100,000, the average body mass we obtain will tend to be closer to the population average than if we drew a sample of 200 people (law of large numbers). Furthermore, it is curious that the population value is just as likely to be greater than the sample value as to be less than it. This is because we assume that the variable is distributed along the “body mass” continuum following a normal distribution.
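The comparison between samples of 200 and 500 people can be tried out on a simulated population. The sketch below assumes a hypothetical population of 100,000 body mass values drawn from a normal distribution with mean 70 kg and standard deviation 12 kg. Both figures are invented for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is reproducible

# Hypothetical population: 100,000 body mass values in kg, roughly normal
population = [random.gauss(70, 12) for _ in range(100_000)]
population_mean = statistics.fmean(population)


def average_error(sample_size, repetitions=1000):
    """Mean absolute gap between the sample mean and the population mean."""
    errors = []
    for _ in range(repetitions):
        sample = random.sample(population, sample_size)
        errors.append(abs(statistics.fmean(sample) - population_mean))
    return statistics.fmean(errors)


error_200 = average_error(200)
error_500 = average_error(500)
print(f"n=200: average error {error_200:.3f} kg")
print(f"n=500: average error {error_500:.3f} kg")

# How often does a sample mean land above the population mean?
above = sum(
    statistics.fmean(random.sample(population, 200)) > population_mean
    for _ in range(1000)
)
share_above = above / 1000
print(f"share of sample means above the population mean: {share_above:.2f}")
```

Averaged over many repetitions, the larger sample lands closer to the population mean, and sample means fall above and below it about equally often. Any single sample of 500 can still be further off than a single sample of 200.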
How do we respond to the question of what is the value of a parameter?
To state the value of, for example, the mean of a population, descriptive statistics gives a single number. Inferential statistics, however, uses three numbers. These are:
- The optimal estimate.
- The estimation error.
- The confidence level (or the probability of error).
These three numbers define the confidence interval. This is an interval that, with a certain level of confidence (the “confidence level”), contains the true population value. When we refer to the mean, its upper and lower limits are obtained by adding the estimation error to, and subtracting it from, the optimal estimate.
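As a minimal sketch of how the three numbers fit together, the code below computes the optimal estimate, the estimation error, and the resulting interval for a mean. It assumes a simulated sample of 500 body mass values and the large-sample normal approximation. For small samples a Student t critical value would normally replace the normal one. The helper name `mean_confidence_interval` and the data are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(7)  # fixed seed so the run is reproducible

# Hypothetical sample: body mass (kg) of 500 people
sample = [random.gauss(70, 12) for _ in range(500)]


def mean_confidence_interval(data, confidence=0.95):
    """Return (optimal estimate, estimation error, (lower, upper)) for the mean."""
    estimate = statistics.fmean(data)                        # optimal estimate
    std_error = statistics.stdev(data) / math.sqrt(len(data))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)           # set by the confidence level
    error = z * std_error                                    # estimation error
    return estimate, error, (estimate - error, estimate + error)


estimate, error, (lower, upper) = mean_confidence_interval(sample)
print(f"optimal estimate: {estimate:.2f} kg")
print(f"estimation error: {error:.2f} kg")
print(f"95% confidence interval: [{lower:.2f}, {upper:.2f}] kg")
```

Raising the confidence level, for example `mean_confidence_interval(sample, 0.99)`, keeps the same optimal estimate but widens the interval.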
Hypothesis testing
The second part of inferential statistics is hypothesis testing (also called hypothesis contrast), that is, determining in probabilistic terms whether a statement about the population is true or not. The most frequent types of tests are:
- Comparison of samples. Ex: our hypothesis may be that tall people have a lower body mass index than short people.
- Association between variables. Ex: our hypothesis may be that the body mass index and height are two related variables.
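Both kinds of test can be sketched with the standard library alone. The comparison of samples below uses a permutation test on the difference in means, which is only one of several tests that could be applied here, and the association is summarized with Pearson's correlation coefficient. All data are simulated, and the group means, slopes, and function names are hypothetical choices made for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is reproducible


def permutation_p_value(group_a, group_b, repetitions=5000):
    """Two-sided permutation test for a difference in means."""
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(repetitions):
        random.shuffle(pooled)  # reassign group labels at random
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero
    return (extreme + 1) / (repetitions + 1)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Simulated body mass index values for two height groups (illustrative only)
tall = [random.gauss(23.5, 3) for _ in range(60)]
short = [random.gauss(25.0, 3) for _ in range(60)]
print(f"comparison of samples: p = {permutation_p_value(tall, short):.4f}")

# Simulated heights (cm) and body mass index with a built-in negative relationship
heights = [random.gauss(170, 10) for _ in range(120)]
bmis = [40 - 0.09 * h + random.gauss(0, 3) for h in heights]
print(f"association between variables: r = {pearson_r(heights, bmis):.2f}")
```

A small p-value means that a difference as large as the observed one would be rare if group membership made no difference. The sign and size of `r` describe the direction and strength of the linear association.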
Thus, the need for inferential statistics in the field of psychology seems obvious (in the examples, we can replace body mass with intelligence, memory, attention …). When making inferences, we estimate what the characteristics of a population will be in general.