A common way to measure the effect size is Cohen's d, which is defined as follows:

$$\delta = \frac{\mu_2 - \mu_1}{\sqrt{\frac{\sigma_2^2 + \sigma_1^2}{2}}}$$
According to this expression, the effect size is the difference of the means divided by the pooled standard deviation of both groups. Because we can get a posterior distribution of means and standard deviations, we can compute a posterior distribution of Cohen's d values. Of course, if we need or want a single value, we can compute the mean of that posterior distribution and report it as the Cohen's d. Generally, when computing a pooled standard deviation, we take the sample size of each group into account explicitly, but the previous formula omits the sample sizes. The reason is that we are taking the values of the standard deviations from the posterior, and thus we are already accounting for the uncertainty in the standard deviations.
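As a minimal sketch of this idea, the following snippet computes a posterior distribution of Cohen's d, taken here as the difference of the means divided by the square root of the average of the two variances, with no sample sizes involved. The arrays `mu_1`, `mu_2`, `sigma_1`, and `sigma_2` are hypothetical stand-ins generated with NumPy; in a real analysis they would be the posterior draws returned by your sampler. The helper name `cohen_d` and the sign convention (second group minus first) are also assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws (assumption): in practice, take these
# from the trace of the fitted model instead of simulating them.
n_draws = 4000
mu_1 = rng.normal(10.0, 0.3, n_draws)
mu_2 = rng.normal(11.0, 0.3, n_draws)
sigma_1 = np.abs(rng.normal(2.0, 0.2, n_draws))
sigma_2 = np.abs(rng.normal(2.5, 0.2, n_draws))


def cohen_d(mu_a, mu_b, sigma_a, sigma_b):
    """Cohen's d per posterior draw: difference of means over pooled SD.

    The pooled SD averages the two variances without sample-size weights.
    """
    pooled_sd = np.sqrt((sigma_a**2 + sigma_b**2) / 2)
    return (mu_b - mu_a) / pooled_sd


# One Cohen's d value per draw gives a posterior distribution of d
d_posterior = cohen_d(mu_1, mu_2, sigma_1, sigma_2)

# Single-number summary plus an interval covering the central 94% of draws
d_mean = d_posterior.mean()
d_interval = np.percentile(d_posterior, [3, 97])

print(f"mean d = {d_mean:.2f}")
print(f"94% interval = [{d_interval[0]:.2f}, {d_interval[1]:.2f}]")
```

Because the function is applied draw by draw, the uncertainty in the means and standard deviations propagates directly into the distribution of d; the mean is only one possible summary of it.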
Cohen's d is a way to measure the effect size, where the difference of the means is standardized by the pooled standard deviation of both groups.
Cohen's d introduces the variability of each group by using their standard deviations. This is really important, as a difference of 1 when the standard deviation is 0.1 seems large compared to the same difference when the standard deviation is 10. Also, a change of x units from one group to another could be explained by every individual data point changing exactly x units, or by half of them not changing and the other half changing 2x units, and by many other combinations. Thus, including the intrinsic variation of the groups is a way to put the differences in context. Re-scaling (standardizing) the differences helps us make sense of the importance of the difference between groups, even when we are not very familiar with the scale used for the measurements.
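Both points can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original analysis: `standardized_difference` is a hypothetical helper that divides a difference of means by the pooled standard deviation (the average of the two variances, square-rooted), and the data are simulated. It first compares a difference of 1 at standard deviations of 0.1 and 10, then shows two ways of shifting a group by x units on average that yield different spreads.

```python
import numpy as np


def standardized_difference(mean_diff, sd_a, sd_b):
    """Difference of means divided by the pooled standard deviation."""
    return mean_diff / np.sqrt((sd_a**2 + sd_b**2) / 2)


# Same raw difference of 1, very different standardized sizes
d_small_sd = standardized_difference(1.0, 0.1, 0.1)    # approximately 10
d_large_sd = standardized_difference(1.0, 10.0, 10.0)  # approximately 0.1
print(d_small_sd, d_large_sd)

# Two ways to obtain the same average change of x units
rng = np.random.default_rng(1)
x = 1.0
base = rng.normal(0.0, 1.0, 1000)  # simulated baseline group (assumption)

# Every data point changes by exactly x units
shifted_all = base + x

# Half of the points do not change, the other half change by 2x units
shift = np.where(np.arange(base.size) % 2 == 0, 2 * x, 0.0)
shifted_half = base + shift

# Both mean differences are x (up to floating-point error) ...
print(shifted_all.mean() - base.mean(), shifted_half.mean() - base.mean())
# ... but the within-group spread differs between the two scenarios
print(shifted_all.std(), shifted_half.std())
```

The raw difference of means cannot distinguish the two scenarios, whereas the standard deviations can, which is why the standardized difference carries more context than the raw one.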
Even when the differences of means are standardized, we may still need to calibrate ourselves based on the context of a given problem to be able to say whether a given value is big, small, medium, and so on. Fortunately, this calibration can be acquired with enough practice. For example, if we routinely perform analyses of more or less the same type of problem, we can get used to a Cohen's d of, say, 1, so when we get a Cohen's d of, say, 2, we know we have something important (or that someone made a mistake somewhere!). If you do not have this practice yet, you can ask a domain expert for their valuable input. A very nice web page to explore what different values of Cohen's d look like is http://rpsychologist.com/d3/cohend. On that page, you will also find other ways to express an effect size; some of them could be more intuitive, such as the probability of superiority, which we will discuss next.