7.1 The Rules of Thumb for Sample Size Needed for SEM

Although determining an appropriate sample size is a critical issue in SEM, there is no consensus in the literature on what that sample size should be. Some evidence exists that simple SEM models can be meaningfully tested even when the sample size is quite small (Hoyle, 1999; Hoyle and Kenny, 1999; Marsh and Hau, 1999), but N = 100–150 is usually considered the minimum sample size for conducting SEM (Tinsley and Tinsley, 1987; Anderson and Gerbing, 1988; Ding, Velicer, and Harlow, 1995; Tabachnick and Fidell, 2001). Some researchers recommend an even larger sample size, for example, N = 200 (Hoogland and Boomsma, 1998; Boomsma and Hoogland, 2001; Kline, 2005). Simulation studies show that with normally distributed indicator variables and no missing data, a reasonable sample size for a simple CFA model is about N = 150 (Muthén and Muthén, 2002). For multi-group modeling, the rule of thumb is 100 cases/observations per group (Kline, 2005).

Sample size is often considered in light of the number of observed variables. For normally distributed data, Bentler and Chou (1987) suggest that a ratio as low as 5 cases per variable is sufficient when latent variables have multiple indicators. A widely accepted rule of thumb sets the lower bound of an adequate sample size at 10 cases/observations per indicator variable (Nunnally, 1967).

Very often, sample size is determined by the ratio of the number of cases/observations (N) to the number of free parameters (q) being estimated in a model, denoted N:q. A higher N:q ratio is preferred. A rule of thumb is at least 5 cases/observations per free parameter (i.e., N:q ≥ 5) (Bentler and Chou, 1987; Bentler, 1995). With strongly kurtotic data, the minimum sample size should be at least 10 times the number of free parameters (i.e., N:q ≥ 10) (Hoogland and Boomsma, 1998). Kline (1998) suggests that the N:q ratio should be 10 or even 20.
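To make the N:q rule concrete, the following minimal Python sketch counts the free parameters of a simple CFA model and converts each ratio into a minimum N. The function names and the counting assumptions are ours, not part of the cited rules: each indicator loads on exactly one factor, one loading per factor is fixed to 1 to set the scale, error terms are uncorrelated, all factors covary, and no mean structure is estimated. A model that departs from these assumptions has a different q.

```python
import math


def cfa_free_parameters(n_indicators, n_factors):
    """Count free parameters q in a simple-structure CFA model.

    Illustrative assumptions (not from the cited sources): one loading
    per factor is fixed to 1, errors are uncorrelated, all factors
    covary, and only the covariance structure is modeled.
    """
    loadings = n_indicators - n_factors          # one fixed loading per factor
    error_variances = n_indicators
    factor_variances = n_factors
    factor_covariances = n_factors * (n_factors - 1) // 2
    return loadings + error_variances + factor_variances + factor_covariances


def min_n_for_ratio(q, ratio):
    """Smallest N that satisfies N:q >= ratio."""
    return math.ceil(q * ratio)


# Example: 3 factors with 3 indicators each gives q = 6 + 9 + 3 + 3 = 21.
q = cfa_free_parameters(n_indicators=9, n_factors=3)
for ratio in (5, 10, 20):
    print(f"N:q >= {ratio:2d} -> N >= {min_n_for_ratio(q, ratio)}")
```

For this hypothetical model the three ratios give N ≥ 105, N ≥ 210, and N ≥ 420, which shows how far apart the rules of thumb can be for the same model.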

Sample size also depends on the number of indicator variables per latent variable/factor. According to some researchers (Marsh et al., 1998; Marsh and Hau, 1999), more observed indicators per factor may compensate for a small sample size, and a larger sample size may compensate for few indicators per factor. A sample size of N = 50 is considered sufficient for a CFA model with 6–12 indicator variables per factor, while the sample size should be at least N = 100 for a model with 3–4 indicators per factor (Boomsma, 1985; Marsh and Hau, 1999). If there are only 2 indicators per factor in a CFA model, the sample size should be at least N = 400 (Marsh and Hau, 1999; Boomsma and Hoogland, 2001). In our experience, however, with a large number of indicators/items per factor it is often difficult to validate the factorial structure of a scale in real research, because many error terms are likely to be correlated with each other for a variety of reasons. Usually a CFA model with many indicators per factor does not fit the data well unless some error variances or cross-factor loadings are specified in the model.
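The indicator-based guidance above can be written as a small lookup. This is only a sketch: the function name is hypothetical, and it returns None for indicator counts the cited guidance does not cover (for example, 5 or more than 12 indicators per factor) rather than interpolating.

```python
def min_n_by_indicators(indicators_per_factor):
    """Return the suggested minimum N for a CFA model, or None.

    Encodes only the thresholds quoted in the text; None means the
    quoted rules say nothing about that number of indicators.
    """
    if indicators_per_factor == 2:
        return 400
    if 3 <= indicators_per_factor <= 4:
        return 100
    if 6 <= indicators_per_factor <= 12:
        return 50
    return None  # not covered by the rules of thumb quoted above


for k in (2, 3, 5, 8):
    print(k, "indicators per factor ->", min_n_by_indicators(k))
```

Returning None instead of a guessed value keeps the sketch faithful to the rules as stated; any value for the uncovered cases would be an assumption of the analyst.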

Determining the sample size needed for SEM is complicated. There is no absolute standard for an adequate sample size and no rule of thumb that applies to all situations in SEM (Muthén and Muthén, 2002). In addition to the number of free parameters to be estimated and the number of indicators per latent variable, the sample size needed for SEM also depends on many other factors related to data characteristics and the model being tested, such as reliability of the observed indicators (Gerbing and Anderson, 1985; Velicer and Fava, 1998), study design (e.g., cross-sectional vs. longitudinal; Muthén and Muthén, 2002), degree of multivariate normality of the data (West, Finch, and Curran, 1995; Anderson, 1996), handling of missing data (Brown, 1994), model complexity (Kline, 1998), and the model estimator (e.g., ML, MLR, WLSMV) (Fan, Thompson, and Wang, 1999). One should be cautious about simply trusting the rules of thumb given in the literature. Instead, model-based approaches, such as Satorra and Saris's method (1985) and Monte Carlo simulation (Muthén and Muthén, 2002), as well as methods based on model fit indices [e.g., MacCallum, Browne, and Sugawara's method (1996) and Kim's method (2005)], have been increasingly used to conduct power analysis and estimate sample size for specific SEM models. In these approaches, either statistical power is estimated given a sample size and significance level (e.g., α = 0.05), or the sample size needed to reach a certain power (e.g., 0.80) is estimated. In this chapter we demonstrate how to apply Satorra and Saris's method, Monte Carlo simulation, MacCallum, Browne, and Sugawara's method, and Kim's method to conduct power analysis and estimate sample size for specific SEM models.
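As a preview of the fit-index-based idea, the sketch below approximates power for an RMSEA-based test of close fit in the spirit of MacCallum, Browne, and Sugawara's method. It rests on assumptions not stated in the text above: the test statistic is treated as noncentral chi-square with noncentrality (N − 1) × df × RMSEA², the null and alternative RMSEA values default to 0.05 and 0.08, and the noncentral chi-square distributions are approximated by random draws so that only the Python standard library is needed. A statistics library would evaluate these distributions directly. All function names are hypothetical, and this is not the Monte Carlo approach of Muthén and Muthén (2002), which simulates data from the model itself.

```python
import math
import random


def noncentral_chi2_draws(df, ncp, n_draws, rng):
    """Draw from a noncentral chi-square(df, ncp) distribution.

    Uses the decomposition (Z + sqrt(ncp))**2 + chi-square(df - 1),
    where chi-square(k) is a Gamma(shape=k/2, scale=2) variable.
    """
    shift = math.sqrt(ncp)
    draws = []
    for _ in range(n_draws):
        x = (rng.gauss(0.0, 1.0) + shift) ** 2
        if df > 1:
            x += rng.gammavariate((df - 1) / 2.0, 2.0)
        draws.append(x)
    return draws


def rmsea_power(n, df, rmsea0=0.05, rmsea_a=0.08, alpha=0.05,
                n_draws=20000, seed=1):
    """Approximate power to reject H0: RMSEA <= rmsea0 when RMSEA = rmsea_a."""
    rng = random.Random(seed)
    ncp0 = (n - 1) * df * rmsea0 ** 2     # noncentrality under H0
    ncp_a = (n - 1) * df * rmsea_a ** 2   # noncentrality under the alternative
    null = sorted(noncentral_chi2_draws(df, ncp0, n_draws, rng))
    # Empirical (1 - alpha) quantile of the null distribution.
    crit = null[min(n_draws, math.ceil((1 - alpha) * n_draws)) - 1]
    alt = noncentral_chi2_draws(df, ncp_a, n_draws, rng)
    return sum(x > crit for x in alt) / n_draws


def smallest_n_for_power(df, target=0.80, start=100, step=10, max_n=2000,
                         **kwargs):
    """Search a grid of N values for the first one reaching the target power."""
    for n in range(start, max_n + 1, step):
        if rmsea_power(n, df, **kwargs) >= target:
            return n
    return None  # target power not reached within the grid


# Example with a hypothetical model that has 50 degrees of freedom.
print("approximate power at N = 200:", rmsea_power(200, df=50))
print("approximate N for power 0.80:", smallest_n_for_power(df=50))
```

Because the quantile and the power are both estimated from random draws, the results vary slightly with the seed and the number of draws, and the grid search is accurate only to the chosen step size.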
