
Full Structural Models

  • A fully latent or full structural model has two parts:

    • Measurement model: the latent variables and their measured indicators (the CFA)
    • Structural model: the relationships between the measurement model and other variables
  • The structural part may be a second-order latent variable, like the one we built last week

  • The structural model could include other predictors

  • Or you could simply convert a correlation between latent variables into a directed prediction as part of your structural model (see the sketch below)
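
  • A minimal lavaan-style sketch of that last point, using hypothetical indicator names (a1–a3, d1–d3): the only change from the measurement model is swapping the covariance (~~) for a directed regression (~).

# Two latent variables that simply covary (measurement model only)
cor.model <- '
anxiety    =~ a1 + a2 + a3
depression =~ d1 + d2 + d3
anxiety ~~ depression      # undirected covariance
'

# Same latents, now with a directed structural path
pred.model <- '
anxiety    =~ a1 + a2 + a3
depression =~ d1 + d2 + d3
depression ~ anxiety       # anxiety predicts depression
'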

Example Model

Full Structural Models

  • A reminder from earlier this semester:

    • Reflective indicators: we assume the latent variable causes its indicators, so the latent is exogenous
    • Formative indicators: we assume the latent variable is the criterion (caused by its indicators), so the latent is endogenous

Full Structural Models

  • Examples of formative indicators:

    • Income, education level, and occupation all predict your SES
    • Stress caused by outside factors
  • Other names people use for these:

    • Composite causes
    • MIMIC – multiple indicators, multiple causes models (see the sketch below)
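
  • A rough MIMIC-style sketch in lavaan syntax, with hypothetical outcome indicators (ses1–ses3): SES is regressed on its causes while also having reflective indicators.

# Hypothetical MIMIC sketch: SES has multiple indicators and multiple causes
mimic.model <- '
SES =~ ses1 + ses2 + ses3                 # multiple indicators (reflective side)
SES  ~ income + education + occupation    # multiple causes (formative side)
'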

Example Model

Identification

  • Identification rules of thumb:

    • Latent variables should have at least four indicators; OR
    • Latent variables have three indicators AND the error variances do not covary; OR
    • Latent variables have two indicators AND the error variances do not covary AND the loadings are set equal to each other (see the sketch below)
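
  • A quick sketch of the last rule in lavaan syntax, using hypothetical items y1 and y2: giving both loadings the same label forces them to be equal.

# Two-indicator latent identified by equality-constrained loadings
two.indicator.model <- '
factor =~ a*y1 + a*y2    # the shared label "a" constrains the loadings to be equal
'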

Structural Model Identification

  • Scaling is also required to identify the structural part

  • 2+ emitted paths rule

    • The composite variable must have direct effects on at least two other endogenous variables (see the sketch below)
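
  • A bare-bones sketch of this rule in lavaan syntax, with hypothetical observed variables; the <~ operator for composites is covered in Example 2 at the end of this lecture.

# Hypothetical sketch: the composite is formed by its observed causes ( <~ )
# and then emits direct paths to two endogenous latent variables
emitted.sketch <- '
risk     <~ x1 + x2 + x3
outcome1 =~ y1 + y2 + y3
outcome2 =~ z1 + z2 + z3
risk =~ outcome1 + outcome2    # two emitted paths from the composite
'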

Things to Consider

  • Parceling

    • Large structural models can be very difficult to fit as a full SEM if each latent variable has many indicators (items).
    • Parceling means creating subsets of items (parcels) to use as indicators, both to get the model to converge and to balance the number of indicators on each latent (see the sketch below).
    • This practice is still fairly controversial.
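
  • A rough sketch of what parceling can look like, assuming a hypothetical raw data frame with nine items q1–q9 (this is not part of the example data used below):

# Hypothetical raw data: 100 people answering nine items q1-q9
raw <- as.data.frame(matrix(rnorm(100 * 9), ncol = 9,
                            dimnames = list(NULL, paste0("q", 1:9))))

# Average subsets of items into three parcels, then use the parcels
# (instead of the nine separate items) as indicators of the latent variable
raw$parcel1 <- rowMeans(raw[, c("q1", "q4", "q7")])
raw$parcel2 <- rowMeans(raw[, c("q2", "q5", "q8")])
raw$parcel3 <- rowMeans(raw[, c("q3", "q6", "q9")])

parcel.model <- 'construct =~ parcel1 + parcel2 + parcel3'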

How to Model

  • Test each CFA piece separately to make sure it runs.

    • CFAs that fit badly do not suddenly become good models when the structural component is added!
  • Slowly add structural paths to see if you can get the full model to work.

    • If not, try parceling.
    • Drop non-significant paths.

How to Model

  • As you add the structural components, you should not see a big change in the loadings on the indicators

    • If you do, it means the measurement model is not invariant
    • That creates interpretation difficulties (we check this numerically after fitting the full SEM below)

When to Stop?

  • We have discussed using modificationindices() and other tricks to improve model fit

  • In theory, we could add all paths until the model is “perfect”

  • What should the stopping rule be?

    • Based on theory
    • Fit indices do not greatly improve
    • Parsimony

Example Model

Example Model: Setup

library(lavaan)
library(semPlot)

# build the full correlation matrix from its lower triangle
family.cor <- lav_matrix_lower2full(c(1.00, 
                                      .74,  1.00,   
                                      .27,  .42,    1.00,   
                                      .31,  .40,    .79,    1.00,   
                                      .32,  .35,    .66,    .59,    1.00))
family.sd <- c(32.94, 22.75, 13.39, 13.68, 14.38)

# apply variable names to the matrix and the SD vector
rownames(family.cor) <- 
  colnames(family.cor) <-
  names(family.sd) <- c("father", "mother", "famo", "problems", "intimacy")

# convert the correlation matrix to a covariance matrix using the SDs
family.cov <- cor2cov(family.cor, family.sd)
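
As a quick check on what cor2cov() is doing: each covariance is just the correlation rescaled by the two standard deviations, cov(x, y) = cor(x, y) * sd(x) * sd(y).

# one element of the covariance matrix, computed two ways
family.cov["father", "mother"]    # from cor2cov()
.74 * 32.94 * 22.75               # correlation times the two standard deviations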

Example Model: Build the CFA

  • First, we are going to test the measurement model – just the CFAs with a covariance between the latent variables.
  • Then, we are going to change it to a full SEM, specifying the direction of the relationship between the latents.
  • We should ensure the measurement model does not change significantly when we do.

Example Model: Build the CFA

family.model <- '
adjust =~ problems + intimacy
family =~ father + mother + famo'

Example Model: Analyze the CFA

family.fit <- cfa(model = family.model,
                  sample.cov = family.cov,
                  sample.nobs = 203)
#> Warning: lavaan->lav_object_post_check():  
#>    covariance matrix of latent variables is not positive definite ; use 
#>    lavInspect(fit, "cov.lv") to investigate.

Example Model: Deal with Error

inspect(family.fit, "cov.lv")
#>         adjust  family
#> adjust 129.870        
#> family 152.798 160.332
inspect(family.fit, "cor.lv")
#>        adjust family
#> adjust  1.000       
#> family  1.059  1.000

Example Model: Analyze the CFA

# refit the same measurement model on the correlation matrix
family.fit <- cfa(model = family.model,
                  sample.cov = family.cor,
                  sample.nobs = 203)

Example Model: Summarize the Model

summary(family.fit, 
        rsquare = TRUE, 
        standardized = TRUE,
        fit.measures = TRUE)
#> lavaan 0.6-19 ended normally after 23 iterations
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                        11
#> 
#>   Number of observations                           203
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                               197.939
#>   Degrees of freedom                                 4
#>   P-value (Chi-square)                           0.000
#> 
#> Model Test Baseline Model:
#> 
#>   Test statistic                               533.051
#>   Degrees of freedom                                10
#>   P-value                                        0.000
#> 
#> User Model versus Baseline Model:
#> 
#>   Comparative Fit Index (CFI)                    0.629
#>   Tucker-Lewis Index (TLI)                       0.073
#> 
#> Loglikelihood and Information Criteria:
#> 
#>   Loglikelihood user model (H0)              -1270.160
#>   Loglikelihood unrestricted model (H1)      -1171.191
#>                                                       
#>   Akaike (AIC)                                2562.321
#>   Bayesian (BIC)                              2598.766
#>   Sample-size adjusted Bayesian (SABIC)       2563.915
#> 
#> Root Mean Square Error of Approximation:
#> 
#>   RMSEA                                          0.489
#>   90 Percent confidence interval - lower         0.432
#>   90 Percent confidence interval - upper         0.548
#>   P-value H_0: RMSEA <= 0.050                    0.000
#>   P-value H_0: RMSEA >= 0.080                    1.000
#> 
#> Standardized Root Mean Square Residual:
#> 
#>   SRMR                                           0.186
#> 
#> Parameter Estimates:
#> 
#>   Standard errors                             Standard
#>   Information                                 Expected
#>   Information saturated (h1) model          Structured
#> 
#> Latent Variables:
#>                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
#>   adjust =~                                                             
#>     problems          1.000                               0.807    0.809
#>     intimacy          0.901    0.134    6.741    0.000    0.727    0.729
#>   family =~                                                             
#>     father            1.000                               0.789    0.790
#>     mother            1.143    0.112   10.165    0.000    0.901    0.903
#>     famo              0.630    0.092    6.847    0.000    0.497    0.498
#> 
#> Covariances:
#>                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
#>   adjust ~~                                                             
#>     family            0.384    0.071    5.441    0.000    0.604    0.604
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
#>    .problems          0.344    0.092    3.720    0.000    0.344    0.345
#>    .intimacy          0.466    0.084    5.571    0.000    0.466    0.468
#>    .father            0.373    0.062    6.035    0.000    0.373    0.375
#>    .mother            0.183    0.066    2.758    0.006    0.183    0.184
#>    .famo              0.748    0.079    9.530    0.000    0.748    0.752
#>     adjust            0.651    0.126    5.157    0.000    1.000    1.000
#>     family            0.622    0.104    5.974    0.000    1.000    1.000
#> 
#> R-Square:
#>                    Estimate
#>     problems          0.655
#>     intimacy          0.532
#>     father            0.625
#>     mother            0.816
#>     famo              0.248

Example Model: Improve the Model?

modificationindices(family.fit, sort = T)
#>         lhs op    rhs      mi    epc sepc.lv sepc.all sepc.nox
#> 16   adjust =~   famo 136.963  1.467   1.184    1.187    1.187
#> 26   father ~~ mother 136.963  1.785   1.785    6.825    6.825
#> 22 problems ~~   famo  58.596  0.371   0.371    0.732    0.732
#> 27   father ~~   famo  25.310 -0.283  -0.283   -0.536   -0.536
#> 15   adjust =~ mother  25.310 -0.767  -0.619   -0.621   -0.621
#> 25 intimacy ~~   famo  13.685  0.183   0.183    0.310    0.310
#> 14   adjust =~ father   7.800 -0.370  -0.299   -0.299   -0.299
#> 28   mother ~~   famo   7.800 -0.179  -0.179   -0.482   -0.482
#> 20 problems ~~ father   6.001 -0.103  -0.103   -0.287   -0.287
#> 24 intimacy ~~ mother   4.674 -0.096  -0.096   -0.329   -0.329
#> 21 problems ~~ mother   2.723 -0.076  -0.076   -0.302   -0.302
#> 23 intimacy ~~ father   0.117  0.014   0.014    0.034    0.034

Example Model: Improve the Model?

family.model2 <- '
adjust =~ problems + intimacy
family =~ father + mother + famo
father ~~ mother   # residual covariance suggested by the modification indices'

family.fit2 <- cfa(model = family.model2,
                   sample.cov = family.cov,
                   sample.nobs = 203)
#> Warning: lavaan->lav_object_post_check():  
#>    covariance matrix of latent variables is not positive definite ; use 
#>    lavInspect(fit, "cov.lv") to investigate.

inspect(family.fit2, "cor.lv")
#>        adjust family
#> adjust  1.000       
#> family  1.038  1.000

Example Model: Diagram the Model

semPaths(family.fit, 
         whatLabels="std", 
         layout="tree", 
         edge.label.cex = 1)

Example Model: Build Full SEM

predict.model <- '
adjust =~ problems + intimacy
family =~ father + mother + famo
adjust ~ family   # structural path: family predicts adjust'

Example Model: Analyze Full SEM

predict.fit <- sem(model = predict.model,
                   sample.cov = family.cor,
                   sample.nobs = 203)

Example Model: Summarize Full SEM

summary(predict.fit, 
        rsquare = TRUE, 
        standardized = TRUE,
        fit.measures = TRUE)
#> lavaan 0.6-19 ended normally after 20 iterations
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                        11
#> 
#>   Number of observations                           203
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                               197.939
#>   Degrees of freedom                                 4
#>   P-value (Chi-square)                           0.000
#> 
#> Model Test Baseline Model:
#> 
#>   Test statistic                               533.051
#>   Degrees of freedom                                10
#>   P-value                                        0.000
#> 
#> User Model versus Baseline Model:
#> 
#>   Comparative Fit Index (CFI)                    0.629
#>   Tucker-Lewis Index (TLI)                       0.073
#> 
#> Loglikelihood and Information Criteria:
#> 
#>   Loglikelihood user model (H0)              -1270.160
#>   Loglikelihood unrestricted model (H1)      -1171.191
#>                                                       
#>   Akaike (AIC)                                2562.321
#>   Bayesian (BIC)                              2598.766
#>   Sample-size adjusted Bayesian (SABIC)       2563.915
#> 
#> Root Mean Square Error of Approximation:
#> 
#>   RMSEA                                          0.489
#>   90 Percent confidence interval - lower         0.432
#>   90 Percent confidence interval - upper         0.548
#>   P-value H_0: RMSEA <= 0.050                    0.000
#>   P-value H_0: RMSEA >= 0.080                    1.000
#> 
#> Standardized Root Mean Square Residual:
#> 
#>   SRMR                                           0.186
#> 
#> Parameter Estimates:
#> 
#>   Standard errors                             Standard
#>   Information                                 Expected
#>   Information saturated (h1) model          Structured
#> 
#> Latent Variables:
#>                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
#>   adjust =~                                                             
#>     problems          1.000                               0.807    0.809
#>     intimacy          0.901    0.134    6.741    0.000    0.727    0.729
#>   family =~                                                             
#>     father            1.000                               0.789    0.790
#>     mother            1.143    0.112   10.165    0.000    0.901    0.903
#>     famo              0.630    0.092    6.847    0.000    0.497    0.498
#> 
#> Regressions:
#>                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
#>   adjust ~                                                              
#>     family            0.618    0.092    6.705    0.000    0.604    0.604
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
#>    .problems          0.344    0.092    3.720    0.000    0.344    0.345
#>    .intimacy          0.466    0.084    5.571    0.000    0.466    0.468
#>    .father            0.373    0.062    6.035    0.000    0.373    0.375
#>    .mother            0.183    0.066    2.758    0.006    0.183    0.184
#>    .famo              0.748    0.079    9.530    0.000    0.748    0.752
#>    .adjust            0.414    0.095    4.364    0.000    0.636    0.636
#>     family            0.622    0.104    5.974    0.000    1.000    1.000
#> 
#> R-Square:
#>                    Estimate
#>     problems          0.655
#>     intimacy          0.532
#>     father            0.625
#>     mother            0.816
#>     famo              0.248
#>     adjust            0.364

Example Model: Diagram Full SEM

semPaths(predict.fit, 
         whatLabels="std", 
         layout="tree", 
         edge.label.cex = 1)
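
As noted earlier, adding the structural path should leave the loadings essentially unchanged. One way to check this (not shown in the original output) is to line up the standardized loadings from the CFA fit and the full SEM fit:

# standardized loadings (=~ rows) from the measurement model and the full SEM;
# these should match closely if the measurement model is invariant
cfa.loadings <- subset(standardizedsolution(family.fit),  op == "=~")
sem.loadings <- subset(standardizedsolution(predict.fit), op == "=~")

cbind(cfa.loadings[, c("lhs", "rhs", "est.std")],
      sem.std = sem.loadings$est.std)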

Example 2: Composite Variables

Example 2: Setup

# build the full correlation matrix from its lower triangle
family.cor <- lav_matrix_lower2full(c(1.00, 
                                     .42,   1.00, 
                                    -.43,   -.50,   1.00, 
                                    -.39,   -.43,   .78,    1.00,   
                                    -.24,   -.37,   .69,    .73,    1.00, 
                                    -.31,   -.33,   .63,    .87,    .72,    1.00,   
                                    -.25,   -.25,   .49,    .53,    .60,    .59,    1.00, 
                                     -.25,  -.26,   .42,    .42,    .44,    .45,    .77,    1.00,   
                                     -.16,  -.18,   .23,    .36,    .38,    .38,    .59,    .58, 1.00))

family.sd <- c(13.00,   13.50,  13.10,  12.50,  13.50,  14.20,  9.50,   11.10,  8.70)

rownames(family.cor) <- 
  colnames(family.cor) <-
  names(family.sd) <- c("parent_psych","low_SES","verbal",
                        "reading","math","spelling","motivation","harmony","stable")

# convert the correlation matrix to a covariance matrix using the SDs
family.cov <- cor2cov(family.cor, family.sd)

Example 2: Build the Model

  • How to define a composite variable?
  • We have been using =~ to define a latent variable that predicts manifest variables.
  • Use <~ to create a composite variable that is predicted by the manifest variables.

Example 2: Build the Model

composite.model <- '
risk <~ low_SES + parent_psych + verbal      # composite formed by its causes
achieve =~ reading + math + spelling         # reflective latent
adjustment =~ motivation + harmony + stable  # reflective latent
risk =~ achieve + adjustment                 # two emitted paths from the composite
'

Example 2: Analyze the Model

composite.fit <- sem(model = composite.model, 
                      sample.cov = family.cov, 
                      sample.nobs = 158)

Example 2: Summarize the Model

summary(composite.fit, 
        rsquare = TRUE, 
        standardized = TRUE,
        fit.measures = TRUE)
#> lavaan 0.6-19 ended normally after 78 iterations
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                        16
#> 
#>   Number of observations                           158
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                                94.965
#>   Degrees of freedom                                23
#>   P-value (Chi-square)                           0.000
#> 
#> Model Test Baseline Model:
#> 
#>   Test statistic                               852.585
#>   Degrees of freedom                                33
#>   P-value                                        0.000
#> 
#> User Model versus Baseline Model:
#> 
#>   Comparative Fit Index (CFI)                    0.912
#>   Tucker-Lewis Index (TLI)                       0.874
#> 
#> Loglikelihood and Information Criteria:
#> 
#>   Loglikelihood user model (H0)              -3270.643
#>   Loglikelihood unrestricted model (H1)      -3223.160
#>                                                       
#>   Akaike (AIC)                                6573.286
#>   Bayesian (BIC)                              6622.287
#>   Sample-size adjusted Bayesian (SABIC)       6571.640
#> 
#> Root Mean Square Error of Approximation:
#> 
#>   RMSEA                                          0.141
#>   90 Percent confidence interval - lower         0.112
#>   90 Percent confidence interval - upper         0.171
#>   P-value H_0: RMSEA <= 0.050                    0.000
#>   P-value H_0: RMSEA >= 0.080                    1.000
#> 
#> Standardized Root Mean Square Residual:
#> 
#>   SRMR                                           0.089
#> 
#> Parameter Estimates:
#> 
#>   Standard errors                             Standard
#>   Information                                 Expected
#>   Information saturated (h1) model          Structured
#> 
#> Latent Variables:
#>                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
#>   achieve =~                                                            
#>     reading           1.000                              12.183    0.978
#>     math              0.843    0.062   13.545    0.000   10.265    0.763
#>     spelling          1.030    0.053   19.583    0.000   12.546    0.886
#>   adjustment =~                                                         
#>     motivation        1.000                               8.632    0.912
#>     harmony           1.089    0.092   11.861    0.000    9.402    0.850
#>     stable            0.654    0.074    8.828    0.000    5.642    0.651
#>   risk =~                                                               
#>     achieve           1.000                               0.794    0.794
#>     adjustment        0.457    0.074    6.206    0.000    0.511    0.511
#> 
#> Composites:
#>                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
#>   risk <~                                                               
#>     low_SES          -0.033    0.050   -0.647    0.518   -0.003   -0.045
#>     parent_psych     -0.055    0.050   -1.095    0.274   -0.006   -0.074
#>     verbal            0.698    0.055   12.576    0.000    0.072    0.942
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
#>    .reading           6.831    3.848    1.775    0.076    6.831    0.044
#>    .math             75.732    9.120    8.304    0.000   75.732    0.418
#>    .spelling         42.954    6.357    6.757    0.000   42.954    0.214
#>    .motivation       15.162    4.920    3.082    0.002   15.162    0.169
#>    .harmony          34.034    6.723    5.062    0.000   34.034    0.278
#>    .stable           43.376    5.381    8.060    0.000   43.376    0.577
#>     risk              0.000                               0.000    0.000
#>    .achieve          54.931    7.452    7.372    0.000    0.370    0.370
#>    .adjustment       55.023    8.469    6.497    0.000    0.738    0.738
#> 
#> R-Square:
#>                    Estimate
#>     reading           0.956
#>     math              0.582
#>     spelling          0.786
#>     motivation        0.831
#>     harmony           0.722
#>     stable            0.423
#>     achieve           0.630
#>     adjustment        0.262

Example 2: Improve the Model?

modificationindices(composite.fit, sort = T)
#>             lhs op          rhs     mi     epc sepc.lv sepc.all sepc.nox
#> 29         risk =~     spelling 22.091  -0.607  -5.874   -0.415   -0.415
#> 39      reading ~~         math 22.090 -27.301 -27.301   -1.200   -1.200
#> 54         risk ~~      achieve 15.658  42.911      NA       NA       NA
#> 56      achieve ~~   adjustment 15.658  19.593   0.356    0.356    0.356
#> 18         risk ~~         risk 15.658  42.911   0.000    0.000    0.000
#> 55         risk ~~   adjustment 15.658  19.593      NA       NA       NA
#> 37   adjustment =~         math 12.595   0.345   2.979    0.221    0.221
#> 33      achieve =~   motivation  8.857   0.140   1.710    0.181    0.181
#> 48     spelling ~~   motivation  8.351   8.974   8.974    0.352    0.352
#> 40      reading ~~     spelling  8.135  25.450  25.450    1.486    1.486
#> 28         risk =~         math  8.135   0.379   3.665    0.272    0.272
#> 45         math ~~   motivation  7.454  10.793  10.793    0.319    0.319
#> 38   adjustment =~     spelling  7.118   0.208   1.794    0.127    0.127
#> 44         math ~~     spelling  6.189  15.229  15.229    0.267    0.267
#> 27         risk =~      reading  6.189   0.320   3.090    0.248    0.248
#> 51   motivation ~~      harmony  4.044 -27.644 -27.644   -1.217   -1.217
#> 32         risk =~       stable  4.044  -0.138  -1.331   -0.153   -0.153
#> 41      reading ~~   motivation  3.494  -4.224  -4.224   -0.415   -0.415
#> 30         risk =~   motivation  2.929   0.122   1.183    0.125    0.125
#> 53      harmony ~~       stable  2.929  10.495  10.495    0.273    0.273
#> 43      reading ~~       stable  1.885   3.814   3.814    0.222    0.222
#> 36   adjustment =~      reading  1.618  -0.073  -0.634   -0.051   -0.051
#> 34      achieve =~      harmony  1.128  -0.059  -0.717   -0.065   -0.065
#> 46         math ~~      harmony  0.939  -4.694  -4.694   -0.092   -0.092
#> 31         risk =~      harmony  0.121  -0.028  -0.269   -0.024   -0.024
#> 52   motivation ~~       stable  0.121  -2.010  -2.010   -0.078   -0.078
#> 50     spelling ~~       stable  0.070   1.008   1.008    0.023    0.023
#> 49     spelling ~~      harmony  0.033  -0.687  -0.687   -0.018   -0.018
#> 42      reading ~~      harmony  0.021  -0.400  -0.400   -0.026   -0.026
#> 35      achieve =~       stable  0.017   0.007   0.082    0.009    0.009
#> 47         math ~~       stable  0.009   0.448   0.448    0.008    0.008
#> 58      low_SES  ~       verbal  0.000   0.000   0.000    0.000    0.000
#> 57      low_SES  ~ parent_psych  0.000   0.000   0.000    0.000    0.000
#> 60 parent_psych  ~       verbal  0.000   0.000   0.000    0.000    0.000
#> 59 parent_psych  ~      low_SES  0.000   0.000   0.000    0.000    0.000
#> 62       verbal  ~ parent_psych  0.000   0.000   0.000    0.000    0.000
#> 26       verbal ~~       verbal  0.000   0.000   0.000    0.000    0.000
#> 61       verbal  ~      low_SES  0.000   0.000   0.000    0.000    0.000

Example 2: Diagram the Model

semPaths(composite.fit, 
         whatLabels="std", 
         layout="tree",
         edge.label.cex = 1)

Summary

  • In this lecture you’ve learned:

    • How to build on our previous work with measurement models
    • How to build a composite variable and run those models
    • How to examine invariance of the measurement model when building a full SEM