Applied Analysis Note 3

Getting factor scores using a fixed set of parameter estimates.

Context: You want to use Mplus both to estimate parameters and to generate factor scores, but you want to apply the parameter estimates to a wider sample than the one used to estimate them. In the following example, we estimate item parameters in a baseline sample and then hold those estimates fixed in a second run that includes all (longitudinal) observations.
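
The two steps can be written out as follows. This is a sketch of the model rather than a statement taken from the Mplus output: it assumes the Mplus default logit link for categorical indicators under maximum likelihood estimation, and the symbols (\lambda_j for loadings, \tau_{jk} for thresholds, \alpha and \psi for the factor mean and variance) are notation introduced here for illustration.

```latex
% Step 1: baseline sample only; factor variance fixed at 1, factor mean at its default of 0
\operatorname{logit} P(y_{ij} \ge k \mid g_i) = -\tau_{jk} + \lambda_j g_i ,
\qquad g_i \sim N(0, 1)

% Step 2: all stacked observations; item parameters held at their step 1 estimates,
% factor mean and variance freely estimated
\operatorname{logit} P(y_{ij} \ge k \mid g_i) = -\hat{\tau}_{jk} + \hat{\lambda}_j g_i ,
\qquad g_i \sim N(\alpha, \psi)
```

Here i indexes rows of the data set (one row per person and time point), j indexes items, and k runs over the category boundaries of item j.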

Note: In a longitudinal study it may be easier to estimate a single measurement model on all observations (all time points) in a vertically stacked data set and then standardize the factor score to some meaningful metric. But in case you really want to do it the hard way:

Example Stata and Stata/Runmplus code
---------------------------------------
      capture log close
      log using c:\trash\fragment.log , text replace
      * Note: newid is a unique identifier that
      * combines projid + time.
      * Modify the model-building loop below if you need
      * to carry over other parameters.
      use c:\trash\ros.dta, clear
      _return drop _all
      * ESTIMATE THE BASELINE MEASUREMENT MODEL
      runmplus mmy* newid if obs==1, idvar(newid) cat(all) ///
          model(g by mmy1-mmy28*; g@1;) est(mlr) log(off)
      _return hold estimate
      _return restore estimate, hold
      * RUNMPLUS STORES PARAMETER ESTIMATES IN A 
      * MATRIX r(estimate).  HERE WE EXTRACT THOSE 
      * PARAMETER ESTIMATES AND BUILD A NEW Mplus 
      * MODEL STATEMENT
      mat b=r(estimate)
      * START FROM AN EMPTY MODEL STATEMENT
      local model ""
      foreach y of varlist mmy* {
         mat l=b["g_by_`y'",1]
         local l=l[1,1]
         local model "`model' g by `y'@`l'; "
         local j=0
         levelsof `y' , clean 
         foreach t in `r(levels)' {
            if `j'>0 {
               mat t=b["thresholds_`y'$`j'",1]
               local t=t[1,1]
               local model "`model' [`y'$`j' @ `t'];"
            }
            local j=`j'+1
         }
      }
      * Note that we freely estimate the mean and 
      * variance of the factor. It is neither reasonable nor
      * interesting to assume that the mean and variance 
      * are constant over time.
      local model "`model' g*; [g*];"
      di "`model'"
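      * THE STATEMENT JUST DISPLAYED SHOULD HAVE THIS GENERAL SHAPE
      * (<...> ARE PLACEHOLDERS, NOT ACTUAL OUTPUT):
      *   g by mmy1@<loading>; [mmy1$1 @ <threshold>]; ... g*; [g*];
      * THAT IS, ONE "g by ITEM@VALUE;" TERM PER ITEM AND ONE
      * THRESHOLD TERM PER CATEGORY BOUNDARY OF THAT ITEM,
      * FOLLOWED BY THE FREE FACTOR VARIANCE AND MEAN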
      * NOW WE RUN THE MODEL WITH MOSTLY FIXED PARAMETERS 
      * USING RUNMPLUS WITH THE SAVEDATA/FSCORES OPTION 
      runmplus mmy* newid , idvar(newid) cat(all) model(`model') log(off) est(mlr) ///
         savelog(c:\trash\trash) savedata(file is c:\trash\trash.dat; save is fscores;) 
      * NOW WE GO THROUGH THE MACHINATIONS NECESSARY TO BRING
      * THE FACTOR SCORE ESTIMATES BACK INTO THE WORKING STATA FILE
      preserve
      runmplus_load_savedata , out(c:/trash/trash.out) clear
      keep newid g
      sort newid
      tempfile f1
      save `f1'
      restore
      merge newid using `f1' , sort
      table _merge
      drop _merge
      su g
      * MEAN AND SD OF THE FACTOR SCORE AT EACH OBSERVATION POINT
      tabstat g, by(obs) statistics(mean sd) format(%8.3f)
      * NOW WE CAN USE THE FACTOR SCORE ESTIMATES
      * IN ANOTHER MODEL, LIKE A LATENT GROWTH CURVE MODEL.
      * I SELECT ONLY OBSERVATIONS UP TO AND INCLUDING 10
      * BECAUSE DATA ARE MISSING AT THE FAR OBSERVATION
      * POINTS. A RANDOM EFFECTS MODEL (TYPE=RANDOM WITH TSCORES)
      * WOULD BE BETTER FOR THESE DATA.
      keep if obs<=10
      keep projid obs g
      reshape wide g , i(projid) j(obs)
      local model="i s | "
      foreach i of numlist 1/10 {
         local j=`i'-1
         local model = "`model' g`i'@`j'"
      }
      local model "`model'; retest by g2-g10@1; retest@0; [retest*]; i s with retest@0; "
      runmplus g* , model(`model') coverage(.02)
      
      log close
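
The simpler alternative mentioned in the note at the top (one measurement model fit to the vertically stacked data, then a rescaled factor score) might look something like the sketch below. It reuses only the runmplus options shown in the example above. The file names under c:\trash, the tempfile name f2, and the new variable gT are made up for illustration, and the T-score metric (mean 50, SD 10 at baseline) is just one possible choice of "meaningful metric".

```stata
* A SKETCH OF THE STACKED-DATA ALTERNATIVE (FILE NAMES ARE HYPOTHETICAL)
use c:\trash\ros.dta, clear
_return drop _all
* ONE MEASUREMENT MODEL FOR ALL ROWS (ALL TIME POINTS), FACTOR SCORES SAVED
runmplus mmy* newid , idvar(newid) cat(all) ///
    model(g by mmy1-mmy28*; g@1;) est(mlr) log(off) ///
    savelog(c:\trash\stacked) savedata(file is c:\trash\stacked.dat; save is fscores;)
* BRING THE FACTOR SCORES BACK INTO THE WORKING FILE, AS IN THE MAIN EXAMPLE
preserve
runmplus_load_savedata , out(c:/trash/stacked.out) clear
keep newid g
sort newid
tempfile f2
save `f2'
restore
merge newid using `f2' , sort
drop _merge
* RESCALE SO THAT THE BASELINE OBSERVATIONS (obs==1) HAVE MEAN 50 AND SD 10
su g if obs==1
gen gT = 50 + 10*(g - r(mean))/r(sd)
```

Keep in mind that this fit treats repeated rows from the same person as independent observations, so it is best viewed as a scoring device; the standard errors from the stacked run should be read with that in mind.
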
      
      
Tags: factor analysis, factor scores, item response theory, runmplus, savedata, fscores, runmplus_load_savedata