Computing model-implied or expected scores in a growth modeling context
When estimating a growth curve model (e.g., i s | y1@0 y2@1 y3@2 y4@3; i s on x;), one may use the Mplus SAVEDATA command with the SAVE=FSCORES option to save individual-level estimates (factor scores) of the level (i) and slope (s) factors. These can be used to compute model-implied or expected values of y for each person at each time point. The general formula is
Predicted_y = i + s*t
where t corresponds to the time steps (in this example: 0, 1, 2, 3).
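As a minimal sketch of the formula, the Stata fragment below applies it to two people. The factor scores are invented for illustration and are not taken from any Mplus run; the variable names (id, i, s, t, py) follow the ones used in the program later in this note.

```stata
* Hypothetical factor scores for two people (values invented for illustration)
clear
input id i s
1 0.10 0.20
2 1.05 1.15
end

* Make one row per person per time point: t = 0, 1, 2, 3
expand 4
bysort id: gen t = _n - 1

* Model-implied y computed from the factor scores alone
gen py = i + s*t
list id t py, sepby(id)
```

For person 1 this gives roughly 0.10, 0.30, 0.50, and 0.70 across the four time points; for person 2, roughly 1.05, 2.20, 3.35, and 4.50.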
The general formula applies regardless of whether or how many x's are in the model, how they are scaled (e.g., whether they are centered), or what their regression estimates are. As long as the x's influence only the latent variables, and not the y's directly, the factor scores already reflect the influence of the x's, so the formula needs no x terms. This implies that the relationships between the latent variables (i, s) and the x's shown in the Mplus output will be mostly preserved in the factor score estimates. The relationships will not be exact, however, because of factor score indeterminacy, skewness in the observed variables, and missing data.
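One way to see why no x term appears in the formula is to write out the structural part of the example model. The notation here is an assumption made for illustration: j indexes persons, the alphas are factor intercepts, the zetas are disturbances, and gamma_11 and gamma_21 stand for the regressions of i and s on x (the quantities the program below stores as g11 and g21).

```latex
\begin{aligned}
i_j &= \alpha_i + \gamma_{11} x_j + \zeta_{ij}, \\
s_j &= \alpha_s + \gamma_{21} x_j + \zeta_{sj}, \\
\hat{y}_{jt} &= \hat{i}_j + \hat{s}_j \, t, \qquad t = 0, 1, 2, 3 .
\end{aligned}
```

Under this reading, the saved factor scores estimate i_j and s_j as wholes, so the gamma*x_j contribution is already inside them. Adding it again, as the "wrong way" line in the program does, would count the covariate effect twice.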
The Stata program below, which calls runmplus, demonstrates this.
First, the results are summarized:

--------------------------------------------------------------------
NO CENTERING
--------------------------------------------------------------------
          |                      t (TIME)
        x |         0          1          2          3      Total
----------+---------------------------------------------------------
  Group 0 | -0.002417   0.197629   0.389425   0.592540   0.294294   mean observed y
          | -0.002323   0.195289   0.392901   0.590514   0.294095   mean model-implied y (py=i+s*t)
          | -0.000094   0.002340  -0.003477   0.002027   0.000199   mean residual (y-py)
          | -0.197706   0.002340   0.194136   0.397251   0.099005   mean residual for wrong way ++++ py2=(ci+`g11'*cx)+(cs+`g21'*cx)
          |
  Group 1 |  0.016746   2.194302   3.419218   4.599193   2.557365   mean observed y
          |  0.325799   1.818557   3.311315   4.804073   2.564936   mean model-implied y (py=i+s*t)
          | -0.309052   0.375745   0.107903  -0.204881  -0.007571   mean residual (y-py)
          |  -3.4e+00   -1.2e+00  -0.022339   1.157636  -0.884192   mean residual for wrong way ++++ py2=(ci+`g11'*cx)+(cs+`g21'*cx)
          |
    Total |  0.007165   1.195965   1.904321   2.595866   1.425829   mean observed y
          |  0.161738   1.006923   1.852108   2.697294   1.429516   mean model-implied y (py=i+s*t)
          | -0.154573   0.189042   0.052213  -0.101427  -0.003686   mean residual (y-py)
          |  -1.8e+00  -0.622458   0.085898   0.777444  -0.392594   mean residual for wrong way ++++ py2=(ci+`g11'*cx)+(cs+`g21'*cx)
--------------------------------------------------------------------
WITH CENTERING on X
--------------------------------------------------------------------
          |                         t
       cx |         0          1          2          3      Total
----------+---------------------------------------------------------
Group -.5 | -0.002417   0.197629   0.389425   0.592540   0.294294   mean observed y
          | -0.002323   0.195290   0.392902   0.590515   0.294096   mean model-implied y (py=i+s*t)
          | -0.000095   0.002339  -0.003478   0.002025   0.000198   mean residual (y-py)
          |  0.613793   0.813839   1.005635   1.208750   0.910504   mean residual for wrong way py2=(ci+`g11'*cx)+(cs+`g21'*cx)
          |
 Group .5 |  0.016746   2.194302   3.419218   4.599193   2.557365   mean observed y
          |  0.325799   1.818558   3.311317   4.804076   2.564937   mean model-implied y (py=i+s*t)
          | -0.309052   0.375744   0.107901  -0.204884  -0.007573   mean residual (y-py)
          |  -2.6e+00  -0.435756   0.789160   1.969135  -0.072693   mean residual for wrong way py2=(ci+`g11'*cx)+(cs+`g21'*cx)
          |
    Total |  0.007165   1.195965   1.904321   2.595866   1.425829   mean observed y
          |  0.161738   1.006924   1.852110   2.697296   1.429517   mean model-implied y (py=i+s*t)
          | -0.154573   0.189042   0.052212  -0.101429  -0.003687   mean residual (y-py)
          | -0.999759   0.189042   0.897398   1.588943   0.418906   mean residual for wrong way py2=(ci+`g11'*cx)+(cs+`g21'*cx)
--------------------------------------------------------------------

*** Experiments with LGM using MPLUS AND STATA
*** BUILT OFF THE lgm power calculation mplus/stata PROGRAM
*** ESTIMATE LGM MODELS FOR TWO GROUPS FOLLOWING TRUE MODELS LISTED BELOW
local model1 "i s | y1@0 y2@1 y3@2 y4@3; [i@0 s@.2]; i@.5; s@.1; i with s@0; y1-y4@.5;"
local model2 "i s | y1@0 y2@1 y3@2 y4@3; [i@1 s@1.2]; i@.5; s@.1; i with s@0; y1-y4@.5;"
* The first step is to get a means and covariance matrix for a sample with the above
* growth models as the true model. The procedure for doing so is based on the LGM power
* calculation procedure described on STATMODEL.COM. Then we use STATA to generate
* data from the implied means and covariance matrix, separately for each group (corresponding
* to true population models described above). Then we run a growth model with the two groups
* combined, with raw and centered covariate x (corresponding to true population model above),
* and then calculate predicted y's various ways. We calculate predicted y with and without taking
* into consideration the observed x value and regression weight of i or s on x, from models with
* x centered and x not centered.
* The result is the formula above (py=i+s*t) produces the right result (lowest residual)
* and the residual is essentially the same for the raw-x and centered-x models.
tempname null
capture file close `null'
file open `null' using c:/trash/null.dat , write replace
#d ;
file write `null'
   "0 0 0 0 " _n
   "1" _n
   "0 1" _n
   "0 0 1" _n
   "0 0 0 1" _n
   "" _n ;
#d cr
file close `null'
set more off
tempfile f1 f2
foreach i of numlist 1/2 {
   runmplus , ///
      data(file = c:\trash\null.dat ; type = means covariance; nobservations = 10000;) ///
      variable(names are y1-y4;) ///
      model(`model`i'') ///
      standardized residual savelog(c:\trash\step1) log(off)
   * pull out model implied means
   qui infix str line 1-90 using c:\trash\step1.out , clear
   format line %90s
   gen linenum=_n
   gen keep=.
   replace keep=1 if trim(line)=="Model Estimated Means/Intercepts/Thresholds"
   su linenum if keep==1
   drop if _n<=`r(min)'
   replace keep=1 if trim(line)=="Residuals for Means/Intercepts/Thresholds"
   replace linenum=_n
   su linenum if keep==1
   drop if _n>=`r(min)'
   keep line
   replace line=substr(line,15,75)
   strparse line , g(m)
   keep m*
   gen i=_n
   reshape long m , i(i) j(j)
   drop i j
   gen M=real(m)
   drop m
   drop if M==.
   tempname true
   capture file close `true'
   file open `true' using c:/trash/trash1.do, write replace
   file write `true' "drawnorm y1 y2 y3 y4 , clear n(10000) cstorage(lower) means("
   while _N>0 {
      local thud=M
      file write `true' "`thud' "
      drop if _n==1
   }
   file write `true' ") cov("
   * pull out model implied correlation coefficients
   qui infix str line 1-90 using c:\trash\step1.out , clear
   format line %90s
   gen linenum=_n
   gen keep=.
   replace keep=1 if trim(line)=="Model Estimated Covariances/Correlations/Residual Correlations"
   su linenum if keep==1
   drop if _n<=`r(min)'
   replace keep=1 if trim(line)=="Residuals for Covariances/Correlations/Residual Correlations"
   replace linenum=_n
   su linenum if keep==1
   drop if _n>=`r(min)'
   keep line
   strparse line , g(c)
   keep c*
   gen i=_n
   reshape long c , i(i) j(j)
   drop i j
   gen C=real(c)
   drop c
   drop if C==.
   while _N>0 {
      local thud=C
      file write `true' "`thud' "
      drop if _n==1
   }
   file write `true' ") " _n
   file close `true'
   do c:/trash/trash1.do
   save `f`i'' , replace
}
use `f1' , clear
gen x=0
append using `f2'
replace x=1 if x==.
gen id=_n
qui {
   runmplus y1-y4 x id , ///
      idvariable(id) ///
      model(i s | y1@0 y2@1 y3@2 y4@3 ; i s on x ; ) ///
      savedata(save=fscores; file=c:\trash\trash.dat) savelogfile(c:\trash\trash)
   mat B=r(estimate)
   mat g11=B["i_on_x",1]
   mat g21=B["s_on_x",1]
   local g11=g11[1,1]
   local g21=g21[1,1]
   preserve
   runmplus_load_savedata , out(c:/trash/trash.out) clear
   reshape long y , i(id) j(obs)
   gen t=obs-1
   * this is the right way
   gen py=i+s*t
   * this is the wrong way
   gen py2=(i+`g11'*x)+(s+`g21'*x)
   gen resid=y-py
   gen resid2=y-py2
   noisily di "NO CENTERING"
   noisily table x t , c(mean y mean py mean resid mean resid2) f(%8.6f)
   sort id t
   save `f1' , replace
   restore
   * with centering
   gen cx=x-.5
   runmplus y1-y4 cx id , ///
      idvariable(id) ///
      model(ci cs | y1@0 y2@1 y3@2 y4@3 ; ci cs on cx ; ) ///
      savedata(save=fscores; file=c:\trash\trash.dat) savelogfile(c:\trash\trash)
   mat B=r(estimate)
   mat g11=B["ci_on_cx",1]
   mat g21=B["cs_on_cx",1]
   local g11=g11[1,1]
   local g21=g21[1,1]
   runmplus_load_savedata , out(c:/trash/trash.out) clear
   reshape long y , i(id) j(obs)
   gen t=obs-1
   * this is the right way
   gen cpy=ci+cs*t
   * this is the wrong way
   gen cpy2=(ci+`g11'*cx)+(cs+`g21'*cx)
   gen cresid=y-cpy
   gen cresid2=y-cpy2
   noisily di "WITH CENTERING on X"
   noisily table cx t , c(mean y mean cpy mean cresid mean cresid2) f(%8.6f)
   keep id t cpy cresid cpy2 cresid2 cx ci cs
   sort id t
   save `f2' , replace
   use `f1' , clear
   merge id t using `f2'
}