IBM® SPSS® Amos™ 21 User’s Guide
James L. Arbuckle
Note: Before using this information and the product it supports, read the information in the “Notices” section on page 631. This edition applies to IBM® SPSS® Amos™ 21 and to all subsequent releases and modifications until otherwise indicated in new editions. Microsoft product screenshots reproduced with permission from Microsoft Corporation. Licensed Materials - Property of IBM © Copyright IBM Corp. 1983, 2012. U.S.
Contents

Part I: Getting Started

1  Introduction  1
2  Tutorial: Getting Started with Amos Graphics  7

Part II: Examples

1  Estimating Variances and Covariances  23
2  Testing Hypotheses  41
3  More Hypothesis Testing  59
4  Conventional Linear Regression  67
5  Unobserved Variables  81
6  Exploratory Analysis  101
7  A Nonrecursive Model  129
9  An Alternative to Analysis of Covariance  145
12  Simultaneous Factor Analysis for Several Groups  195
14  Regression with an Explicit Intercept  221
16  Sörbom’s Alternative to Analysis of Covariance  241
22  Specification Search  319
23  Exploratory Factor Analysis by Specification Search  349
25  Multiple-Group Analysis
29  Estimating a User-Defined Quantity in Bayesian SEM  437
33  Ordered-Categorical Data  489

Part III: Appendices

A  Notation  591
B  Discrepancy Functions  593
C  Measures of Fit  597

Notices  631
Bibliography  635
Index  647
Chapter 1
Introduction

IBM SPSS Amos implements the general approach to data analysis known as structural equation modeling (SEM), also known as analysis of covariance structures, or causal modeling. This approach includes, as special cases, many well-known conventional techniques, including the general linear model and common factor analysis.

(Input: path diagram specifying a common factor model in which the latent variable spatial predicts the observed variables visperc, cubes, and lozenges, each with an error term e1, e2, or e3. Output: the same path diagram with standardized estimates displayed.)
2 Chapter 1 Structural equation modeling (SEM) is sometimes thought of as esoteric and difficult to learn and use. This is incorrect. Indeed, the growing importance of SEM in data analysis is largely due to its ease of use. SEM opens the door for nonstatisticians to solve estimation and hypothesis testing problems that once would have required the services of a specialist. IBM SPSS Amos was originally designed as a tool for teaching this powerful and fundamentally simple method.
3 Introduction models. It provides a test of univariate normality for each observed variable as well as a test of multivariate normality and attempts to detect outliers. IBM SPSS Amos accepts a path diagram as a model specification and displays parameter estimates graphically on a path diagram. Path diagrams used for model specification and those that display parameter estimates are of presentation quality.
Example 9 and those that follow demonstrate advanced techniques that have so far not been used as much as they deserve. These techniques include:
  Simultaneous analysis of data from several different populations.
  Estimation of means and intercepts in regression equations.
  Maximum likelihood estimation in the presence of missing data.
  Bootstrapping to obtain estimated standard errors and confidence intervals.
5 Introduction Structural Equation Modeling: A Multidisciplinary Journal contains methodological articles as well as applications of structural modeling. It is published by Taylor and Francis (http://www.tandf.co.uk). Carl Ferguson and Edward Rigdon established an electronic mailing list called Semnet to provide a forum for discussions related to structural modeling. You can find information about subscribing to Semnet at www.gsu.edu/~mkteer/semnet.html.
Chapter 2 Tutorial: Getting Started with Amos Graphics Introduction Remember your first statistics class when you sweated through memorizing formulas and laboriously calculating answers with pencil and paper? The professor had you do this so that you would understand some basic statistical concepts. Later, you discovered that a calculator or software program could do all of these calculations in a split second. This tutorial is a little like that early statistics class.
demonstrates the menu path. For information about the toolbar buttons and keyboard shortcuts, see the online help.

About the Data

Hamilton (1990) provided several measurements on each of 21 states. Three of the measurements will be used in this tutorial:
  Average SAT score
  Per capita income expressed in $1,000 units
  Median education for residents 25 years of age or older
You can find the data in the Tutorial directory within the Excel 8.0 workbook Hamilton.
9 Tutorial: Getting Started with Amos Graphics The following path diagram shows a model for these data: This is a simple regression model where one observed variable, SAT, is predicted as a linear combination of the other two observed variables, Education and Income. As with nearly all empirical data, the prediction will not be perfect. The variable Other represents variables other than Education and Income that affect SAT. Each single-headed arrow represents a regression weight.
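Written as an equation, the model says that SAT is a weighted sum of the two predictors plus the unobserved variable Other. The weight names b1 and b2 are used here only for illustration; they are not part of the Amos specification:

SAT = b1 × Education + b2 × Income + Other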
10 Chapter 2 Creating a New Model E From the menus, choose File > New. Your work area appears. The large area on the right is where you draw path diagrams. The toolbar on the left provides one-click access to the most frequently used buttons. You can use either the toolbar or menu commands for most operations.
11 Tutorial: Getting Started with Amos Graphics Specifying the Data File The next step is to specify the file that contains the Hamilton data. This tutorial uses a Microsoft Excel 8.0 (*.xls) file, but Amos supports several common database formats, including SPSS Statistics *.sav files. If you launch Amos from the Add-ons menu in SPSS Statistics, Amos automatically uses the file that is open in SPSS Statistics. E From the menus, choose File > Data Files. E In the Data Files dialog box, click File Name.
12 Chapter 2 E In the drawing area, move your mouse pointer to the right of the three rectangles and click and drag to draw the ellipse. The model in your drawing area should now look similar to the following: Naming the Variables E In the drawing area, right-click the top left rectangle and choose Object Properties from the pop-up menu. E Click the Text tab. E In the Variable name text box, type Education. E Use the same method to name the remaining variables.
13 Tutorial: Getting Started with Amos Graphics Your path diagram should now look like this: Drawing Arrows Now you will add arrows to the path diagram, using the following model as your guide: E From the menus, choose Diagram > Draw Path. E Click and drag to draw an arrow between Education and SAT. E Use this method to add each of the remaining single-headed arrows. E From the menus, choose Diagram > Draw Covariances. E Click and drag to draw a double-headed arrow between Income and Education.
14 Chapter 2 Constraining a Parameter To identify the regression model, you must define the scale of the latent variable Other. You can do this by fixing either the variance of Other or the path coefficient from Other to SAT at some positive value. The following shows you how to fix the path coefficient at unity (1). E In the drawing area, right-click the arrow between Other and SAT and choose Object Properties from the pop-up menu. E Click the Parameters tab. E In the Regression weight text box, type 1.
15 Tutorial: Getting Started with Amos Graphics Altering the Appearance of a Path Diagram You can change the appearance of your path diagram by moving and resizing objects. These changes are visual only; they do not affect the model specification. To Move an Object E From the menus, choose Edit > Move. E In the drawing area, click and drag the object to its new location. To Reshape an Object or Double-Headed Arrow E From the menus, choose Edit > Shape of Object.
16 Chapter 2 To Undo an Action E From the menus, choose Edit > Undo. To Redo an Action E From the menus, choose Edit > Redo. Setting Up Optional Output Some of the output in Amos is optional. In this step, you will choose which portions of the optional output you want Amos to display after the analysis. E From the menus, choose View > Analysis Properties. E Click the Output tab. E Select the Minimization history, Standardized estimates, and Squared multiple correlations check boxes.
17 Tutorial: Getting Started with Amos Graphics E Close the Analysis Properties dialog box.
18 Chapter 2 Performing the Analysis The only thing left to do is perform the calculations for fitting the model. Note that in order to keep the parameter estimates up to date, you must do this every time you change the model, the data, or the options in the Analysis Properties dialog box. E From the menus, click Analyze > Calculate Estimates. E Because you have not yet saved the file, the Save As dialog box appears. Type a name for the file and click Save. Amos calculates the model estimates.
19 Tutorial: Getting Started with Amos Graphics To View Graphics Output E Click the Show the output path diagram button . E In the Parameter Formats pane to the left of the drawing area, click Standardized estimates.
20 Chapter 2 Your path diagram now looks like this: The value 0.49 is the correlation between Education and Income. The values 0.72 and 0.11 are standardized regression weights. The value 0.60 is the squared multiple correlation of SAT with Education and Income. E In the Parameter Formats pane to the left of the drawing area, click Unstandardized estimates. Your path diagram should now look like this: Printing the Path Diagram E From the menus, choose File > Print. The Print dialog box appears.
21 Tutorial: Getting Started with Amos Graphics E Click Print. Copying the Path Diagram Amos Graphics lets you easily export your path diagram to other applications such as Microsoft Word. E From the menus, choose Edit > Copy (to Clipboard). E Switch to the other application and use the Paste function to insert the path diagram. Amos Graphics exports only the diagram; it does not export the background. Copying Text Output E In the Amos Output window, select the text you want to copy.
Example 1
Estimating Variances and Covariances

Introduction

This example shows you how to estimate population variances and covariances. It also discusses the general format of Amos input and output.

About the Data

Attig (1983) showed 40 subjects a booklet containing several pages of advertisements. Then each subject was given three memory performance tests: recall, cued, and place. In the recall test, the subject was asked to recall as many of the advertisements as possible.
24 Example 1 Bringing In the Data E From the menus, choose File > New. E From the menus, choose File > Data Files. E In the Data Files dialog box, click File Name. E Browse to the Examples folder. If you performed a typical installation, the path is C:\Program Files\IBM\SPSS\Amos\21\Examples\. E In the Files of type list, select Excel 8.0 (*.xls), select UserGuide.xls, and then click Open. E In the Data Files dialog box, click OK. Amos displays a list of worksheets in the UserGuide workbook.
25 Estimating Variances and Covariance s As you scroll across the worksheet, you will see all of the test variables from the Attig study. This example uses only the following variables: recall1 (recall pretest), recall2 (recall posttest), place1 (place recall pretest), and place2 (place recall posttest). E After you review the data, close the data window. E In the Data Files dialog box, click OK.
26 Example 1 E Create two more duplicate rectangles until you have four rectangles side by side. Tip: If you want to reposition a rectangle, choose Edit > Move from the menus and drag the rectangle to its new position. Naming the Variables E From the menus, choose View > Variables in Dataset. The Variables in Dataset dialog box appears. E Click and drag the variable recall1 from the list to the first rectangle in the drawing area.
27 Estimating Variances and Covariance s Changing the Font E Right-click a variable and choose Object Properties from the pop-up menu. The Object Properties dialog box appears. E Click the Text tab and adjust the font attributes as desired. Establishing Covariances If you leave the path diagram as it is, Amos Graphics will estimate the variances of the four variables, but it will not estimate the covariances between them.
28 Example 1 Performing the Analysis E From the menus, choose Analyze > Calculate Estimates. Because you have not yet saved the file, the Save As dialog box appears. E Enter a name for the file and click Save. Viewing Graphics Output E Click the Show the output path diagram button . Amos displays the output path diagram with parameter estimates.
29 Estimating Variances and Covariance s In the output path diagram, the numbers displayed next to the boxes are estimated variances, and the numbers displayed next to the double-headed arrows are estimated covariances. For example, the variance of recall1 is estimated at 5.79, and that of place1 at 33.58. The estimated covariance between these two variables is 4.34. Viewing Text Output E From the menus, choose View > Text Output.
30 Example 1 observation on an approximately normally distributed random variable centered around the population covariance with a standard deviation of about 1.16, that is, if the assumptions in the section “Distribution Assumptions for Amos Models” on p. 35 are met. For example, you can use these figures to construct a 95% confidence interval on the population covariance by computing 2.56 ± 1.96 × 1.160 = 2.56 ± 2.27 .
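The same calculation can be written in general form: an approximate 95% confidence interval is the parameter estimate plus or minus 1.96 estimated standard errors, that is,

estimate ± 1.96 × standard error

where 1.96 is the 0.975 quantile of the standard normal distribution.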
31 Estimating Variances and Covariance s example, Runyon and Haber, 1980, p. 226) is 2.509 with 38 degrees of freedom ( p = 0.016 ). In this example, both p values are less than 0.05, so both tests agree in rejecting the null hypothesis at the 0.05 level. However, in other situations, the two p values might lie on opposite sides of 0.05. You might or might not regard this as especially serious—at any rate, the two tests can give different results. There should be no doubt about which test is better.
The Number of distinct sample moments refers to sample means, variances, and covariances. In most analyses, including the present one, Amos ignores means, so that the sample moments are the sample variances of the four variables, recall1, recall2, place1, and place2, and their sample covariances. There are four sample variances and six sample covariances, for a total of 10 sample moments.
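For p = 4 observed variables, this count can be verified directly:

p(p + 1) / 2 = 4 × 5 / 2 = 10

Because the model of Example 1 places no constraints on these moments, all 10 are estimated as free parameters, leaving 10 − 10 = 0 degrees of freedom.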
33 Estimating Variances and Covariance s Optional Output So far, we have discussed output that Amos generates by default. You can also request additional output. Calculating Standardized Estimates You may be surprised to learn that Amos displays estimates of covariances rather than correlations. When the scale of measurement is arbitrary or of no substantive interest, correlations have more descriptive meaning than covariances. Nevertheless, Amos and similar programs insist on estimating covariances.
34 Example 1 Rerunning the Analysis Because you have changed the options in the Analysis Properties dialog box, you must rerun the analysis. E From the menus, choose Analyze > Calculate Estimates. E Click the Show the output path diagram button. E In the Parameter Formats pane to the left of the drawing area, click Standardized estimates. Viewing Correlation Estimates as Text Output E From the menus, choose View > Text Output.
35 Estimating Variances and Covariance s E In the tree diagram in the upper left pane of the Amos Output window, expand Estimates, Scalars, and then click Correlations. Distribution Assumptions for Amos Models Hypothesis testing procedures, confidence intervals, and claims for efficiency in maximum likelihood or generalized least-squares estimation depend on certain assumptions. First, observations must be independent.
36 Example 1 The (conditional) expected values of the random variables depend linearly on the values of the fixed variables. A typical example of a fixed variable would be an experimental treatment, classifying respondents into a study group and a control group, respectively. It is all right that treatment is non-normally distributed, as long as the other exogenous variables are normally distributed for study and control cases alike, and with the same conditional variance-covariance matrix.
37 Estimating Variances and Covariance s E Enter the VB.NET code for specifying and fitting the model in place of the ‘Your code goes here comment. The following figure shows the program editor after the complete program has been entered. Note: The Examples directory contains all of the pre-written examples.
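Assembled in the program editor, the complete program has the following overall form. This is a sketch: the data file reference passed to BeginGroup is an assumption based on the Attig data used in this example, and the shipped Ex01.vb may reference the Excel workbook instead.

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        ' Data file reference is an assumption (young-subject Attig data)
        Sem.BeginGroup(Sem.AmosDir & "Examples\Attg_yng.sav")
        ' One line per observed, exogenous variable in the model
        Sem.AStructure("recall1")
        Sem.AStructure("recall2")
        Sem.AStructure("place1")
        Sem.AStructure("place2")
        Sem.FitModel()
    Finally
        Sem.Dispose()
    End Try
End Sub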
To open the VB.NET file for the present example:
E From the Program Editor menus, choose File > Open.
E Select the file Ex01.vb in the \Amos\21\Examples\ directory.

The following table gives a line-by-line explanation of the program.

Dim Sem As New AmosEngine
    Declares Sem as an object of type AmosEngine. The methods of this object are used to specify and fit the model.
Sem.TextOutput
    Creates text output containing the results of the analysis.
Sem.BeginGroup …
    Begins the model specification and identifies the data file.
Sem.AStructure("recall1"), Sem.AStructure("recall2"), Sem.AStructure("place1"), Sem.AStructure("place2")
    Specify the four observed variables in the model. Their variances, and the covariances among them, are estimated.
Sem.FitModel()
    Fits the model to the data.
39 Estimating Variances and Covariance s Generating Additional Output Some AmosEngine methods generate additional output. For example, the Standardized method displays standardized estimates. The following figure shows the use of the Standardized method: Modeling in C# Writing an Amos program in C# is similar to writing one in VB.NET. To start a new C# program, in the built-in program editor of Amos: E Choose File > New C# Program (rather than File > New VB Program). E Choose File > Open to open Ex01.
40 Example 1 Other Program Development Tools The built-in program editor in Amos is used throughout this user’s guide for writing and executing Amos programs. However, you can use the development tool of your choice. The Examples folder contains a VisualStudio subfolder where you can find Visual Studio VB.NET and C# solutions for Example 1.
Example 2 Testing Hypotheses Introduction This example demonstrates how you can use Amos to test simple hypotheses about variances and covariances. It also introduces the chi-square test for goodness of fit and elaborates on the concept of degrees of freedom. About the Data We will use Attig’s (1983) spatial memory data, which were described in Example 1. We will also begin with the same path diagram as in Example 1.
42 Example 2 You can fill these boxes yourself instead of letting Amos fill them. Constraining Variances Suppose you want to set the variance of recall1 to 6 and the variance of recall2 to 8. E In the drawing area, right-click recall1 and choose Object Properties from the pop-up menu. E Click the Parameters tab. E In the Variance text box, type 6. E With the Object Properties dialog box still open, click recall2 and set its variance to 8.
43 Testing Hypotheses E Close the dialog box. The path diagram displays the parameter values you just specified. This is not a very realistic example because the numbers 6 and 8 were just picked out of the air. Meaningful parameter constraints must have some underlying rationale, perhaps being based on theory or on previous analyses of similar data. Specifying Equal Parameters Sometimes you will be interested in testing whether two parameters are equal in the population.
44 Example 2 require both of the variances to have the same value without specifying ahead of time what that value is. Benefits of Specifying Equal Parameters Before adding any further constraints on the model parameters, let’s examine why we might want to specify that two parameters, like the variances of recall1 and recall2 or place1 and place2, are equal.
45 Testing Hypotheses Moving and Formatting Objects While a horizontal layout is fine for small examples, it is not practical for analyses that are more complex.
46 Example 2 You can use the following tools to rearrange your path diagram until it looks like the one above: To move objects, choose Edit > Move from the menus, and then drag the object to its new location. You can also use the Move button to drag the endpoints of arrows. To copy formatting from one object to another, choose Edit > Drag Properties from the menus, select the properties you wish to apply, and then drag from one object to another.
47 Testing Hypotheses E Review the data and close the data view. E In the Data Files dialog box, click OK. Performing the Analysis E From the menus, choose Analyze > Calculate Estimates. E In the Save As dialog box, enter a name for the file and click Save. Amos calculates the model estimates. Viewing Text Output E From the menus, choose View > Text Output. E To view the parameter estimates, click Estimates in the tree diagram in the upper left pane of the Amos Output window.
48 Example 2 You can see that the parameters that were specified to be equal do have equal estimates. The standard errors here are generally smaller than the standard errors obtained in Example 1. Also, because of the constraints on the parameters, there are now positive degrees of freedom. E Now click Notes for Model in the upper left pane of the Amos Output window. While there are still 10 sample variances and covariances, the number of parameters to be estimated is only seven.
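The degrees of freedom reported for this model follow directly from those two counts:

degrees of freedom = 10 sample moments − 7 estimated parameters = 3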
49 Testing Hypotheses E From the menus, choose Analyze > Calculate Estimates. Amos recalculates the model estimates. Covariance Matrix Estimates E To see the sample variances and covariances collected into a matrix, choose View > Text Output from the menus. E Click Sample Moments in the tree diagram in the upper left corner of the Amos Output window.
50 Example 2 The following is the sample covariance matrix: E In the tree diagram, expand Estimates and then click Matrices. The following is the matrix of implied covariances: Note the differences between the sample and implied covariance matrices. Because the model imposes three constraints on the covariance structure, the implied variances and covariances are different from the sample values. For example, the sample variance of place1 is 33.58, but the implied variance is 27.53.
51 Testing Hypotheses Displaying Covariance and Variance Estimates on the Path Diagram As in Example 1, you can display the covariance and variance estimates on the path diagram. E Click the Show the output path diagram button. E In the Parameter Formats pane to the left of the drawing area, click Unstandardized estimates. Alternatively, you can request correlation estimates in the path diagram by clicking Standardized estimates.
52 Example 2 Notice the word \format in the bottom line of the figure caption. Words that begin with a backward slash, like \format, are called text macros. Amos replaces text macros with information about the currently displayed model. The text macro \format will be replaced by the heading Model Specification, Unstandardized estimates, or Standardized estimates, depending on which version of the path diagram is displayed.
53 Testing Hypotheses maximum likelihood estimates, and there is no reason to expect them to resemble the implied covariances. The chi-square statistic is an overall measure of how much the implied covariances differ from the sample covariances. Chi-square = 6.276 Degrees of freedom = 3 Probability level = 0.099 In general, the more the implied covariances differ from the sample covariances, the bigger the chi-square statistic will be.
54 Example 2 E In the Figure Caption dialog box, enter a caption that includes the \cmin, \df, and \p text macros, as follows: When Amos displays the path diagram containing this caption, it appears as follows:
Modeling in VB.NET
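The program, reconstructed here as a sketch from the statements explained below (the BeginGroup file reference is an assumption based on the attg_yng.sav dataset named later in this section), is:

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Standardized()
        Sem.ImpliedMoments()
        Sem.SampleMoments()
        Sem.ResidualMoments()
        ' Data file reference is an assumption
        Sem.BeginGroup(Sem.AmosDir & "Examples\Attg_yng.sav")
        ' Equal variance labels: v_recall and v_place
        Sem.AStructure("recall1 (v_recall)")
        Sem.AStructure("recall2 (v_recall)")
        Sem.AStructure("place1 (v_place)")
        Sem.AStructure("place2 (v_place)")
        ' Equal covariance label: cov_rp
        Sem.AStructure("recall1 <> place1 (cov_rp)")
        Sem.AStructure("recall2 <> place2 (cov_rp)")
        Sem.FitModel()
    Finally
        Sem.Dispose()
    End Try
End Sub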
This table gives a line-by-line explanation of the program:

Dim Sem As New AmosEngine
    Declares Sem as an object of type AmosEngine.
Sem.TextOutput
    Creates text output containing the results of the analysis.
Sem.Standardized()
    Displays standardized estimates.
Sem.ImpliedMoments()
    Displays the implied variances and covariances.
Sem.SampleMoments()
    Displays the sample variances and covariances.
Sem.ResidualMoments()
    Displays the differences between the sample and implied moments.
Sem.BeginGroup …
    Begins the model specification and identifies the data file.
Sem.AStructure("recall1 (v_recall)"), Sem.AStructure("recall2 (v_recall)")
    Label the variances of recall1 and recall2 with the same name, v_recall, requiring them to be equal.
Sem.AStructure("place1 (v_place)"), Sem.AStructure("place2 (v_place)")
    Label the variances of place1 and place2 with the same name, v_place, requiring them to be equal.
Sem.AStructure("recall1 <> place1 (cov_rp)"), Sem.AStructure("recall2 <> place2 (cov_rp)")
    Label the two covariances with the same name, cov_rp, requiring them to be equal.
Sem.FitModel()
    Fits the model to the data.
E To perform the analysis, from the menus, choose File > Run.

Timing Is Everything

The AStructure lines must appear after BeginGroup; otherwise, Amos will not recognize that the variables named in the AStructure lines are observed variables in the attg_yng.sav dataset. In general, the order of statements matters in an Amos program. In organizing an Amos program, AmosEngine methods can be divided into three general groups.
Example 3 More Hypothesis Testing Introduction This example demonstrates how to test the null hypothesis that two variables are uncorrelated, reinforces the concept of degrees of freedom, and demonstrates, in a concrete way, what is meant by an asymptotically correct test. About the Data For this example, we use the group of older subjects from Attig’s (1983) spatial memory study and the two variables age and vocabulary. We will use data formatted as a tab-delimited text file.
60 Example 3 E In the Files of type list, select Text (*.txt), select Attg_old.txt, and then click Open. E In the Data Files dialog box, click OK. Testing a Hypothesis That Two Variables Are Uncorrelated Among Attig’s 40 old subjects, the sample correlation between age and vocabulary is –0.09 (not very far from 0).
61 More Hypothesis Testing model specified by the simple path diagram above specifies that the covariance (and thus the correlation) between age and vocabulary is 0. The second method of constraining a covariance parameter is the more general procedure introduced in Example 1 and Example 2. E From the menus, choose Diagram > Draw Covariances. E Click and drag to draw an arrow that connects vocabulary and age. E Right-click the arrow and choose Object Properties from the pop-up menu.
62 Example 3 E From the menus, choose Analyze > Calculate Estimates. The Save As dialog box appears. E Enter a name for the file and click Save. Amos calculates the model estimates. Viewing Text Output E From the menus, choose View > Text Output. E In the tree diagram in the upper left pane of the Amos Output window, click Estimates.
63 More Hypothesis Testing The three sample moments are the variances of age and vocabulary and their covariance. The two distinct parameters to be estimated are the two population variances. The covariance is fixed at 0 in the model, not estimated from the sample information. Viewing Graphics Output E Click the Show the output path diagram button. E In the Parameter Formats pane to the left of the drawing area, click Unstandardized estimates.
64 Example 3 The usual t statistic for testing this null hypothesis is 0.59 ( df = 38 , p = 0.56 two-sided). The probability level associated with the t statistic is exact. The probability level of 0.555 of the chi-square statistic is off, owing to the fact that it does not have an exact chi-square distribution in finite samples. Even so, the probability level of 0.555 is not bad.
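The t statistic referred to here is the standard test of a sample correlation r based on n observations (a textbook formula, not something computed by Amos in this example):

t = r × sqrt(n − 2) / sqrt(1 − r²)

with n − 2 = 40 − 2 = 38 degrees of freedom for these data.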
65 More Hypothesis Testing Modeling in VB.NET Here is a program for performing the analysis of this example: The AStructure method constrains the covariance, fixing it at a constant 0. The program does not refer explicitly to the variances of age and vocabulary. The default behavior of Amos is to estimate those variances without constraints. Amos treats the variance of every exogenous variable as a free parameter except for variances that are explicitly constrained by the program.
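A minimal sketch of such a program follows; the data file reference and the exact form of the covariance constraint are assumptions modeled on the conventions used in the other examples, and may differ in detail from the shipped Ex03.vb:

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        ' Data file reference is an assumption (older-subject Attig data)
        Sem.BeginGroup(Sem.AmosDir & "Examples\Attg_old.txt")
        ' Fix the covariance between age and vocabulary at 0
        Sem.AStructure("age <> vocabulary (0)")
        Sem.FitModel()
    Finally
        Sem.Dispose()
    End Try
End Sub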
Example 4 Conventional Linear Regression Introduction This example demonstrates a conventional regression analysis, predicting a single observed variable as a linear combination of three other observed variables. It also introduces the concept of identifiability. About the Data Warren, White, and Fuller (1974) studied 98 managers of farm cooperatives.
68 Example 4 Here are the sample variances and covariances: Warren5v also contains the sample means. Raw data are not available, but they are not needed by Amos for most analyses, as long as the sample moments (that is, means, variances, and covariances) are provided. In fact, only sample variances and covariances are required in this example. We will not need the sample means in Warren5v for the time being, and Amos will ignore them.
69 Conventional Linear Regression The single-headed arrows represent linear dependencies. For example, the arrow leading from knowledge to performance indicates that performance scores depend, in part, on knowledge. The variable error is enclosed in a circle because it is not directly observed. Error represents much more than random fluctuations in performance scores due to measurement error.
70 Example 4 E Draw three double-headed arrows that connect the observed exogenous variables (knowledge, satisfaction, and value). Your path diagram should look like this: Identification In this example, it is impossible to estimate the regression weight for the regression of performance on error, and, at the same time, estimate the variance of error.
71 Conventional Linear Regression Setting a regression weight equal to 1 for every error variable can be tedious. Fortunately, Amos Graphics provides a default solution that works well in most cases. E Click the Add a unique variable to an existing variable button. E Click an endogenous variable. Amos automatically attaches an error variable to it, complete with a fixed regression weight of 1. Clicking the endogenous variable repeatedly changes the position of the error variable.
72 Example 4 Viewing the Text Output Here are the maximum likelihood estimates: Amos does not display the path performance <— error because its value is fixed at the default value of 1. You may wonder how much the other estimates would be affected if a different constant had been chosen. It turns out that only the variance estimate for error is affected by such a change. The following table shows the variance estimate that results from various choices for the performance <— error regression weight.
73 Conventional Linear Regression of the same factor. Extending this, the product of the squared regression weight and the error variance is always a constant. This is what we mean when we say the regression weight (together with the error variance) is unidentified. If you assign a value to one of them, the other can be estimated, but they cannot both be estimated at the same time.
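In symbols: if the performance <--- error regression weight is fixed at a constant c and the resulting estimate of the error variance is v(c), then

c² × v(c) = constant

for every choice of c, which is why fixing one of the two quantities makes the other one estimable.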
74 Example 4 The standardized regression weights and the correlations are independent of the units in which all variables are measured; therefore, they are not affected by the choice of identification constraints. Squared multiple correlations are also independent of units of measurement. Amos displays a squared multiple correlation for each endogenous variable. Note: The squared multiple correlation of a variable is the proportion of its variance that is accounted for by its predictors.
75 Conventional Linear Regression Here is the standardized solution: Viewing Additional Text Output E In the tree diagram in the upper left pane of the Amos Output window, click Variable Summary.
76 Example 4 Endogenous variables are those that have single-headed arrows pointing to them; they depend on other variables. Exogenous variables are those that do not have singleheaded arrows pointing to them; they do not depend on other variables. Inspecting the preceding list will help you catch the most common (and insidious) errors in an input file: typing errors.
77 Conventional Linear Regression Modeling in VB.NET The model in this example consists of a single regression equation. Each single-headed arrow in the path diagram represents a regression weight. Here is a program for estimating those regression weights: The four lines that come after Sem.BeginGroup correspond to the single-headed arrows in the Amos Graphics path diagram. The (1) in the last AStructure line fixes the error regression weight at a constant 1.
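A sketch of the program follows. The BeginGroup arguments, pointing at the Warren5v sample moments, are an assumption; the remaining lines correspond directly to the arrows described above.

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Standardized()
        Sem.Smc()
        ' Data file and worksheet names are assumptions (Warren5v sample moments)
        Sem.BeginGroup(Sem.AmosDir & "Examples\UserGuide.xls", "Warren5v")
        ' One line per single-headed arrow in the path diagram
        Sem.AStructure("performance <--- knowledge")
        Sem.AStructure("performance <--- value")
        Sem.AStructure("performance <--- satisfaction")
        ' The (1) fixes the error regression weight at 1 for identification
        Sem.AStructure("performance <--- error (1)")
        Sem.FitModel()
    Finally
        Sem.Dispose()
    End Try
End Sub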
78 Example 4 the specification of many models, especially models that have parameters. The differences between specifying a model in Amos Graphics and specifying one programmatically are as follows: Amos Graphics is entirely WYSIWYG (What You See Is What You Get). If you draw a two-headed arrow (without constraints) between two exogenous variables, Amos Graphics will estimate their covariance.
79 Conventional Linear Regression Note that in the AStructure line above, each predictor variable (on the right side of the equation) is associated with a regression weight to be estimated. We could make these regression weights explicit through the use of empty parentheses as follows: Sem.AStructure("performance = ()knowledge + ()value + ()satisfaction + error(1)") The empty parentheses are optional. By default, Amos will automatically estimate a regression weight for each predictor.
Example 5 Unobserved Variables Introduction This example demonstrates a regression analysis with unobserved variables. About the Data The variables in the previous example were surely unreliable to some degree. The fact that the reliability of performance is unknown presents a minor problem when it comes to interpreting the fact that the predictors account for only 39.9% of the variance of performance.
Here is a list of the input variables:

Variable name      Description
1performance       12-item subtest of Role Performance
2performance       12-item subtest of Role Performance
1knowledge         13-item subtest of Knowledge
2knowledge         13-item subtest of Knowledge
1value             15-item subtest of Value Orientation
2value             15-item subtest of Value Orientation
1satisfaction      5-item subtest of Role Satisfaction
2satisfaction      6-item subtest of Role Satisfaction
past_training      degree of formal education

For this example, we w
Model A

The following path diagram presents a model for the eight subtests:

(Path diagram: four latent variables, knowledge, value, satisfaction, and performance, each measured by two subtests. 1knowledge and 2knowledge load on knowledge, 1value and 2value on value, 1satisfaction and 2satisfaction on satisfaction, and 1performance and 2performance on performance. Each subtest has an error term, error1 through error8, and error9 is the residual for performance. Caption: Example 5: Model A, Regression with unobserved variables, Job performance of farm managers, Warren, White and Fuller (1974).)

Four ellipses in t
Consider, for instance, the knowledge submodel: the scores of the two split-half subtests, 1knowledge and 2knowledge, are hypothesized to depend on the single underlying, but not directly observed, variable knowledge.
(Path diagram of the structural submodel: knowledge, value, and satisfaction predict performance, with residual error9.)

The structural part of the current model is the same as the one in Example 4. It is only in the measurement model that this example differs from the one in Example 4.

Identification

With 13 unobserved variables in this model, it is certainly not identified. It will be necessary to fix the unit of measurement of each unobserved variable by suitable constraints on the parameters.
86 Example 5 Changing the Orientation of the Drawing Area E From the menus, choose View > Interface Properties. E In the Interface Properties dialog box, click the Page Layout tab. E Set Paper Size to one of the “Landscape” paper sizes, such as Landscape - A4. E Click Apply.
87 Unobserved Variables Creating the Path Diagram Now you are ready to draw the model as shown in the path diagram on page 83. There are a number of ways to do this. One is to start by drawing the measurement model first. Here, we draw the measurement model for one of the latent variables, knowledge, and then use it as a pattern for the other three. E Draw an ellipse for the unobserved variable knowledge. E From the menus, choose Diagram > Draw Indicator Variable. E Click twice inside the ellipse.
88 Example 5 Rotating Indicators The indicators appear by default above the knowledge ellipse, but you can change their location. E From the menus, choose Edit > Rotate. E Click the knowledge ellipse. Each time you click the knowledge ellipse, its indicators rotate 90° clockwise. If you click the ellipse three times, its indicators will look like this: Duplicating Measurement Models The next step is to create measurement models for value and satisfaction. E From the menus, choose Edit > Select All.
89 Unobserved Variables Your path diagram should now look like this: E Create a fourth copy for performance, and position it to the right of the original. E From the menus, choose Edit > Reflect.
90 Example 5 Entering Variable Names E Right-click each object and select Object Properties from the pop-up menu E In the Object Properties dialog box, click the Text tab, and enter a name into the Variable Name text box. Alternatively, you can choose View > Variables in Dataset from the menus and then drag variable names onto objects in the path diagram. Completing the Structural Model There are only a few things left to do to complete the structural model.
91 Unobserved Variables The hypothesis that Model A is correct is accepted. Chi-square = 10.335 Degrees of freedom = 14 Probability level = 0.737 The parameter estimates are affected by the identification constraints.
92 Example 5 Standardized estimates, on the other hand, are not affected by the identification constraints. To calculate standardized estimates: E From the menus, choose View > Analysis Properties. E In the Analysis Properties dialog box, click the Output tab. E Enable the Standardized estimates check box.
Viewing the Graphics Output

The path diagram with standardized parameter estimates displayed is as follows:

(Output path diagram for Model A showing standardized regression weights and squared multiple correlations for the eight subtests; the caption reads Chi-square = 10.335 (14 df), p = .737.)
split exactly in half. As a result, 2satisfaction is 20% longer than 1satisfaction. Assuming that the tests differ only in length leads to the following conclusions:
  The regression weight for regressing 2satisfaction on satisfaction should be 1.2 times the weight for regressing 1satisfaction on satisfaction.
  Given equal variances for error7 and error8, the regression weight for error8 should be √1.2 = 1.095445 times as large as the regression weight for error7.
95 Unobserved Variables The chi-square statistic has also increased but not by much. It indicates no significant departure of the data from Model B. Chi-square = 26.967 Degrees of freedom = 22 Probability level = 0.212 If Model B is indeed correct, the associated parameter estimates are to be preferred over those obtained under Model A. The raw parameter estimates will not be presented here because they are affected too much by the choice of identification constraints.
Here are the standardized estimates and squared multiple correlations displayed on the path diagram:

(Output path diagram for Model B showing standardized regression weights and squared multiple correlations for the eight subtests; the caption reads Chi-square = 26.967 (22 df), p = .212.)
97 Unobserved Variables distribution with degrees of freedom equal to the difference between the degrees of freedom of the competing models. In this example, the difference in degrees of freedom is 8 (that is, 22 – 14). Model B imposes all of the parameter constraints of Model A, plus an additional 8. In summary, if Model B is correct, the value 16.632 comes from a chi-square distribution with eight degrees of freedom.
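The value 16.632 is simply the difference between the two chi-square statistics reported earlier, and its degrees of freedom are the difference between the degrees of freedom of the two models:

26.967 − 10.335 = 16.632, with 22 − 14 = 8 degrees of freedom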
Modeling in VB.NET

Model A

The following program fits Model A:

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Standardized()
        Sem.Smc()
        Sem.BeginGroup(Sem.AmosDir & "Examples\Warren9v.wk1")
        Sem.AStructure("1performance <--- performance (1)")
        Sem.AStructure("2performance <--- performance")
        Sem.AStructure("1knowledge <--- knowledge (1)")
        Sem.AStructure("2knowledge <--- knowledge")
        Sem.AStructure("1value <--- value (1)")
        Sem.AStructure("2value <--- value")
Model B

The following program fits Model B:

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Standardized()
        Sem.Smc()
        Sem.BeginGroup(Sem.AmosDir & "Examples\Warren9v.wk1")
        Sem.AStructure("1performance <--- performance (1)")
        Sem.AStructure("2performance <--- performance (1)")
        Sem.AStructure("1knowledge <--- knowledge (1)")
        Sem.AStructure("2knowledge <--- knowledge (1)")
        Sem.AStructure("1value <--- value (1)")
        Sem.AStructure("2value <--- value (1)")
Example 6 Exploratory Analysis Introduction This example demonstrates structural modeling with time-related latent variables, the use of modification indices and critical ratios in exploratory analyses, how to compare multiple models in a single analysis, and computation of implied moments, factor score weights, total effects, and indirect effects. About the Data Wheaton et al. (1977) reported a longitudinal study of 932 persons over the period from 1966 to 1971.
102 Example 6 variances and covariances, as needed for the analysis. We will not use the sample means in the analysis. Model A for the Wheaton Data Jöreskog and Sörbom (1984) proposed the model shown on p. 103 for the Wheaton data, referring to it as their Model A. The model asserts that all of the observed variables depend on underlying, unobserved variables.
(Path diagram for Model A: 67_alienation is measured by anomia67 and powles67, with error terms eps1 and eps2; 71_alienation is measured by anomia71 and powles71, with error terms eps3 and eps4; ses is measured by educatio and SEI, with error terms delta1 and delta2; zeta1 and zeta2 are the residuals for the two alienation factors. Caption: Example 6: Model A, Exploratory analysis, Wheaton (1977), Model Specification.)

Identification

Model A is identified except for the usual problem that the measurement scale of each unobserved variable is indeterminate.
104 Example 6 The Wheaton data depart significantly from Model A. Chi-square = 71.544 Degrees of freedom = 6 Probability level = 0.000 Dealing with Rejection You have several options when a proposed model has to be rejected on statistical grounds: You can point out that statistical hypothesis testing can be a poor tool for choosing a model. Jöreskog (1967) discussed this issue in the context of factor analysis.
Modification Indices

You can test various modifications of a model by carrying out a separate analysis for each potential modification, but this approach is time-consuming. Modification indices allow you to evaluate many potential modifications in a single analysis. They provide suggestions for model modifications that are likely to pay off in smaller chi-square values.

Using Modification Indices

E From the menus, choose View > Analysis Properties.
106 Example 6 The column heading M.I. in this table is short for Modification Index. The modification indices produced are those described by Jöreskog and Sörbom (1984). The first modification index listed (5.905) is a conservative estimate of the decrease in chi-square that will occur if eps2 and delta1 are allowed to be correlated. The new chi-square statistic would have 5 ( = 6 – 1 ) degrees of freedom and would be no greater than 65.639 ( 71.544 – 5.905 ).
107 Exploratory Analysis The theoretical reasons for suspecting that eps1 and eps3 might be correlated apply to eps2 and eps4 as well. The modification indices also suggest allowing eps2 and eps4 to be correlated. However, we will ignore this potential modification and proceed immediately to look at the results of modifying Model A by allowing eps1 and eps3 to be correlated. The new model is Jöreskog and Sörbom’s Model B.
108 Example 6 Text Output The added covariance between eps1 and eps3 decreases the degrees of freedom by 1. The chi-square statistic is reduced by substantially more than the promised 40.911. Chi-square = 6.383 Degrees of freedom = 5 Probability level = 0.271 Model B cannot be rejected. Since the fit of Model B is so good, we will not pursue the possibility, mentioned earlier, of allowing eps2 and eps4 to be correlated.
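The actual reduction can be computed from the two fitted models: the chi-square statistic dropped from 71.544 for Model A to 6.383 for Model B, a decrease of

71.544 − 6.383 = 65.161

which is indeed much larger than the promised 40.911.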
Note the large critical ratio associated with the new covariance path. The covariance between eps1 and eps3 is clearly different from 0. This explains the poor fit of Model A, in which that covariance was fixed at 0.

Graphics Output for Model B

The following path diagram displays the standardized estimates and the squared multiple correlations:

(Output path diagram for Model B showing standardized estimates and squared multiple correlations for anomia67, powles67, anomia71, powles71, and the ses indicators.)
110 Example 6 Misuse of Modification Indices In trying to improve upon a model, you should not be guided exclusively by modification indices. A modification should be considered only if it makes theoretical or common sense. A slavish reliance on modification indices without such a limitation amounts to sorting through a very large number of potential modifications in search of one that provides a big improvement in fit.
111 Exploratory Analysis E From the menus, choose View > Analysis Properties. E In the Analysis Properties dialog box, click the Output tab. E Enable the Critical ratios for differences check box. When Amos calculates critical ratios for parameter differences, it generates names for any parameters that you did not name during model specification. The names are displayed in the text output next to the parameter estimates. Here are the parameter estimates for Model B.
The parameter names are needed for interpreting the critical ratios in the table Critical Ratios for Differences between Parameters (Default model), which contains one row and one column for each of the parameters par_1 through par_16. The first column, for example, gives the critical ratios for differences between par_1 and each of the other parameters:

par_1       .000
par_2       .877
par_3      9.883
par_4     -4.429
par_5    -17.943
par_6    -22.343
par_7      3.903
par_8      8.955
par_9      8.364
par_10     7.781
par_11    11.106
par_12     3.826
par_13    10.425
par_14     4.697
par_15     3.393
par_16    14.615
113 Exploratory Analysis par_1 and par_2 divided by the estimated standard error of this difference. These two parameters are the regression weights for powles71 <– 71_alienation and powles67 <– 67_alienation. Under the distribution assumptions stated on p. 35, the critical ratio statistic can be evaluated using a table of the standard normal distribution to test whether the two parameters are equal in the population. Since 0.877 is less in magnitude than 1.96, you would not reject, at the 0.
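Stated as a formula, each critical ratio in the table has the form

C.R. = (first estimate − second estimate) / estimated standard error of the difference

and, under the same distribution assumptions, magnitudes greater than 1.96 indicate a difference that is significant at the 0.05 level.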
114 Example 6 Model C for the Wheaton Data Here is the path diagram for Model C from the file Ex06–c.
115 Exploratory Analysis Testing Model C As expected, Model C has an acceptable fit, with a higher probability level than Model B: Chi-square = 7.501 Degrees of freedom = 8 Probability level = 0.484 You can test Model C against Model B by examining the difference in chi-square values ( 7.501 – 6.383 = 1.118 ) and the difference in degrees of freedom ( 8 – 5 = 3 ). A chi-square value of 1.118 with 3 degrees of freedom is not significant.
116 Example 6 Multiple Models in a Single Analysis Amos allows for the fitting of multiple models in a single analysis. This allows Amos to summarize the results for all models in a single table. It also allows Amos to perform a chi-square test for nested model comparisons. In this example, Models A, B, and C can be fitted in a single analysis by noting that Models A and C can each be obtained by constraining the parameters of Model B. In the following path diagram from the file Ex06-all.
117 Exploratory Analysis Using the parameter names just introduced, Model A can be obtained from the most general model (Model B) by requiring cov1 = 0. E In the Models panel to the left of the path diagram, double-click Default Model. The Manage Models dialog box appears. E In the Model Name text box, type Model A: No Autocorrelation. E Double-click cov1 in the left panel. Notice that cov1 appears in the Parameter Constraints box. E Type cov1 =0 in the Parameter Constraints box.
118 Example 6 This completes the specification of Model A. E In the Manage Models dialog box, click New. E In the Model Name text box, type Model B: Most General. Model B has no constraints other than those in the path diagram, so you can proceed immediately to Model C. E Click New. E In the Model Name text box, type Model C: Time-Invariance.
119 Exploratory Analysis
E In the Parameter Constraints box, type:
b_pow67 = b_pow71
var_a67 = var_a71
var_p67 = var_p71
This completes the specification of Model C. The fourth model combines the constraints of Models A and C.
E Click New.
E In the Model Name text box, type Model D: A and C Combined.
E In the Parameter Constraints box, type:
Model A: No Autocorrelation
Model C: Time-Invariance
These lines tell Amos that Model D incorporates the constraints of both Model A and Model C. Now that we have set up the parameter constraints for all four models, the final step is to perform the analysis and view the output.
120 Example 6 The CMIN column contains the minimum discrepancy for each model. In the case of maximum likelihood estimation (the default), the CMIN column contains the chi-square statistic. The p column contains the corresponding upper-tail probability for testing each model. For nested pairs of models, Amos provides tables of model comparisons, complete with chi-square difference tests and their associated p values.
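In other words, for each model the p column reports the upper-tail probability of the chi-square distribution with that model's degrees of freedom (a generic statement of the relationship, not an additional Amos computation):

p = \Pr(\chi^2_{df} \ge CMIN)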
121 Exploratory Analysis Obtaining Optional Output The variances and covariances among the observed variables can be estimated under the assumption that Model C is correct. E From the menus, choose View > Analysis Properties. E In the Analysis Properties dialog box, click the Output tab. E Select Implied moments (a check mark appears next to it). E To obtain the implied variances and covariances for all the variables in the model except error variables, select All implied moments.
122 Example 6 The matrix of implied covariances for all variables in the model can be used to carry out a regression of the unobserved variables on the observed variables. The resulting regression weight estimates can be obtained from Amos by enabling the Factor score weights check box. Here are the estimated factor score weights for Model C: The table of factor score weights has a separate row for each unobserved variable, and a separate column for each observed variable.
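The weight in row j and column i is the coefficient applied to observed variable i when computing an estimated score for unobserved variable j. As a sketch (w and x are generic placeholders, not names that appear in the Amos output), the estimated score for unobserved variable F_j is

\hat{F}_j = \sum_i w_{ji}\, x_i

where the x_i are the observed variables in the model, ordinarily expressed as deviations from their means.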
123 Exploratory Analysis The first row of the table indicates that 67_alienation depends, directly or indirectly, on ses only. The total effect of ses on 67_alienation is –0.56. The fact that the effect is negative means that, all other things being equal, relatively high ses scores are associated with relatively low 67_alienation scores. Looking in the fifth row of the table, powles71 depends, directly or indirectly, on ses, 67_alienation, and 71_alienation.
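A total effect combines the direct effect with all indirect effects that operate through intervening variables (a generic decomposition, not an additional Amos table):

\text{total effect} = \text{direct effect} + \sum_{\text{indirect paths}} \big(\text{product of path coefficients along the path}\big)

For powles71, for example, the effect of ses operates indirectly, through 67_alienation and 71_alienation.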
124 Example 6
Model B
The following program fits Model B. It is saved as Ex06–b.vb.
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Standardized()
        Sem.Smc()
        Sem.Crdiff()
        Sem.BeginGroup(Sem.AmosDir & "Examples\Wheaton.sav")
        Sem.AStructure("anomia67 <--- 67_alienation (1)")
        Sem.AStructure("anomia67 <--- eps1 (1)")
        Sem.AStructure("powles67 <--- 67_alienation")
        Sem.AStructure("powles67 <--- eps2 (1)")
        Sem.AStructure("anomia71 <--- 71_alienation (1)")
        Sem.AStructure("anomia71 <--- eps3 (1)")
        Sem.
125 Exploratory Analysis
Model C
The following program fits Model C. It is saved as Ex06–c.vb.
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Standardized()
        Sem.Smc()
        Sem.AllImpliedMoments()
        Sem.FactorScoreWeights()
        Sem.TotalEffects()
        Sem.BeginGroup(Sem.AmosDir & "Examples\Wheaton.sav")
        Sem.AStructure("anomia67 <--- 67_alienation (1)")
        Sem.AStructure("anomia67 <--- eps1 (1)")
        Sem.AStructure("powles67 <--- 67_alienation (path_p)")
        Sem.AStructure("powles67 <--- eps2 (1)")
        Sem.
126 Example 6
Fitting Multiple Models
To fit all three models, A, B, and C in a single analysis, start with the following program, which assigns unique names to some parameters:
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Standardized()
        Sem.Smc()
        Sem.AllImpliedMoments()
        Sem.TotalEffects()
        Sem.FactorScoreWeights()
        Sem.Mods(4)
        Sem.Crdiff()
        Sem.BeginGroup(Sem.AmosDir & "Examples\Wheaton.sav")
        Sem.AStructure("anomia67 <--- 67_alienation (1)")
        Sem.AStructure("anomia67 <--- eps1 (1)")
        Sem.
127 Exploratory Analysis
The following lines, inserted in place of the Sem.FitModel line, will fit the model four times, each time with a different set of parameter constraints:
Sem.Model("Model A: No Autocorrelation", "cov1 = 0")
Sem.Model("Model B: Most General", "")
Sem.Model("Model C: Time-Invariance", _
    "b_pow67 = b_pow71;var_a67 = var_a71;var_p67 = var_p71")
Sem.Model("Model D: A and C Combined", _
    "Model A: No Autocorrelation;Model C: Time-Invariance")
Sem.
Example 7 A Nonrecursive Model Introduction This example demonstrates structural equation modeling with a nonrecursive model. About the Data Felson and Bohrnstedt (1979) studied 209 girls from sixth through eighth grade.
130 Example 7 Sample correlations, means, and standard deviations for these six variables are contained in the SPSS Statistics file, Fels_fem.sav. Here is the data file as it appears in the SPSS Statistics Data Editor: The sample means are not used in this example.
131 A Nonrecursive Model Perceived academic performance is modeled as a function of GPA and perceived attractiveness (attract). Perceived attractiveness, in turn, is modeled as a function of perceived academic performance, height, weight, and the rating of attractiveness by children from another city. Particularly noteworthy in this model is that perceived academic ability depends on perceived attractiveness, and vice versa.
132 Example 7
Regression Weights: (Group number 1 - Default model)
                          Estimate   S.E.    C.R.      P
academic <--- GPA            .023    .004   6.241    ***
attract  <--- height         .000    .010    .050   .960
attract  <--- weight        -.002    .001  -1.321   .186
attract  <--- rating         .176    .027   6.444    ***
attract  <--- academic      1.607    .349   4.599    ***
academic <--- attract       -.002    .051   -.039
133 A Nonrecursive Model Obtaining Standardized Estimates Before you perform the analysis, do the following: E From the menus, choose View > Analysis Properties. E In the Analysis Properties dialog box, click the Output tab. E Select Standardized estimates (a check mark appears next to it). E Close the dialog box. Standardized Regression Weights: (Group number 1 Default model) academic <--attract <--attract <--attract <--attract <--academic <--- GPA height weight rating academic attract Estimate .492 .
134 Example 7
E In the Analysis Properties dialog box, click the Output tab.
E Select Squared multiple correlations (a check mark appears next to it).
E Close the dialog box.
Squared Multiple Correlations: (Group number 1 - Default model)
            Estimate
attract         .402
academic        .236
The squared multiple correlations show that the two endogenous variables in this model are not predicted very accurately by the other variables in the model.
135 A Nonrecursive Model Stability Index The existence of feedback loops in a nonrecursive model permits certain problems to arise that cannot occur in recursive models. In the present model, attractiveness depends on perceived academic ability, which in turn depends on attractiveness, which depends on perceived academic ability, and so on. This appears to be an infinite regress, and it is.
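One way to see why such a loop can still yield well-defined linear relationships is to trace its repeated contributions as a geometric series. The sketch below assumes a single two-variable loop with regression weights b1 (attract <--- academic) and b2 (academic <--- attract); Amos's stability index generalizes this idea to arbitrary systems of endogenous variables.

b_1 b_2 + (b_1 b_2)^2 + (b_1 b_2)^3 + \cdots = \frac{b_1 b_2}{1 - b_1 b_2}, \qquad \text{provided } |b_1 b_2| < 1

If the product exceeds 1 in magnitude, the series diverges and the system is unstable.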
136 Example 7
Modeling in VB.NET
The following program fits the model of this example. It is saved in the file Ex07.vb.
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Standardized()
        Sem.Smc()
        Sem.BeginGroup(Sem.AmosDir & "Examples\Fels_fem.sav")
        Sem.AStructure("academic <--- GPA")
        Sem.AStructure("academic <--- attract")
        Sem.AStructure("academic <--- error1 (1)")
        Sem.AStructure("attract <--- height")
        Sem.AStructure("attract <--- weight")
        Sem.AStructure("attract <--- rating")
        Sem.
Example 8
Factor Analysis
Introduction
This example demonstrates confirmatory common factor analysis.
About the Data
Holzinger and Swineford (1939) administered 26 psychological tests to 301 seventh- and eighth-grade students in two Chicago schools. In the present example, we use scores obtained by the 73 girls from a single school (the Grant-White school).
138 Example 8 The file Grnt_fem.
139 Factor Analysis This model asserts that the first three tests depend on an unobserved variable called spatial. Spatial can be interpreted as an underlying ability (spatial ability) that is not directly observed. According to the model, performance on the first three tests depends on this ability. In addition, performance on each of these tests may depend on something other than spatial ability as well. In the case of visperc, for example, the unique variable err_v is also involved.
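In equation form, the part of the model for the first three tests can be written as follows (a sketch; the λ's stand for the regression weights that Amos estimates, one of which is fixed at 1 to give spatial a unit of measurement):

visperc = \lambda_1 \cdot spatial + err_v
cubes = \lambda_2 \cdot spatial + err_c
lozenges = \lambda_3 \cdot spatial + err_l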
140 Example 8
It is true that the lack of a unit of measurement for unobserved variables is an ever-present cause of non-identification. Fortunately, it is one that is easy to cure, as we have done repeatedly. But other kinds of under-identification can occur for which there is no simple remedy. Conditions for identifiability have to be established separately for individual models.
141 Factor Analysis
Results of the Analysis
Here are the unstandardized results of the analysis. As shown at the upper right corner of the figure, the model fits the data quite well.
Chi-square = 7.853 (8 df)
p = .448
[Path diagram with unstandardized estimates: loadings of visperc, cubes, and lozenges on spatial; loadings of paragrap, sentence, and wordmean on verbal; and variance estimates for the two factors and the unique variables err_v through err_w]
142 Example 8
Regression Weights: (Group number 1 - Default model)
                          Estimate   S.E.    C.R.     P
visperc  <--- spatial        1.000
cubes    <--- spatial         .610   .143   4.250   ***
lozenges <--- spatial        1.198   .272   4.405   ***
paragrap <--- verbal         1.000
sentence <--- verbal         1.334   .160   8.322   ***
wordmean <--- verbal         2.234   .263   8.482   ***

Standardized Regression Weights: (Group number 1 - Default model)
                          Estimate
visperc  <--- spatial        .703
cubes    <--- spatial        .654
lozenges <--- spatial        .736
paragrap <--- verbal         .880
sentence <--- verbal         .827
wordmean <--- verbal
143 Factor Analysis
E Also select Squared multiple correlations if you want squared multiple correlations for each endogenous variable, as shown in the next graphic.
E Close the dialog box.
Squared Multiple Correlations: (Group number 1 - Default model)
            Estimate
wordmean        .708
sentence        .684
paragrap        .774
lozenges        .542
cubes           .428
visperc         .494
Viewing Standardized Estimates
E In the Amos Graphics window, click the Show the output path diagram button.
144 Example 8 The squared multiple correlations can be interpreted as follows: To take wordmean as an example, 71% of its variance is accounted for by verbal ability. The remaining 29% of its variance is accounted for by the unique factor err_w. If err_w represented measurement error only, we could say that the estimated reliability of wordmean is 0.71. As it is, 0.71 is an estimate of a lower-bound on the reliability of wordmean.
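The arithmetic behind this interpretation is a simple variance decomposition (a sketch based on the estimates reported for this model):

SMC(wordmean) = 1 - \frac{Var(err_w)}{Var(wordmean)} \approx 0.71

Because err_w contains specific variance as well as measurement error, 0.71 understates the reliability of wordmean, which is why it is only a lower bound.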
Example 9 An Alternative to Analysis of Covariance Introduction This example demonstrates a simple alternative to an analysis of covariance that does not require perfectly reliable covariates. A better, but more complicated, alternative will be demonstrated in Example 16. Analysis of Covariance and Its Alternative Analysis of covariance is a technique that is frequently used in experimental and quasi-experimental studies to reduce the effect of pre-existing differences among treatment groups.
146 Example 9 here has been employed by Bentler and Woodward (1979) and others. Another approach, by Sörbom (1978), is demonstrated in Example 16. The Sörbom method is more general. It allows testing other assumptions of analysis of covariance and permits relaxing some of them as well. The Sörbom approach is comparatively complicated because of its generality.
147 An Alternative to Analysis of Covariance Correlations and standard deviations for the five measures are contained in the Microsoft Excel workbook UserGuide.xls, in the Olss_all worksheet. Here is the dataset: There are positive correlations between treatment and each of the posttests, which indicates that the trained students did better on the posttests than the untrained students. The correlations between treatment and each of the pretests are positive but relatively small.
148 Example 9
[Path diagram for Model A: pre_syn and pre_opp are indicators of pre_verbal; post_syn and post_opp are indicators of post_verbal; post_verbal depends on pre_verbal and treatment, with residual zeta. Example 9: Model A, Olsson (1973) test coaching study, Model Specification]
Similarly, the model asserts that post_syn and post_opp are imperfect measures of an unobserved ability called post_verbal, which might be thought of as verbal ability at the time of the posttest.
149 An Alternative to Analysis of Covariance Specifying Model A To specify Model A, draw a path diagram similar to the one on p. 148. The path diagram is saved as the file Ex09-a.amw. Results for Model A There is considerable empirical evidence against Model A: Chi-square = 33.215 Degrees of freedom = 3 Probability level = 0.000 This is bad news.
150 Example 9
Requesting modification indices with a threshold of 4 produces the following additional output:
Modification Indices (Group number 1 - Default model)
Covariances: (Group number 1 - Default model)
                     M.I.   Par Change
eps2 <--> eps4     13.161        3.249
eps2 <--> eps3     10.813       -2.822
eps1 <--> eps4     11.968       -3.228
eps1 <--> eps3      9.788        2.798
According to the first modification index in the M.I. column, the chi-square statistic will decrease by at least 13.161 if eps2 and eps4 are allowed to be correlated.
151 An Alternative to Analysis of Covariance
[Path diagram for Model B. Example 9: Model B, Olsson (1973) test coaching study, Model Specification]
You may find your error variables already positioned at the top of the path diagram, with no room to draw the double-headed arrow. To fix the problem:
E From the menus, choose Edit > Fit to Page.
152 Example 9 chi-square statistic that will occur if the corresponding constraint—and only that constraint—is removed. The following raw parameter estimates are difficult to interpret because they would have been different if the identification constraints had been different: Regression Weights: (Group number 1 - Default model) post_verbal <--- pre_verbal post_verbal <--- treatment pre_syn <--- pre_verbal pre_opp <--- pre_verbal post_syn <--- post_verbal post_opp <--- post_verbal Estimate S.E. .889 .
153 An Alternative to Analysis of Covariance
[Path diagram for Model B with standardized estimates. Example 9: Model B, Olsson (1973) test coaching study, Standardized estimates]
In this example, we are primarily concerned with testing a particular hypothesis and not so much with parameter estimation.
154 Example 9 Drawing a Path Diagram for Model C To draw the path diagram for Model C: E Start with the path diagram for Model B. E Right-click the arrow that points from treatment to post_verbal and choose Object Properties from the pop-up menu. E In the Object Properties dialog box, click the Parameters tab and type 0 in the Regression weight text box. The path diagram for Model C is saved in the file Ex09-c.amw. Results for Model C Model C has to be rejected at any conventional significance level.
155 An Alternative to Analysis of Covariance
Modeling in VB.NET
Model A
This program fits Model A. It is saved in the file Ex09–a.vb.
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Mods(4)
        Sem.Standardized()
        Sem.Smc()
        Sem.BeginGroup(Sem.AmosDir & "Examples\UserGuide.xls", "Olss_all")
        Sem.AStructure("pre_syn = (1) pre_verbal + (1) eps1")
        Sem.AStructure("pre_opp = pre_verbal + (1) eps2")
        Sem.AStructure("post_syn = (1) post_verbal + (1) eps3")
        Sem.
156 Example 9
Model C
This program fits Model C. It is saved in the file Ex09–c.vb.
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Mods(4)
        Sem.Standardized()
        Sem.Smc()
        Sem.BeginGroup(Sem.AmosDir & "Examples\UserGuide.xls", "Olss_all")
        Sem.AStructure("pre_syn = (1) pre_verbal + (1) eps1")
        Sem.AStructure("pre_opp = pre_verbal + (1) eps2")
        Sem.AStructure("post_syn = (1) post_verbal + (1) eps3")
        Sem.AStructure("post_opp = post_verbal + (1) eps4")
        Sem.
157 An Alternative to Analysis of Covariance
Fitting Multiple Models
This program (Ex09-all.vb) fits all three models (A through C).
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Mods(4)
        Sem.Standardized()
        Sem.Smc()
        Sem.BeginGroup(Sem.AmosDir & "Examples\UserGuide.xls", "Olss_all")
        Sem.AStructure("pre_syn = (1) pre_verbal + (1) eps1")
        Sem.AStructure("pre_opp = pre_verbal + (1) eps2")
        Sem.AStructure("post_syn = (1) post_verbal + (1) eps3")
        Sem.
Example 10 Simultaneous Analysis of Several Groups Introduction This example demonstrates how to fit a model to two sets of data at once. Amos is capable of modeling data from multiple groups (or samples) simultaneously. This multigroup facility allows for many additional types of analyses, as illustrated in the next several examples. Analysis of Several Groups We return once again to Attig’s (1983) memory data from young and old subjects, which were used in Example 1 through Example 3.
160 Example 10 About the Data We will use Attig’s memory data from both young and old subjects. Following is a partial listing of the old subjects’ data found in the worksheet Attg_old located in the Microsoft Excel workbook UserGuide.xls: The young subjects’ data are in the Attg_yng worksheet. This example uses only the measures recall1 and cued1. Data for multigroup analysis can be organized in a variety of ways.
161 Simultaneous Analysis of Several Groups Conventions for Specifying Group Differences The main purpose of a multigroup analysis is to find out the extent to which groups differ.
162 Example 10 E Click File Name, select the Excel workbook UserGuide.xls that is in the Amos Examples directory, and click Open. E In the Select a Data Table dialog box, select the Attg_yng worksheet. E Click OK to close the Select a Data Table dialog box. E Click OK to close the Data Files dialog box. E From the menus, choose View > Variables in Dataset. E Drag observed variables recall1 and cued1 to the diagram.
163 Simultaneous Analysis of Several Groups E Connect recall1 and cued1 with a double-headed arrow. E To add a caption to the path diagram, from the menus, choose Diagram > Figure Caption and then click the path diagram at the spot where you want the caption to appear. E In the Figure Caption dialog box, enter a title that contains the text macros \group and \format.
164 Example 10 E Click OK to complete the model specification for the young group. E To add a second group, from the menus, choose Analyze > Manage Groups. E In the Manage Groups dialog box, change the name in the Group Name text box from Group number 1 to young subjects. E Click New to create a second group. E Change the name in the Group Name text box from Group number 2 to old subjects.
165 Simultaneous Analysis of Several Groups E Click Close. E From the menus, choose File > Data Files. The Data Files dialog box shows that there are two groups labeled young subjects and old subjects. E To specify the dataset for the old subjects, in the Data Files dialog box, select old subjects. E Click File Name, select the Excel workbook UserGuide.xls that is in the Amos Examples directory, and click Open. E In the Select a Data Table dialog box, select the Attg_old worksheet.
166 Example 10 E Click OK. Text Output Model A has zero degrees of freedom. Computation of degrees of freedom (Default model) Number of distinct sample moments: Number of distinct parameters to be estimated: Degrees of freedom (6 - 6): 6 6 0 Amos computed the number of distinct sample moments this way: The young subjects have two sample variances and one sample covariance, which makes three sample moments. The old subjects also have three sample moments, making a total of six sample moments.
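The same count follows from the usual formula for the number of distinct variances and covariances among p observed variables (a sketch of the arithmetic; means are not counted because they are not being estimated here):

\frac{p(p+1)}{2} = \frac{2 \times 3}{2} = 3 \text{ moments per group}, \qquad 3 + 3 = 6, \qquad df = 6 - 6 = 0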
167 Simultaneous Analysis of Several Groups To view parameter estimates for the young people in the Amos Output window: E Click Estimates in the tree diagram in the upper left pane. E Click young subjects in the Groups panel at the left side of the window. Covariances: (young subjects - Default model) recall1 <--> cued1 Estimate 3.225 S.E. .944 C.R. 3.416 P *** Label Variances: (young subjects - Default model) Estimate 5.787 4.210 recall1 cued1 S.E. 1.311 .953 C.R. 4.416 4.
168 Example 10 Click either the View Input or View Output button to see an input or output path diagram. Select either young subjects or old subjects in the Groups panel. Select either Unstandardized estimates or Standardized estimates in the Parameter Formats panel. Model B It is easy to see that the parameter estimates are different for the two groups.
169 Simultaneous Analysis of Several Groups E In the Variance text box, enter a name for the variance of recall1; for example, type var_rec. E Select All groups (a check mark will appear next to it). The effect of the check mark is to assign the name var_rec to the variance of recall1 in all groups. Without the check mark, var_rec would be the name of the variance for recall1 for the young group only.
170 Example 10 Text Output Because of the constraints imposed in Model B, only three distinct parameters are estimated instead of six. As a result, the number of degrees of freedom has increased from 0 to 3. Computation of degrees of freedom (Default model) Number of distinct sample moments: Number of distinct parameters to be estimated: Degrees of freedom (6 - 3): 6 3 3 Model B is acceptable at any conventional significance level. Chi-square = 4.588 Degrees of freedom = 3 Probability level = 0.
171 Simultaneous Analysis of Several Groups
Graphics Output
For Model B, the output path diagram is the same for both groups.
Chi-square = 4.588 (3 df)
p = .205
[Path diagram showing the common estimates: variances of 5.68 for recall1 and 5.45 for cued1, and a covariance of 4.06]
Modeling in VB.NET
Model A
Here is a program (Ex10-a.vb) for fitting Model A:
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.BeginGroup(Sem.AmosDir & "Examples\UserGuide.xls", "Attg_yng")
        Sem.GroupName("young subjects")
        Sem.AStructure("recall1")
        Sem.AStructure("cued1")
        Sem.BeginGroup(Sem.
172 Example 10 The BeginGroup method is used twice in this two-group analysis. The first BeginGroup line specifies the Attg_yng dataset. The three lines that follow supply a name and a model for that group. The second BeginGroup line specifies the Attg_old dataset, and the following three lines supply a name and a model for that group. The model for each group simply says that recall1 and cued1 are two variables with unconstrained variances and an unspecified covariance.
173 Simultaneous Analysis of Several Groups
Multiple Model Input
Here is a program (Ex10-all.vb) for fitting both Models A and B.
Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.Standardized()
        Sem.TextOutput()
        Sem.BeginGroup(Sem.AmosDir & "Examples\UserGuide.xls", "Attg_yng")
        Sem.GroupName("young subjects")
        Sem.AStructure("recall1 (yng_rec)")
        Sem.AStructure("cued1 (yng_cue)")
        Sem.AStructure("recall1 <> cued1 (yng_rc)")
        Sem.BeginGroup(Sem.AmosDir & "Examples\UserGuide.xls", "Attg_old")
        Sem.
Example 11 Felson and Bohrnstedt’s Girls and Boys Introduction This example demonstrates how to fit a simultaneous equations model to two sets of data at once. Felson and Bohrnstedt’s Model Example 7 tested Felson and Bohrnstedt’s (1979) model for perceived attractiveness and perceived academic ability using a sample of 209 girls. Here, we take the same model and attempt to apply it simultaneously to the Example 7 data and to data from another sample of 207 boys.
176 Example 11 Notice that there are eight variables in the boys’ data file but only seven in the girls’ data file. The extra variable skills is not used in any model of this example, so its presence in the data file is ignored. Specifying Model A for Girls and Boys Consider extending the Felson and Bohrnstedt model of perceived attractiveness and academic ability to boys as well as girls.
177 Felson and Bohrnstedt’s Girls and Boys E In the Figure Caption dialog box, enter a title that contains the text macro \group. For example: In Example 7, where there was only one group, the group’s name didn’t matter. Accepting the default name Group number 1 was good enough. Now that there are two groups to keep track of, the groups should be given meaningful names. E From the menus, choose Analyze > Manage Groups. E In the Manage Groups dialog box, type girls for Group Name.
178 Example 11 E Click Close to close the Manage Groups dialog box. E From the menus, choose File > Data Files. E In the Data Files dialog box, double-click girls and select the data file Fels_fem.sav. E Then, double-click boys and select the data file Fels_mal.sav. E Click OK to close the Data Files dialog box.
179 Felson and Bohrnstedt’s Girls and Boys Text Output for Model A With two groups instead of one (as in Example 7), there are twice as many sample moments and twice as many parameters to estimate. Therefore, you have twice as many degrees of freedom as there were in Example 7. Computation of degrees of freedom (Default model) Number of distinct sample moments: Number of distinct parameters to be estimated: Degrees of freedom (42 - 38): 42 38 4 The model fits the data from both groups quite well.
180 Example 11 These parameter estimates are the same as in Example 7. Standard errors, critical ratios, and p values are also the same. The following are the unstandardized estimates for the boys: Regression Weights: (boys - Default model) academic <--- GPA attract <--- height attract <--- weight attract <--- rating attract <--- academic academic <--- attract Estimate S.E. C.R. .021 .003 6.927 .019 .010 1.967 -.003 .001 -2.484 .095 .030 3.150 1.386 .315 4.398 .063 .059 1.071 P *** .049 .013 .002 *** .
181 Felson and Bohrnstedt’s Girls and Boys
Graphics Output for Model A
For girls, this is the path diagram with unstandardized estimates displayed:
[Path diagram for the girls' sample with unstandardized estimates for GPA, height, weight, rating, academic, attract, error1, and error2]
182 Example 11 Obtaining Critical Ratios for Parameter Differences E From the menus, choose View > Analysis Properties. E In the Analysis Properties dialog box, click the Output tab. E Select Critical ratios for differences. In this example, however, we will not use critical ratios for differences; instead, we will take an alternative approach to looking for group differences.
183 Felson and Bohrnstedt’s Girls and Boys variables to be group-invariant. For Model B, you need to constrain six regression weights in each group. E First, display the girls’ path diagram by clicking girls in the Groups panel at the left of the path diagram. E Right-click one of the single-headed arrows and choose Object Properties from the pop- up menu. E In the Object Properties dialog box, click the Parameters tab. E Enter a name in the Regression weight text box. E Select All groups.
184 Example 11 E Repeat this until you have named every regression weight. Always make sure to select (put a check mark next to) All groups. After you have named all of the regression weights, the path diagram for each sample should look something like this: GPA p1 academic p6 height 1 error1 p2 p3 rating p4 attract p5 weight Results for Model B Text Output Model B fits the data very well. Chi-square = 9.493 Degrees of freedom = 10 Probability level = 0.
185 Felson and Bohrnstedt’s Girls and Boys Comparing Model B against Model A gives a nonsignificant chi-square of 9.493 – 3.183 = 6.310 with 10 – 4 = 6 degrees of freedom. Assuming that Model B is indeed correct, the Model B estimates are preferable over the Model A estimates.
186 Example 11 The unstandardized parameter estimates for the boys are: Regression Weights: (boys - Default model) academic <--attract <--attract <--attract <--attract <--academic <--- GPA height weight rating academic attract Estimate S.E. C.R. .022 .002 9.475 .008 .007 1.177 -.003 .001 -2.453 .145 .020 7.186 1.448 .232 6.234 .018 .039 .469 P *** .239 .014 *** *** .
187 Felson and Bohrnstedt’s Girls and Boys
Graphics Output
The output path diagram for the girls is:
[Path diagram with unstandardized estimates for the girls. Example 11: Model B, A nonrecursive, two-group model, Felson and Bohrnstedt (1979) girls' data, Unstandardized estimates]
And the output for the boys is:
[Path diagram with unstandardized estimates for the boys]
188 Example 11 Fitting Models A and B in a Single Analysis It is possible to fit both Model A and Model B in the same analysis. The file Ex11-ab.amw in the Amos Examples directory shows how to do this. Model C for Girls and Boys You might consider adding additional constraints to Model B, such as requiring every parameter to have the same value for boys as for girls.
189 Felson and Bohrnstedt’s Girls and Boys E To improve the appearance of the results, from the menus, choose Edit > Move and use the mouse to arrange the six rectangles in a single column like this: The Drag properties option can be used to put the rectangles in perfect vertical alignment. E From the menus, choose Edit > Drag properties. E In the Drag Properties dialog box, select height, width, and X-coordinate. A check mark will appear next to each one.
190 Example 11 E To even out the spacing between the rectangles, from the menus, choose Edit > Select All. E Then choose Edit > Space Vertically. There is a special button for drawing large numbers of double-headed arrows at once. With all six variables still selected from the previous step: E From the menus, choose Tools > Macro > Draw Covariances. Amos draws all possible covariance paths among the selected variables.
191 Felson and Bohrnstedt’s Girls and Boys E Label all variances and covariances with suitable names; for example, label them with letters a through u. In the Object Properties dialog box, always put a check mark next to All groups when you name a parameter. E From the menus, choose Analyze > Manage Models and create a second group for the boys. E Choose File > Data Files and specify the boys’ dataset (Fels_mal.sav) for this group. The file Ex11-c.amw contains the model specification for Model C.
192 Example 11 Modeling in VB.NET Model A The following program fits Model A. It is saved as Ex11-a.vb. Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.BeginGroup(Sem.AmosDir & "Examples\Fels_fem.sav") Sem.GroupName("girls") Sem.AStructure("academic = GPA + attract + error1 (1)") Sem.AStructure _ ("attract = height + weight + rating + academic + error2 (1)") Sem.AStructure("error2 <--> error1") Sem.BeginGroup(Sem.AmosDir & "Examples\Fels_mal.sav") Sem.GroupName("boys") Sem.
193 Felson and Bohrnstedt’s Girls and Boys Model B The following program fits Model B, in which parameter labels p1 through p6 are used to impose equality constraints across groups. The program is saved in Ex11-b.vb. Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.BeginGroup(Sem.AmosDir & "Examples\Fels_fem.sav") Sem.GroupName("girls") Sem.AStructure("academic = (p1) GPA + (p2) attract + (1) error1") Sem.
194 Example 11 Fitting Multiple Models The following program fits both Models A and B. The program is saved in the file Ex11-ab.vb. Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.BeginGroup(Sem.AmosDir & "Examples\Fels_fem.sav") Sem.GroupName("girls") Sem.AStructure("academic = (g1) GPA + (g2) attract + (1) error1") Sem.AStructure("attract = " & _ "(g3) height + (g4) weight + (g5) rating + (g6) academic + (1) error2") Sem.AStructure("error2 <--> error1") Sem.BeginGroup(Sem.
Example 12 Simultaneous Factor Analysis for Several Groups Introduction This example demonstrates how to test whether the same factor analysis model holds for each of several populations, possibly with different parameter values for different populations (Jöreskog, 1971). About the Data We will use the Holzinger and Swineford (1939) data described in Example 8. This time, however, data from the 72 boys in the Grant-White sample will be analyzed along with data from the 73 girls studied in Example 8.
196 Example 12 Model A for the Holzinger and Swineford Boys and Girls Consider the hypothesis that the common factor analysis model of Example 8 holds for boys as well as for girls. The path diagram from Example 8 can be used as a starting point for this two-group model. By default, Amos Graphics assumes that both groups have the same path diagram, so the path diagram does not have to be drawn a second time for the second group.
197 Simultaneous Factor Analysis for Several Groups E While the Manage Groups dialog box is open, create another group by clicking New. E Then, type Boys in the Group Name text box. E Click Close to close the Manage Groups dialog box. Specifying the Data E From the menus, choose File > Data Files. E In the Data Files dialog box, double-click Girls and specify the data file grnt_fem.sav. E Then double-click Boys and specify the data file grnt_mal.sav. E Click OK to close the Data Files dialog box.
198 Example 12 Your path diagram should look something like this for the girls’ sample: 1 spatial visperc cubes lozenges 1 verbal paragrap sentence wordmean 1 1 1 1 1 1 err_v err_c err_l err_p err_s err_w Example 12: Model A Factor analysis: Girls' sample Holzinger and Swineford (1939) Model Specification The boys’ path diagram is identical. Note, however, that the parameter estimates are allowed to be different for the two groups.
199 Simultaneous Factor Analysis for Several Groups
Model A is acceptable at any conventional significance level. If Model A had been rejected, we would have had to make changes in the path diagram for at least one of the two groups.
Chi-square = 16.480
Degrees of freedom = 16
Probability level = 0.420
Graphics Output
Here are the (unstandardized) parameter estimates for the 73 girls. They are the same estimates that were obtained in Example 8 where the girls alone were studied.
[Path diagram with unstandardized estimates for the girls' sample]
200 Example 12
The corresponding output path diagram for the 72 boys is:
Chi-square = 16.480 (16 df)
p = .420
[Path diagram with unstandardized estimates for the boys' sample. Example 12: Model A, Factor analysis: Boys' sample, Holzinger and Swineford (1939), Unstandardized estimates]
Notice that the estimated regression weights vary little across groups.
201 Simultaneous Factor Analysis for Several Groups E Right-click the arrow that points from spatial to cubes and choose Object Properties from the pop-up menu. E In the Object Properties dialog box, click the Parameters tab. E Type cube_s in the Regression weight text box. E Select All groups. A check mark appears next to it. The effect of the check mark is to assign the same name to this regression weight in both groups.
202 Example 12 The path diagram for either of the two samples should now look something like this: 1 spatial verbal cube_s visperc cubes lozn_s lozenges 1 paragrap sent_v word_v sentence wordmean 1 1 1 1 1 1 err_v err_c err_l err_p err_s err_w Results for Model B Text Output Because of the additional constraints in Model B, four fewer parameters have to be estimated from the data, increasing the number of degrees of freedom by 4.
203 Simultaneous Factor Analysis for Several Groups
Graphics Output
Here are the parameter estimates for the 73 girls:
Chi-square = 18.292 (20 df)
p = .568
[Path diagram with unstandardized estimates for the girls' sample. Example 12: Model B, Factor analysis: Girls' sample, Holzinger and Swineford (1939), Unstandardized estimates]
204 Example 12
Here are the parameter estimates for the 72 boys:
Chi-square = 18.292 (20 df)
p = .568
[Path diagram with unstandardized estimates for the boys' sample]
205 Simultaneous Factor Analysis for Several Groups Boys’ sample Estimate Standard Error Estimate Standard Error b: cubes <--- spatial b: lozenges <--- spatial b: sentence <--- verbal b: wordmean <--- verbal b: spatial <---> verbal b: var(spatial) b: var(verbal) b: var(err_v) b: var(err_c) b: var(err_l) b: var(err_p) b: var(err_s) b: var(err_w) 0.450 1.510 1.275 2.294 6.840 16.058 6.904 31.571 15.693 36.526 2.364 6.035 19.697 0.176 0.461 0.171 0.308 2.370 7.516 1.622 6.982 2.904 11.532 0.726 1.
206 Example 12 Modeling in VB.NET Model A The following program (Ex12-a.vb) fits Model A for boys and girls: Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.Standardized() Sem.Smc() Sem.BeginGroup(Sem.AmosDir & "Examples\Grnt_fem.sav") Sem.GroupName("Girls") Sem.AStructure("visperc = (1) spatial + (1) err_v") Sem.AStructure("cubes = spatial + (1) err_c") Sem.AStructure("lozenges = spatial + (1) err_l") Sem.AStructure("paragrap = (1) verbal + (1) err_p") Sem.
207 Simultaneous Factor Analysis for Several Groups Model B Here is a program for fitting Model B, in which some parameters are identically named so that they are constrained to be equal. The program is saved as Ex12-b.vb. Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.Standardized() Sem.Smc() Sem.BeginGroup(Sem.AmosDir & "Examples\Grnt_fem.sav") Sem.GroupName("Girls") Sem.AStructure("visperc = (1) spatial + (1) err_v") Sem.AStructure("cubes = (cube_s) spatial + (1) err_c") Sem.
Example 13 Estimating and Testing Hypotheses about Means Introduction This example demonstrates how to estimate means and how to test hypotheses about means. In large samples, the method demonstrated is equivalent to multivariate analysis of variance. Means and Intercept Modeling Amos and similar programs are usually used to estimate variances, covariances, and regression weights, and to test hypotheses about those parameters.
210 Example 13 About the Data For this example, we will be using Attig’s (1983) memory data, which was described in Example 1. We will use data from both young and old subjects. The raw data for the two groups are contained in the Microsoft Excel workbook UserGuide.xls, in the Attg_yng and Attg_old worksheets. In this example, we will be using only the measures recall1 and cued1.
211 Estimating and Testing Hypotheses about Means E Select Estimate means and intercepts. Now the path diagram looks like this (the same path diagram for each group): ,var_rec ,var_cue recall1 cued1 cov_rc The path diagram now shows a mean, variance pair of parameters for each exogenous variable. There are no endogenous variables in this model and hence no intercepts. For each variable in the path diagram, there is a comma followed by the name of a variance.
212 Example 13 The behavior of Amos Graphics changes in several ways when you select (put a check mark next to) Estimate means and intercepts: Mean and intercept fields appear on the Parameters tab in the Object Properties dialog box. Constraints can be applied to means and intercepts as well as regression weights, variances, and covariances. From the menus, choosing Analyze > Calculate Estimates estimates means and intercepts—subject to constraints, if any.
213 Estimating and Testing Hypotheses about Means
When means are being estimated, each group contributes two means, two variances, and one covariance, for five sample moments per sample. So, taking both samples together, there are 10 sample moments. As for the parameters to be estimated, there are seven of them, namely var_rec (the variance of recall1), var_cue (the variance of cued1), cov_rc (the covariance between recall1 and cued1), the means of recall1 among young and old people (2), and the means of cued1 among young and old people (2).
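When means are estimated, the moment count for each group can also be obtained from the general formula for p observed variables (a sketch of the arithmetic):

\frac{p(p+3)}{2} = \frac{2 \times 5}{2} = 5 \text{ moments per group}, \qquad 5 + 5 = 10, \qquad df = 10 - 7 = 3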
214 Example 13 Except for the means, these estimates are the same as those obtained in Example 10, Model B. The estimated standard errors and critical ratios are also the same. This demonstrates that merely estimating means, without placing any constraints on them, has no effect on the estimates of the remaining parameters or their standard errors. Graphics Output The path diagram output for the two groups follows. Each variable has a mean, variance pair displayed next to it.
215 Estimating and Testing Hypotheses about Means E You can enter either a numeric value or a name in the Mean text box. For now, type the name mn_rec. E Select All groups. (A check mark appears next to it. The effect of the check mark is to assign the name mn_rec to the mean of recall1 in every group, requiring the mean of recall1 to be the same for all groups.) E After giving the name mn_rec to the mean of recall1, follow the same steps to give the name mn_cue to the mean of cued1.
216 Example 13 Results for Model B With the new constraints on the means, Model B has five degrees of freedom. Computation of degrees of freedom (Default model) Number of distinct sample moments: Number of distinct parameters to be estimated: Degrees of freedom (10 - 5): 10 5 5 Model B has to be rejected at any conventional significance level. Chi-square = 19.267 Degrees of freedom = 5 Probability level = 0.
217 Estimating and Testing Hypotheses about Means automatically compute the difference in chi-square values as well as the p value for testing Model B against Model A. Mean Structure Modeling in VB.NET Model A Here is a program (Ex13-a.vb) for fitting Model A. The program keeps the variance and covariance restrictions that were used in Example 10, Model B, and, in addition, places constraints on the means. Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.ModelMeansAndIntercepts() Sem.
218 Example 13 exogenous variable has a mean of 0 unless you specify otherwise. You need to use the Model method once for each exogenous variable whose mean you want to estimate. It is easy to forget that Amos programs behave this way when you use ModelMeansAndIntercepts. Note: If you use the Sem.ModelMeansAndIntercepts method in an Amos program, then the Mean method must be called once for each exogenous variable whose mean you want to estimate.
219 Estimating and Testing Hypotheses about Means Fitting Multiple Models Both models A and B can be fitted by the following program. It is saved as Ex13-all.vb. Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.ModelMeansAndIntercepts() Sem.BeginGroup(Sem.AmosDir & "Examples\UserGuide.xls", "Attg_yng") Sem.GroupName("young subjects") Sem.AStructure("recall1 (var_rec)") Sem.AStructure("cued1 (var_cue)") Sem.AStructure("recall1 <> cued1 (cov_rc)") Sem.Mean("recall1", "yng_rec") Sem.
Example 14 Regression with an Explicit Intercept Introduction This example shows how to estimate the intercept in an ordinary regression analysis. Assumptions Made by Amos Ordinarily, when you specify that some variable depends linearly on some others, Amos assumes that the linear equation expressing the dependency contains an additive constant, or intercept, but does not estimate it.
222 Example 14 About the Data We will once again use the data of Warren, White, and Fuller (1974), first used in Example 4. We will use the Excel worksheet Warren5v in UserGuide.xls found in the Examples directory. Here are the sample moments (means, variances, and covariances): Specifying the Model You can specify the regression model exactly as you did in Example 4. In fact, if you have already worked through Example 4, you can use that path diagram as a starting point for this example.
223 Regression with an Explicit Intercept
[Path diagram: performance regressed on knowledge, value, and satisfaction, with an error variable whose mean is fixed at 0. Example 14, Job Performance of Farm Managers, Regression with an explicit intercept (Model Specification)]
Notice the string 0, displayed above the error variable. The 0 to the left of the comma indicates that the mean of the error variable is fixed at 0, a standard assumption in linear regression models.
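In equation form, the model being specified is the ordinary regression equation with an explicit additive constant (a sketch; a stands for the intercept and the b's for the regression weights to be estimated):

performance = a + b_1 \cdot knowledge + b_2 \cdot value + b_3 \cdot satisfaction + error, \qquad E(error) = 0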
224 Example 14 Computation of degrees of freedom (Default model) Number of distinct sample moments: Number of distinct parameters to be estimated: Degrees of freedom (14 - 14): 14 14 0 With 0 degrees of freedom, there is no hypothesis to be tested. Chi-square = 0.000 Degrees of freedom = 0 Probability level cannot be computed The estimates for regression weights, variances, and covariances are the same as in Example 4, and so are the associated standard error estimates, critical ratios, and p values.
225 Regression with an Explicit Intercept
Graphics Output
Below is the path diagram that shows the unstandardized estimates for this example. The intercept of –0.83 appears just above the endogenous variable performance.
[Path diagram with unstandardized estimates, including the means and variances of knowledge, value, and satisfaction and the intercept for performance. Example 14, Job Performance of Farm Managers, Regression with an explicit intercept (Unstandardized estimates)]
Modeling in VB.NET
226 Example 14 The following program for the model of Example 14 gives all the same results, plus mean and intercept estimates. This program is saved as Ex14.vb. Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.Standardized() Sem.Smc() Sem.ImpliedMoments() Sem.SampleMoments() Sem.ModelMeansAndIntercepts() Sem.BeginGroup( _ Sem.AmosDir & "Examples\UserGuide.xls", "Warren5v") Sem.AStructure( _ "performance = () + knowledge + value + satisfaction + error (1)") Sem.Mean("knowledge") Sem.
227 Regression with an Explicit Intercept Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.Standardized() Sem.Smc() Sem.ImpliedMoments() Sem.SampleMoments() Sem.ModelMeansAndIntercepts() Sem.BeginGroup( _ Sem.AmosDir & "Examples\UserGuide.xls", "Warren5v") Sem.AStructure("performance <--- knowledge") Sem.AStructure("performance <--- value") Sem.AStructure("performance <--- satisfaction") Sem.AStructure("performance <--- error (1)") Sem.Intercept("performance") Sem.Mean("knowledge") Sem.
Example 15 Factor Analysis with Structured Means Introduction This example demonstrates how to estimate factor means in a common factor analysis of data from several populations. Factor Means Conventionally, the common factor analysis model does not make any assumptions about the means of any variables. In particular, the model makes no assumptions about the means of the common factors.
230 Example 15 The identification status of the factor analysis model is a difficult subject when estimating factor means. In fact, Sörbom’s accomplishment was to show how to constrain parameters so that the factor analysis model is identified and so that differences in factor means can be estimated. We will follow Sörbom’s guidelines for achieving model identification in the present example. About the Data We will use the Holzinger and Swineford (1939) data from Example 12.
231 Factor Analysis with Structured Means E In the Object Properties dialog box, click the Parameters tab. E Enter a parameter name, such as int_vis, in the Intercept text box. E Select All groups, so that the intercept is named int_vis in both groups. E Proceed in the same way to give names to the five remaining intercepts. As Sörbom showed, it is necessary to fix the factor means in one of the groups at a constant. We will fix the means of the boys’ spatial and verbal factors at 0.
232 Example 15
The boys' path diagram should look like this:
[Path diagram for the boys' sample with intercepts labeled int_vis through int_wrd, loadings labeled cube_s, lozn_s, sent_v, and word_v, and the means of spatial and verbal fixed at 0]
Understanding the Cross-Group Constraints
The cross-group constraints on intercepts and regression weights may or may not be satisfied in the populations.
233 Factor Analysis with Structured Means the difference between the boys’ mean and the girls’ mean will be the same, no matter which mean you fix and no matter what value you fix for it. Results for Model A Text Output There is no reason to reject Model A at any conventional significance level. Chi-square = 22.593 Degrees of freedom = 24 Probability level = 0.
234 Example 15
Here are the boys' estimates:
[Path diagram with unstandardized estimates for the boys' sample, including the intercepts and the factor means fixed at 0]
Girls have an estimated mean spatial ability of –1.07. We fixed the mean of boys' spatial ability at 0. Thus, girls' mean spatial ability is estimated to be 1.07 units below the boys' mean.
235 Factor Analysis with Structured Means
Here are the girls' factor mean estimates from the text output:
Means: (Girls - Default model)
           Estimate    S.E.    C.R.      P   Label
spatial      -1.066    .881  -1.209   .226   mn_s
verbal         .956    .521   1.836   .066   mn_v
The girls' mean spatial ability has a critical ratio of –1.209 and is not significantly different from 0 ( p = 0.226 ). In other words, it is not significantly different from the boys' mean. Turning to verbal ability, the girls' mean is estimated to be 0.956 units above the boys' mean.
236 Example 15 E Click New. E Type Model B in the Model Name text box. E Type the constraints mn_s = 0 and mn_v = 0 in the Parameter Constraints text box. E Click Close. Now when you choose Analyze > Calculate Estimates, Amos will fit both Model A and Model B. The file Ex15-all.amw contains this two-model setup.
237 Factor Analysis with Structured Means Results for Model B If we did not have Model A as a basis for comparison, we would now accept Model B, using any conventional significance level. Chi-square = 30.624 Degrees of freedom = 26 Probability level = 0.243 Comparing Models A and B An alternative test of Model B can be obtained by assuming that Model A is correct and testing whether Model B fits significantly worse than Model A. A chi-square test for this comparison is given in the text output.
238 Example 15 Modeling in VB.NET Model A The following program fits Model A. It is saved as Ex15-a.vb. Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.Standardized() Sem.Smc() Sem.ModelMeansAndIntercepts() Sem.BeginGroup(Sem.AmosDir & "Examples\Grnt_fem.sav") Sem.GroupName("Girls") Sem.AStructure("visperc = (int_vis) + (1) spatial + (1) err_v") Sem.AStructure("cubes = (int_cub) + (cube_s) spatial + (1) err_c") Sem.AStructure("lozenges = (int_loz) + (lozn_s) spatial + (1) err_l") Sem.
239 Factor Analysis with Structured Means Model B The following program fits Model B. In this model, the factor means are fixed at 0 for both boys and girls. The program is saved as Ex15-b.vb. Sub Main() Dim Sem As New AmosEngine Try Dim dataFile As String = Sem.AmosDir & "Examples\userguide.xls" Sem.TextOutput() Sem.Standardized() Sem.Smc() Sem.ModelMeansAndIntercepts() Sem.BeginGroup(dataFile, "grnt_fem") Sem.GroupName("Girls") Sem.AStructure("visperc = (int_vis) + (1) spatial + (1) err_v") Sem.
240 Example 15 Fitting Multiple Models The following program (Ex15-all.vb) fits both models A and B. Sub Main() Dim Sem As New AmosEngine Try Sem.TextOutput() Sem.Standardized() Sem.Smc() Sem.ModelMeansAndIntercepts() Sem.BeginGroup(Sem.AmosDir & "Examples\Grnt_fem.sav") Sem.GroupName("Girls") Sem.AStructure("visperc = (int_vis) + (1) spatial + (1) err_v") Sem.AStructure("cubes = (int_cub) + (cube_s) spatial + (1) err_c") Sem.AStructure("lozenges = (int_loz) + (lozn_s) spatial + (1) err_l") Sem.
Example 16 Sörbom’s Alternative to Analysis of Covariance Introduction This example demonstrates latent structural equation modeling with longitudinal observations in two or more groups, models that generalize traditional analysis of covariance techniques by incorporating latent variables and autocorrelated residuals (compare to Sörbom, 1978), and how assumptions employed in traditional analysis of covariance can be tested.
242 Example 16 About the Data We will again use the Olsson (1973) data introduced in Example 9. The sample means, variances, and covariances from the 108 experimental subjects are in the Microsoft Excel worksheet Olss_exp in the workbook UserGuide.xls. The sample means, variances, and covariances from the 105 control subjects are in the worksheet Olss_cnt. Both datasets contain the customary unbiased estimates of variances and covariances.
243 Sörbom’s Alternative to Analysis of Covariance Changing the Default Behavior E From the menus, choose View > Analysis Properties. E In the Analysis Properties dialog box, click the Bias tab. The default setting used by Amos yields results that are consistent with missing data modeling (discussed in Example 17 and Example 18).
244 Example 16
[Path diagram for Model A, control group, with regression weights labeled a_syn1, a_opp1, a_syn2, a_opp2, opp_v1, and opp_v2. Example 16: Model A, An alternative to ANCOVA, Olsson (1973): control condition]
245 Sörbom’s Alternative to Analysis of Covariance
The latent variable pre_verbal is interpreted as verbal ability at the beginning of the study, and post_verbal is interpreted as verbal ability at the conclusion of the study. This is Sörbom’s measurement model. The structural model specifies that post_verbal depends linearly on pre_verbal. The labels opp_v1 and opp_v2 require the regression weights in the measurement model to be the same for both groups.
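In equation form, the structural part of the model is a single linear equation (a sketch; α and β stand for the intercept and regression weight, and ζ is the structural residual):

post_verbal = \alpha + \beta \cdot pre_verbal + \zeta

In the control group the intercept is fixed at 0, while in the experimental group it is a free parameter; later in this example that intercept (labeled effect) carries the comparison between the treated and untreated groups.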
246 Example 16 We also get the following message that provides further evidence that Model A is wrong: The following variances are negative. (control - Default model) zeta -2.868 Can we modify Model A so that it will fit the data while still permitting a meaningful comparison of the experimental and control groups? It will be helpful here to repeat the analysis and request modification indices. To obtain modification indices: E From the menus, choose View > Analysis Properties.
247 Sörbom’s Alternative to Analysis of Covariance Model B The largest modification index obtained with Model A suggests adding a covariance between eps2 and eps4 in the experimental group. The modification index indicates that the chi-square statistic will drop by at least 10.508 if eps2 and eps4 are allowed to have a nonzero covariance. The parameter change statistic of 4.700 indicates that the covariance estimate will be positive if it is allowed to take on any value.
248 Example 16
For Model B, the path diagram for the control group is:
[Path diagram for Model B, control group. Example 16: Model B, An alternative to ANCOVA, Olsson (1973): control condition]
249 Sörbom’s Alternative to Analysis of Covariance
For the experimental group, the path diagram is:
[Path diagram for Model B, experimental group, with the mean of pre_verbal labeled pre_diff and the intercept of post_verbal labeled effect. Example 16: Model B, An alternative to ANCOVA, Olsson (1973): experimental condition, Model Specification]
Results for Model B
In moving from Model A to Model B, the chi-square statistic dropped by 17.
250 Example 16
Modification Indices (control - Default model)
Covariances: (control - Default model)
                    M.I.   Par Change
eps2 <--> eps4     4.727        2.141
eps1 <--> eps4     4.086       -2.384
(The Variances, Regression Weights, Means, and Intercepts tables for the control group contain no indices above the threshold.)
The largest modification index (4.727) suggests allowing eps2 and eps4 to be correlated in the control group.
251 Sörbom’s Alternative to Analysis of Covariance
Results for Model C
Finally, we have a model that fits.
Chi-square = 2.797
Degrees of freedom = 4
Probability level = 0.592
From the point of view of statistical goodness of fit, there is no reason to reject Model C. It is also worth noting that all the variance estimates are positive. The following are the parameter estimates for the 105 control subjects:
[Path diagram with unstandardized estimates for the control group]
252 Example 16
[Path diagram with unstandardized estimates for the experimental group. Example 16: Model C, An alternative to ANCOVA, Olsson (1973): experimental condition, Unstandardized estimates]
Most of these parameter estimates are not very interesting, although you may want to check and make sure that the estimates are reasonable.
[Path diagram: Example 16, Model D. An alternative to ANCOVA. Olsson (1973): experimental condition.]
Testing Model D against Model C gives a chi-square value of 1.179 (= 3.976 – 2.797) with 1 (that is, 5 – 4) degree of freedom. Again, you would accept the hypothesis of equal regression weights (Model D). With equal regression weights, the comparison of treated and untreated subjects now turns on the difference between their intercepts. Here are the parameter estimates for the 105 control subjects:

[Path diagram with unstandardized estimates for the control group]
intercept for the experimental group is significantly different from the intercept for the control group (which is fixed at 0).

Model E

Another way of testing the difference in post_verbal intercepts for significance is to repeat the Model D analysis with the additional constraint that the intercept be equal across groups.
Comparison of Sörbom’s Method with the Method of Example 9

Sörbom’s alternative to analysis of covariance is more difficult to apply than the method of Example 9. On the other hand, Sörbom’s method is superior to the method of Example 9 because it is more general. That is, you can duplicate the method of Example 9 by using Sörbom’s method with suitable parameter constraints. We end this example with three additional models called X, Y, and Z.
The following is the path diagram for Model X for the control group:

[Path diagram: Example 16, Model X. Group-invariant covariance structure. Olsson (1973): control condition. Model Specification]

The path diagram for the experimental group is identical.
258 Example 16 Apart from the correlation between eps2 and eps4, Model D required that eps1, eps2, eps3, eps4, and zeta be uncorrelated among themselves and with every other exogenous variable. These new constraints amount to requiring that the variances and covariances of all exogenous variables be the same for both groups.
Here is the path diagram for the control group:

[Path diagram: Example 16, Model Y. An alternative to ANCOVA. Olsson (1973): control condition. Model Specification]

Results for Model Y

We must reject Model Y.

Chi-square = 31.816
Degrees of freedom = 12
260 Example 16 covariances of the exogenous variables) imply the assumptions of Model X (equal covariances for the observed variables). Models X and Y are therefore nested models, and it is possible to carry out a conditional test of Model Y under the assumption that Model X is true. Of course, it will make sense to do that test only if Model X really is true, and we have already concluded it is not. Nevertheless, let’s go through the motions of testing Model Y against Model X.
Here is the path diagram for Model Z for the experimental group:

[Path diagram: Example 16, Model Z. An alternative to ANCOVA. Olsson (1973): experimental condition.]
Model Z also has to be rejected when compared to Model Y (χ2 = 84.280 – 31.816 = 52.464, df = 13 – 12 = 1). Within rounding error, this is the same difference in chi-square values and degrees of freedom as in Example 9, when Model C was compared to Model B.

Modeling in VB.NET

Model A

The following program fits Model A. It is saved as Ex16-a.vb.

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Dim dataFile As String = Sem.AmosDir & "Examples\UserGuide.xls"
        Sem.TextOutput()
        Sem.Mods(4)
        Sem.
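The listing above breaks off early. As a rough orientation, here is a hedged sketch of how a program of this form is typically completed. The first two equation strings are the ones that appear verbatim in the Model C listing later in this example; the remaining lines follow the same pattern using the parameter labels shown in the path diagrams; and the structural line, its label b1, the experimental-group worksheet name, and the closing FitModel/Dispose calls are illustrative assumptions, not a verbatim copy of Ex16-a.vb:

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Dim dataFile As String = Sem.AmosDir & "Examples\UserGuide.xls"
        Sem.TextOutput()
        Sem.Mods(4)
        Sem.Standardized()
        Sem.Smc()
        Sem.ModelMeansAndIntercepts()

        ' Control group: measurement model plus the structural regression
        Sem.BeginGroup(dataFile, "Olss_cnt")
        Sem.GroupName("control")
        Sem.AStructure("pre_syn = (a_syn1) + (1) pre_verbal + (1) eps1")
        Sem.AStructure("pre_opp = (a_opp1) + (opp_v1) pre_verbal + (1) eps2")
        Sem.AStructure("post_syn = (a_syn2) + (1) post_verbal + (1) eps3")
        Sem.AStructure("post_opp = (a_opp2) + (opp_v2) post_verbal + (1) eps4")
        ' Structural model: intercept fixed at 0 in the control group;
        ' the regression weight (the label b1 is illustrative) is left free
        Sem.AStructure("post_verbal = (0) + (b1) pre_verbal + (1) zeta")

        ' A second BeginGroup/GroupName block (reading the experimental
        ' worksheet, assumed here to be "Olss_exp") specifies the
        ' experimental group in the same way, with the factor mean
        ' (pre_diff) and the structural intercept (effect) left free.

        Sem.FitModel()
    Finally
        Sem.Dispose()
    End Try
End Sub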
Model B

To fit Model B, start with the program for Model A and add the line

Sem.AStructure("eps2 <---> eps4")

to the model specification for the experimental group. Here is the resulting program for Model B. It is saved as Ex16-b.vb.

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Dim dataFile As String = Sem.AmosDir & "Examples\UserGuide.xls"
        Sem.TextOutput()
        Sem.Mods(4)
        Sem.Standardized()
        Sem.Smc()
        Sem.ModelMeansAndIntercepts()
        Sem.
Model C

The following program fits Model C. The program is saved as Ex16-c.vb.

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Dim dataFile As String = Sem.AmosDir & "Examples\UserGuide.xls"
        Sem.TextOutput()
        Sem.Mods(4)
        Sem.Standardized()
        Sem.Smc()
        Sem.ModelMeansAndIntercepts()
        Sem.BeginGroup(dataFile, "Olss_cnt")
        Sem.GroupName("control")
        Sem.AStructure("pre_syn = (a_syn1) + (1) pre_verbal + (1) eps1")
        Sem.AStructure( _
            "pre_opp = (a_opp1) + (opp_v1) pre_verbal + (1) eps2")
        Sem.
Model D

The following program fits Model D. The program is saved as Ex16-d.vb.

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Dim dataFile As String = Sem.AmosDir & "Examples\UserGuide.xls"
        Sem.TextOutput()
        Sem.Mods(4)
        Sem.Standardized()
        Sem.Smc()
        Sem.ModelMeansAndIntercepts()
        Sem.BeginGroup(dataFile, "Olss_cnt")
        Sem.GroupName("control")
        Sem.AStructure("pre_syn = (a_syn1) + (1) pre_verbal + (1) eps1")
        Sem.
Model E

The following program fits Model E. The program is saved as Ex16-e.vb.

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Dim dataFile As String = Sem.AmosDir & "Examples\UserGuide.xls"
        Sem.TextOutput()
        Sem.Mods(4)
        Sem.Standardized()
        Sem.Smc()
        Sem.ModelMeansAndIntercepts()
        Sem.BeginGroup(dataFile, "Olss_cnt")
        Sem.GroupName("control")
        Sem.AStructure("pre_syn = (a_syn1) + (1) pre_verbal + (1) eps1")
        Sem.AStructure( _
            "pre_opp = (a_opp1) + (opp_v1) pre_verbal + (1) eps2")
        Sem.
Fitting Multiple Models

The following program fits all five models, A through E. The program is saved as Ex16-a2e.vb.

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Dim dataFile As String = Sem.AmosDir & "Examples\UserGuide.xls"
        Sem.TextOutput()
        Sem.Mods(4)
        Sem.Standardized()
        Sem.Smc()
        Sem.ModelMeansAndIntercepts()
        Sem.BeginGroup(dataFile, "Olss_cnt")
        Sem.GroupName("control")
        Sem.AStructure("pre_syn = (a_syn1) + (1) pre_verbal + (1) eps1")
        Sem.
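The part of Ex16-a2e.vb that distinguishes the five models is not visible above. In general, a multiple-model Amos program labels the parameters that vary between models and then defines each model as a set of constraints on those labels with the Model method, finishing with FitAllModels. The fragment below only illustrates that pattern; the label names (cov2_4_cnt, cov2_4_exp, b_cnt, b_exp, effect) are hypothetical placeholders rather than the labels used in the actual file:

        ' Model A: no eps2-eps4 covariance in either group
        Sem.Model("Model A", "cov2_4_cnt = 0;cov2_4_exp = 0")
        ' Model B: eps2-eps4 covariance free in the experimental group only
        Sem.Model("Model B", "cov2_4_cnt = 0")
        ' Model C: eps2-eps4 covariance free in both groups (no constraints)
        Sem.Model("Model C")
        ' Model D: Model C plus equal structural regression weights
        Sem.Model("Model D", "b_cnt = b_exp")
        ' Model E: Model D plus a zero treatment effect
        Sem.Model("Model E", "b_cnt = b_exp;effect = 0")
        Sem.FitAllModels()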
Models X, Y, and Z

VB.NET programs for Models X, Y, and Z will not be discussed here. The programs can be found in the files Ex16-x.vb, Ex16-y.vb, and Ex16-z.vb.
Example 17 Missing Data Introduction This example demonstrates the analysis of a dataset in which some values are missing. Incomplete Data It often happens that data values that were anticipated in the design of a study fail to materialize. Perhaps a subject failed to participate in part of a study. Or maybe a person filling out a questionnaire skipped a couple of questions.
270 Example 17 exclude only persons whose incomes you do not know. Similarly, in computing the sample covariance between age and income, you would exclude an observation only if age is missing or if income is missing. This approach to missing data is sometimes called pairwise deletion. A third approach is data imputation, replacing the missing values with some kind of guess, and then proceeding with a conventional analysis appropriate for complete data.
271 Missing Data The resulting dataset is in the SPSS Statistics file Grant_x.sav. Below are the first few cases in that file. A period (.) represents a missing value. Amos recognizes the periods in SPSS Statistics datasets and treats them as missing data. Amos recognizes missing data in many other data formats as well. For instance, in an ASCII dataset, two consecutive delimiters indicate a missing value.
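To illustrate, a hypothetical fragment of a comma-delimited text file holding the first three of these variables might look like this, where an empty field between two commas (or a trailing comma) marks a missing score; the values shown are made up for illustration only:

visperc,cubes,lozenges
33,,17
,24,10
28,21,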
[Path diagram: Example 17, Model A. Factor analysis with missing data. Holzinger and Swineford (1939): Girls' sample. Model Specification]

After specifying the data file to be Grant_x.sav and drawing the above path diagram:
E From the menus, choose View > Analysis Properties.
E In the Analysis Properties dialog box, click the Estimation tab.
273 Missing Data large number of parameters. In addition, some missing data value patterns can make it impossible in principle to fit the saturated model even if it is possible to fit your model. With incomplete data, Amos Graphics tries to fit the saturated and independence models in addition to your model. If Amos fails to fit the independence model, then fit measures that depend on the fit of the independence model, such as CFI, cannot be computed.
Regression Weights: (Group number 1 - Default model)
                              Estimate    S.E.    C.R.      P    Label
visperc  <--- spatial            1.000
cubes    <--- spatial             .511    .153   3.347    ***
lozenges <--- spatial            1.047    .316   3.317    ***
paragrap <--- verbal             1.000
sentence <--- verbal             1.259    .194   6.505    ***
wordmean <--- verbal             2.140    .326   6.572    ***

Intercepts: (Group number 1 - Default model)
              Estimate    S.E.
visperc         28.885    .913
cubes           24.998    .536
lozenges        15.153   1.133
wordmean        18.097   1.055
paragrap        10.987    .468
sentence        18.864    .636
Graphics Output

Here is the path diagram showing the standardized estimates and the squared multiple correlations for the endogenous variables:

[Path diagram with standardized estimates: Chi square = 11.547, df = 8, p = .173]
This section outlines three steps necessary for computing the likelihood ratio chi-square statistic:
E Fitting the factor model
E Fitting the saturated model
E Computing the likelihood ratio chi-square statistic and its p value

First, the three steps are performed by three separate programs. After that, the three steps will be combined into a single program.

Fitting the Factor Model (Model A)

The following program fits the confirmatory factor model (Model A). It is saved as Ex17-a.vb.
intercepts do not have to appear in the model unless you want to estimate them or constrain them. The fit of Model A is summarized as follows:

Function of log likelihood = 1375.133
Number of parameters = 19

The Function of log likelihood value is displayed instead of the chi-square fit statistic that you get with complete data.
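As the rest of this example shows, a likelihood-ratio chi-square statistic is obtained only by comparing this value with the corresponding value for a less constrained model, here the saturated model. In symbols, writing F for the Function of log likelihood value and p for the number of parameters of each model:

\chi^{2} \;=\; F_{\text{factor model}} \;-\; F_{\text{saturated model}},
\qquad
df \;=\; p_{\text{saturated model}} \;-\; p_{\text{factor model}}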
The following program fits the saturated model (Model B). The program is saved as Ex17-b.vb.

Sub Main()
    Dim Saturated As New AmosEngine
    Try
        'Set up and estimate Saturated model:
        Saturated.Title("Example 17 b: Saturated Model")
        Saturated.TextOutput()
        Saturated.AllImpliedMoments()
        Saturated.ModelMeansAndIntercepts()
        Saturated.BeginGroup(Saturated.AmosDir & "Examples\Grant_x.sav")
        Saturated.Mean("visperc")
        Saturated.Mean("cubes")
        Saturated.Mean("lozenges")
        Saturated.Mean("paragrap")
        Saturated.
The following are the unstandardized parameter estimates for the saturated Model B:

Means: (Group number 1 - Model 1)
            Estimate    S.E.     C.R.      P    Label
visperc       28.883    .910   31.756    ***
cubes         25.154    .540   46.592    ***
lozenges      14.962   1.101   13.591    ***
paragrap      10.976    .466   23.572    ***
sentence      18.802    .632   29.730    ***
wordmean      18.263   1.061   17.211    ***

Covariances: (Group number 1 - Model 1)
                              Estimate    S.E.    C.R.
visperc  <--> cubes             17.484   4.614   3.789
visperc  <--> lozenges          31.173   9.232   3.
The AllImpliedMoments method in the program displays the following table of estimates:

Implied (for all variables) Covariances (Group number 1 - Model 1)
            wordmean   sentence   paragrap   lozenges      cubes
wordmean      73.974
sentence      29.577     25.007
paragrap      23.616     13.470     13.570
lozenges      29.655     10.544      9.287     67.901
cubes          3.470      1.678      2.739     17.036     16.484
visperc       14.665     14.382      8.453     31.173     17.484

Implied (for all variables) Means (Group number 1 - Model 1)
            wordmean   sentence   paragrap   lozenges      cubes
              18.263    18.
Computing the Likelihood Ratio Chi-Square Statistic and P

Instead of consulting a chi-square table, you can use the ChiSquareProbability method to find the probability that a chi-square value as large as 11.547 would have occurred with a correct factor model. The following program shows how the ChiSquareProbability method is used. The program is saved as Ex17-c.vb.

Sub Main()
    Dim ChiSquare As Double, P As Double
    Dim Df As Integer
    ChiSquare = 1375.133 - 1363.
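The listing is cut off above. Because the text output reported a Function of log likelihood of 1375.133 for the factor model and a chi-square of 11.547 with 8 degrees of freedom, the missing pieces can be reconstructed: the saturated model's value must be 1363.586, and the degrees of freedom are the difference in parameter counts (27 - 19). The following is a hedged sketch of the complete computation; the exact way ChiSquareProbability is invoked and the Debug.WriteLine reporting are assumptions, not a verbatim copy of Ex17-c.vb:

Sub Main()
    Dim ChiSquare As Double, P As Double
    Dim Df As Integer

    ' Difference between the two Function of log likelihood values
    ChiSquare = 1375.133 - 1363.586    ' = 11.547
    ' Difference between the two parameter counts (saturated minus factor model)
    Df = 27 - 19                       ' = 8

    ' Upper-tail probability of a chi-square distribution with Df degrees
    ' of freedom (assumed call form; see Ex17-c.vb for the exact statement)
    P = AmosEngine.ChiSquareProbability(ChiSquare, CDbl(Df))

    Debug.WriteLine("Chi square = " & ChiSquare)
    Debug.WriteLine("Df = " & Df)
    Debug.WriteLine("P = " & P)
End Sub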
282 Example 17 The p value is 0.173; therefore, we accept the hypothesis that Model A is correct at the 0.05 level. As the present example illustrates, in order to test a model with incomplete data, you have to compare its fit to that of another, alternative model. In this example, we wanted to test Model A, and it was necessary also to fit Model B as a standard against which Model A could be compared. The alternative model has to meet two requirements. First, you have to be satisfied that it is correct.
Example 18 More about Missing Data Introduction This example demonstrates the analysis of data in which some values are missing by design and then explores the benefits of intentionally collecting incomplete data. Missing Data Researchers do not ordinarily like missing data. They typically take great care to avoid these gaps whenever possible. But sometimes it is actually better not to observe every variable on every occasion.
284 Example 18 About the Data For this example, the Attig data (introduced in Example 1) was modified by eliminating some of the data values and treating them as missing. A portion of the modified data file for young people, Atty_mis.sav, is shown below as it appears in the SPSS Statistics Data Editor. The file contains scores of Attig’s 40 young subjects on the two vocabulary tests v_short and vocab. The variable vocab is the WAIS vocabulary score.
285 More about Missing Data Of course, no sensible person deletes data that have already been collected. In order for this example to make sense, imagine this pattern of missing data arising in the following circumstances. Suppose that vocab is the best vocabulary test you know of. It is highly reliable and valid, and it is the vocabulary test that you want to use. Unfortunately, it is an expensive test to administer.
286 Example 18 E In the Analysis Properties dialog box, click the Estimation tab. E Select Estimate means and intercepts (a check mark appears next to it). E While the Analysis Properties dialog box is open, click the Output tab. E Select Standardized estimates and Critical ratios for differences. Because this example focuses on group differences in the mean of vocab, it will be useful to have names for the mean of the young group and the mean of the old group.
Results for Model A

Graphics Output

Here are the two path diagrams containing means, variances, and covariances for the young and old subjects respectively:

[Path diagram for young subjects: vocab mean 56.89, variance 83.32; v_short mean 7.95, variance 15.35; covariance 32.92. Example 18: Model A. Incompletely observed data. Attig (1983) young subjects. Unstandardized estimates]

[Path diagram for old subjects: vocab mean 65.00, variance 115.06; v_short mean 10.03, variance 10.77; covariance 31.54. Example 18: Model A. Incompletely observed data.]
The parameter estimates and standard errors for young subjects are:

Means: (young subjects - Default model)
           Estimate    S.E.     C.R.      P    Label
vocab        56.891   1.765   32.232    ***    m1_yng
v_short       7.950    .627   12.673    ***    par_4

Covariances: (young subjects - Default model)
                        Estimate    S.E.    C.R.      P    Label
vocab <--> v_short        32.916   8.694   3.786    ***    par_3

Correlations: (young subjects - Default model)
                        Estimate
vocab <--> v_short          .920

Variances: (young subjects - Default model)
           Estimate
vocab         83.
289 More about Missing Data 62, is about 4.21. Although the standard errors just mentioned are only approximations, they still provide a rough basis for comparison. In the case of the young subjects, using the information contained in the v_short scores reduces the standard error of the estimated vocab mean by about 21%. In the case of the old subjects, the standard error was reduced by about 49%.
290 Example 18 The first two rows and columns, labeled m1_yng and m1_old, refer to the group means of the vocab test. The critical ratio for the mean difference is 2.901, according to which the means differ significantly at the 0.05 level; the older population scores higher on the long test than the younger population. Another test of the hypothesis of equal vocab group means can be obtained by refitting the model with equality constraints imposed on these two means. We will do that next.
291 More about Missing Data E To specify Model B, click New. E In the Model Name text box, change Model Number 2 to Model B. E Type m1_old = m1_yng in the Parameter Constraints text box. E Click Close. A path diagram that fits both Model A and Model B is saved in the file Ex18-b.amw. Output from Models A and B E To see fit measures for both Model A and Model B, click Model Fit in the tree diagram in the upper left pane of the Amos Output window.
Modeling in VB.NET

Model A

The following program fits Model A. It estimates means, variances, and covariances of both vocabulary tests in both groups of subjects, without constraints. The program is saved as Ex18-a.vb.

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Crdiff()
        Sem.ModelMeansAndIntercepts()
        Sem.BeginGroup(Sem.AmosDir & "Examples\atty_mis.sav")
        Sem.GroupName("young_subjects")
        Sem.Mean("vocab", "m1_yng")
        Sem.Mean("v_short")
        Sem.BeginGroup(Sem.
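The listing breaks off at the second BeginGroup call. A hedged sketch of how the old-subjects group is presumably specified follows; the file name Atto_mis.sav and the closing FitModel/Dispose calls are assumptions, while m1_old is the name this example uses elsewhere for the old group's vocab mean:

        Sem.BeginGroup(Sem.AmosDir & "Examples\Atto_mis.sav")   ' assumed file name
        Sem.GroupName("old_subjects")
        Sem.Mean("vocab", "m1_old")
        Sem.Mean("v_short")
        Sem.FitModel()
    Finally
        Sem.Dispose()
    End Try
End Sub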
Model B

Here is a program for fitting Model B. In this program, the same parameter name (mn_vocab) is used for the vocab mean of the young group as for the vocab mean of the old group. In this way, the young group and old group are required to have the same vocab mean. The program is saved as Ex18-b.vb.

Sub Main()
    Dim Sem As New AmosEngine
    Try
        Sem.TextOutput()
        Sem.Crdiff()
        Sem.ModelMeansAndIntercepts()
        Sem.BeginGroup(Sem.AmosDir & "Examples\atty_mis.sav")
        Sem.
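The only substantive difference from Model A is the shared mean label. A hedged sketch of the two group blocks (the old-subjects file name is again an assumption):

        ' Young subjects: vocab mean labeled mn_vocab
        Sem.GroupName("young_subjects")
        Sem.Mean("vocab", "mn_vocab")
        Sem.Mean("v_short")

        ' Old subjects: reusing the label mn_vocab forces the two vocab means to be equal
        Sem.BeginGroup(Sem.AmosDir & "Examples\Atto_mis.sav")   ' assumed file name
        Sem.GroupName("old_subjects")
        Sem.Mean("vocab", "mn_vocab")
        Sem.Mean("v_short")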
Example 19 Bootstrapping Introduction This example demonstrates how to obtain robust standard error estimates by the bootstrap method. The Bootstrap Method The bootstrap (Efron, 1982) is a versatile method for estimating the sampling distribution of parameter estimates. In particular, the bootstrap can be used to find approximate standard errors. As we saw in earlier examples, Amos automatically displays approximate standard errors for the parameters it estimates.
296 Example 19 The bootstrap has its own shortcomings, including the fact that it can require fairly large samples. For readers who are new to bootstrapping, we recommend the Scientific American article by Diaconis and Efron (1983). The present example demonstrates the bootstrap with a factor analysis model, but, of course, you can use the bootstrap with any model. Incidentally, don’t forget that Amos can solve simple estimation problems like the one in Example 1.
297 Bootstrapping E Click the Bootstrap tab. E Select Perform bootstrap. E Type 500 in the Number of bootstrap samples text box. Monitoring the Progress of the Bootstrap You can monitor the progress of the bootstrap algorithm by watching the Computation summary panel at the left of the path diagram. Results of the Analysis The model fit is, of course, the same as in Example 8. Chi-square = 7.853 Degrees of freedom = 8 Probability level = 0.448 The parameter estimates are also the same as in Example 8.
Regression Weights: (Group number 1 - Default model)
                              Estimate    S.E.    C.R.      P    Label
visperc  <--- spatial            1.000
cubes    <--- spatial             .610    .143   4.250    ***
lozenges <--- spatial            1.198    .272   4.405    ***
paragrap <--- verbal             1.000
sentence <--- verbal             1.334    .160   8.322    ***
wordmean <--- verbal             2.234    .263   8.482    ***

Standardized Regression Weights: (Group number 1 - Default model)
The bootstrap output begins with a table of diagnostic information that is similar to the following:

0 bootstrap samples were unused because of a singular covariance matrix.
0 bootstrap samples were unused because a solution was not found.
500 usable bootstrap samples were obtained.

It is possible that one or more bootstrap samples will have a singular covariance matrix, or that Amos will fail to find a solution for some bootstrap samples.
Scalar Estimates (Group number 1 - Default model)

Regression Weights: (Group number 1 - Default model)
Parameter                      SE    SE-SE    Mean    Bias   SE-Bias
visperc  <--- spatial        .000     .000   1.000    .000      .000
cubes    <--- spatial        .140     .004    .609   -.001      .006
lozenges <--- spatial        .373     .012   1.216    .018      .017
paragrap <--- verbal         .000     .000   1.000    .000      .000
sentence <--- verbal         .176     .006   1.345    .011      .008
wordmean <--- verbal         .254     .008   2.246    .011      .
301 Bootstrapping The first column, labeled S.E., contains bootstrap estimates of standard errors. These estimates may be compared to the approximate standard error estimates obtained by maximum likelihood. The second column, labeled S.E.-S.E., gives an approximate standard error for the bootstrap standard error estimate itself. The column labeled Mean represents the average parameter estimate computed across bootstrap samples.
Example 20 Bootstrapping for Model Comparison Introduction This example demonstrates the use of the bootstrap for model comparison. Bootstrap Approach to Model Comparison The problem addressed by this method is not that of evaluating an individual model in absolute terms but of choosing among two or more competing models. Bollen and Stine (1992), Bollen (1982), and Stine (1989) suggested the possibility of using the bootstrap for model selection in analysis of moment structures.
304 Example 20 About the Data The present example uses the combined male and female data from the Grant-White high school sample of the Holzinger and Swineford (1939) study, previously discussed in Examples 8, 12, 15, 17, and 19. The 145 combined observations are given in the file Grant.sav. Five Models Five measurement models will be fitted to the six psychological tests. Model 1 is a factor analysis model with one factor.
305 Bootstrapping for Model Comparison Model 2 is an unrestricted factor analysis with two factors. Note that fixing two of the regression weights at 0 does not constrain the model but serves only to make the model identified (Anderson, 1984; Bollen and Jöreskog, 1985; Jöreskog, 1979).
306 Example 20 The remaining two models provide customary points of reference for evaluating the fit of the previous models. In the saturated model, the variances and covariances of the observed variables are unconstrained.
307 Bootstrapping for Model Comparison You would not ordinarily fit the saturated and independence models separately, since Amos automatically reports fit measures for those two models in the course of every analysis. However, it is necessary to specify explicitly the saturated and independence models in order to get bootstrap results for those models. Five separate bootstrap analyses must be performed, one for each model. For each of the five analyses: E From the menus, choose View > Analysis Properties.
308 Example 20 E Click the Numerical tab and limit the number of iterations to a realistic figure (such as 40) in the Iteration limit field. Amos Graphics input files for the five models have been saved with the names Ex20-1.amw, Ex20-2.amw, Ex20-2r.amw, Ex20-sat.amw, and Ex20-ind.amw. Text Output E In viewing the text output for Model 1, click Summary of Bootstrap Iterations in the tree diagram in the upper left pane of the Amos Output window.
implied moments obtained from fitting Model 1 to the b-th bootstrap sample. Thus, CML(α̂b, a) is a measure of how much the population moments differ from the moments estimated from the b-th bootstrap sample using Model 1.

[Histogram: ML discrepancy (implied vs pop) (Default model). N = 1000, Mean = 64.162]
310 Example 20 fail for models that fit poorly. If some way could be found to successfully fit Model 2 to these 19 samples—for example, with hand-picked start values or a superior algorithm—it seems likely that the discrepancies would be large. According to this line of reasoning, discarding bootstrap samples for which estimation failed would lead to a downward bias in the mean discrepancy.
Example 21 Bootstrapping to Compare Estimation Methods Introduction This example demonstrates how bootstrapping can be used to choose among competing estimation criteria. Estimation Methods The discrepancy between the population moments and the moments implied by a model depends not only on the model but also on the estimation method. The technique used in Example 20 to compare models can be adapted to the comparison of estimation methods.
312 Example 21 About the Data The Holzinger-Swineford (1939) data from Example 20 (in the file Grant.sav) are used in the present example. About the Model The present example estimates the parameters of Model 2R from Example 20 by four alternative methods: Asymptotically distribution-free (ADF), maximum likelihood (ML), generalized least squares (GLS), and unweighted least squares (ULS). To compare the four estimation methods, you need to run Amos four times.
313 Bootstrapping to Compare Estimation Method s E Finally, click the Bootstrap tab. E Select Perform bootstrap and type 1000 for Number of bootstrap samples. E Select Bootstrap ADF, Bootstrap ML, Bootstrap GLS, and Bootstrap ULS.
314 Example 21 Selecting Bootstrap ADF, Bootstrap ML, Bootstrap GLS, Bootstrap SLS, and Bootstrap ULS specifies that each of CADF , CML, CGLS, and CULS is to be used to measure the discrepancy between the sample moments in the original sample and the implied moments from each bootstrap sample. To summarize, when you perform the analysis (Analyze > Calculate Estimates), Amos will fit the model to each of 1,000 bootstrap samples using the ADF discrepancy.
315 Bootstrapping to Compare Estimation Method s E Select the Maximum likelihood discrepancy to repeat the analysis. E Select the Generalized least squares discrepancy to repeat the analysis again. E Select the Unweighted least squares discrepancy to repeat the analysis one last time. The four Amos Graphics input files for this example are Ex21-adf.amw, Ex21-ml.amw, Ex21-gls.amw, and Ex21-uls.amw. Text Output In the first of the four analyses (as found in Ex21-adf.
[Histogram: ADF discrepancy (implied vs pop) (Default model). N = 1000, Mean = 20.601, S.e. = .218]
The following histogram shows the distribution of CGLS(α̂b, a). To view this histogram:
E Click Bootstrap Distributions > GLS Discrepancy (implied vs pop) in the tree diagram in the upper left pane of the Amos Output window.

[Histogram: GLS discrepancy (implied vs pop) (Default model). N = 1000, Mean = 21.827, S.e. = .263]
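The comparison that follows is based on averaging these discrepancies over bootstrap samples. A sketch in symbols, with B = 1000 bootstrap samples, α̂b the estimate obtained from the b-th bootstrap sample, and a the moments of the original sample, which play the role of the population moments:

\bar{C} \;=\; \frac{1}{B} \sum_{b=1}^{B} C\!\left(\hat{\alpha}_b,\, a\right)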
Below is a table showing the mean of C(α̂b, a) across 1,000 bootstrap samples, with the standard errors in parentheses. The four distributions just displayed are summarized in the first row of the table. The remaining three rows show the results of estimation by minimizing CML, CGLS, and CULS, respectively.

Population discrepancy for evaluation: C(α̂b, a)
Sample discrepancy for estimation: C(α̂b, ab)

CADF    20.60 (0.22)    19.19 (0.20)    19.45 (0.20)    24.89 (0.
Example 22 Specification Search Introduction This example takes you through two specification searches: one is largely confirmatory (with few optional arrows), and the other is largely exploratory (with many optional arrows). About the Data This example uses the Felson and Bohrnstedt (1979) girls’ data, also used in Example 7.
[Path diagram: Figure 22-1: Felson and Bohrnstedt’s model for girls]

Specification Search with Few Optional Arrows

Felson and Bohrnstedt were primarily interested in the two single-headed arrows, academic←attract and attract←academic. The question was whether one or both, or possibly neither, of the arrows was needed. For this reason, you will make both arrows optional during this specification search.
321 Specification Search E Click on the Specification Search toolbar, and then click the double-headed arrow that connects error1 and error2. The arrow changes color to indicate that the arrow is optional. Tip: If you want the optional arrow to be dashed as well as colored, as seen below, choose View → Interface Properties from the menus, click the Accessibility tab, and select the Alternative to color check box.
322 Example 22 When you perform the exploratory analysis later on, the program will treat the three colored arrows as optional and will try to fit the model using every possible subset of them. Selecting Program Options E Click the Options button on the Specification Search toolbar. E In the Options dialog box, click the Current results tab. E Click Reset to ensure that your options are the same as those used in this example. E Now click the Next search tab.
323 Specification Search Limiting the number of models reported can speed up a specification search significantly. However, only eight models in total will be encountered during the specification search for this example, and specifying a nonzero value for Retain only the best ___ models would have the undesirable side effect of inhibiting the program from normalizing Akaike weights and Bayes factors so that they sum to 1 across all models, as seen later. E Close the Options dialog box.
324 Example 22 The following table summarizes fit measures for the eight models and the saturated model: The Model column contains an arbitrary index number from 1 through 8 for each of the models fitted during the specification search. Sat identifies the saturated model. Looking at the first row, Model 1 has 19 parameters and 2 degrees of freedom. The discrepancy function (which in this case is the likelihood ratio chi-square statistic) is 2.761.
[Path diagram: Figure 22-2: Path diagram for Model 7]

Viewing Parameter Estimates for a Model

E Click on the Specification Search toolbar.
E In the Specification Search window, double-click the row for Model 7.

The drawing area displays the parameter estimates for Model 7.

[Path diagram with parameter estimates for Model 7]
326 Example 22 Using BCC to Compare Models E In the Specification Search window, click the column heading BCC0. The table sorts according to BCC so that the best model according to BCC (that is, the model with the smallest BCC) is at the top of the list. Based on a suggestion by Burnham and Anderson (1998), a constant has been added to all the BCC values so that the smallest BCC value is 0. The 0 subscript on BCC0 serves as a reminder of this rescaling.
Viewing the Akaike Weights

E Click the Options button on the Specification Search toolbar.
E In the Options dialog box, click the Current results tab.
E In the BCC, AIC, BIC group, select Akaike weights / Bayes factors (sum = 1).

In the table of fit measures, the column that was labeled BCC0 is now labeled BCCp and contains Akaike weights. (See Appendix G.)
328 Example 22 The Akaike weight has been interpreted (Akaike, 1978; Bozdogan, 1987; Burnham and Anderson, 1998) as the likelihood of the model given the data. With this interpretation, the estimated K-L best model (Model 7) is only about 2.4 times more likely (0.494 / 0.205 = 2.41) than Model 6.
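For reference, a sketch of how Akaike weights are obtained from the rescaled criterion values, following Burnham and Anderson (1998); here BCC0,i is model i's rescaled BCC (so the best model has BCC0 = 0) and the sum runs over all candidate models:

w_i \;=\; \frac{\exp\!\left(-\tfrac{1}{2}\,\mathrm{BCC}_{0,i}\right)}{\sum_{j} \exp\!\left(-\tfrac{1}{2}\,\mathrm{BCC}_{0,j}\right)}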
329 Specification Search E In the Specification Search window, click the column heading BIC0. The table is now sorted according to BIC so that the best model according to BIC (that is, the model with the smallest BIC) is at the top of the list. Model 7, with the smallest BIC, is the model with the highest approximate posterior probability (using equal prior probabilities for the models and using a particular prior distribution for the parameters of each separate model).
330 Example 22 In the table of fit measures, the column that was labeled BIC0 is now labeled BICp and contains Bayes factors scaled so that they sum to 1. With equal prior probabilities for the models and using a particular prior distribution of the parameters of each separate model (Raftery, 1995; Schwarz, 1978), BICp values are approximate posterior probabilities. Model 7 is the correct model with probability 0.860. One can be 99% sure that the correct model is among Models 7, 6, and 8 (0.860 + 0.
331 Specification Search Madigan and Raftery (1994) suggest that only models in Occam’s window be used for purposes of model averaging (a topic not discussed here). The symmetric Occam’s window is the subset of models obtained by excluding models that are much less probable (Madigan and Raftery suggest something like 20 times less probable) than the most probable model.
332 Example 22 Examining the Short List of Models E Click on the Specification Search toolbar. This displays a short list of models. In the figure below, the short list shows the best model for each number of parameters. It shows the best 16-parameter model, the best 17-parameter model, and so on. Notice that all criteria agree on the best model when the comparison is restricted to models with a fixed number of parameters.
333 Specification Search Viewing a Scatterplot of Fit and Complexity E Click on the Specification Search toolbar. This opens the Plot window, which displays the following graph: The graph shows a scatterplot of fit (measured by C) versus complexity (measured by the number of parameters) where each point represents a model.
334 Example 22 E Choose one of the models from the pop-up menu to see that model highlighted in the table of model fit statistics and, at the same time, to see the path diagram of that model in the drawing area. In the following figure, the cursor points to two overlapping points that represent Model 6 (with a discrepancy of 2.76) and Model 8 (with a discrepancy of 2.90). The graph contains a horizontal line representing points for which C is constant.
Adjusting the Line Representing Constant Fit

E Move your mouse over the adjustable line. When the pointer changes to a hand, drag the line so that NFI1 is equal to 0.900. (Keep an eye on NFI1 in the lower left panel while you reposition the adjustable line.)

NFI1 is the familiar form of the NFI statistic for which the baseline model requires the observed variables to be uncorrelated without constraining their means and variances. Points that are below the line have NFI1 > 0.900.
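As a reminder of what this statistic measures, a sketch in symbols, with Ĉ the discrepancy of the model being evaluated and Ĉb the discrepancy of the baseline model (the subscript on NFI1 indicates which baseline model supplies Ĉb):

\mathrm{NFI} \;=\; \frac{\hat{C}_b - \hat{C}}{\hat{C}_b} \;=\; 1 - \frac{\hat{C}}{\hat{C}_b}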
336 Example 22 Viewing the Line Representing Constant C – df E In the Plot window, select C – df in the Fit measure group. This displays the following: The scatterplot remains unchanged except for the position of the adjustable line. The adjustable line now contains points for which C – df is constant. Whereas the line was previously horizontal, it is now tilted downward, indicating that C – df gives some weight to complexity in assessing model adequacy.
337 Specification Search Appendix G). Initially, both CFI1 and CFI2 are equal to 1 for points on the adjustable line. When you move the adjustable line, the fit measures in the lower left panel change to reflect the changing position of the line. Adjusting the Line Representing Constant C – df E Drag the adjustable line so that CFI1 is equal to 0.950.
338 Example 22 Viewing Other Lines Representing Constant Fit E Click AIC, BCC, and BIC in turn. Notice that the slope of the adjustable line becomes increasingly negative. This reflects the fact that the five measures (C, C – df, AIC, BCC, and BIC) give increasing weight to model complexity. For each of these five measures, the adjustable line has constant slope, which you can confirm by dragging the line with the mouse.
339 Specification Search Each point in this graph represents a model for which C is less than or equal to that of any other model that has the same number of parameters. The graph shows that the best 16-parameter model has C = 67.342 , the best 17-parameter model has C = 3.071 , and so on. While Best fit is selected, the table of fit measures shows the best model for each number of parameters. This table appeared earlier on p. 332.
340 Example 22 BIC is the measure among C, C – df, AIC, BCC, and BIC that imposes the greatest penalty for complexity. The high penalty for complexity is reflected in the steep positive slope of the graph as the number of parameters increases beyond 17. The graph makes it clear that, according to BIC, the best 17-parameter model is superior to any other candidate model.
341 Specification Search E In the Fit measure group, select C. The Plot window displays the following graph: Figure 22-6: Scree plot for C In this scree plot, the point with coordinate 17 on the horizontal axis has coordinate 64.271 on the vertical axis. This represents the fact that the best 17-parameter model ( C = 3.071 ) fits better than the best 16-parameter model ( C = 67.342 ), with the difference being 67.342 – 3.071 = 64.271 .
342 Example 22 The figure on either p. 338 or p. 341 can be used to support a heuristic point of diminishing returns argument in favor of 17 parameters. There is this difference: In the best-fit graph (p. 338), one looks for an elbow in the graph, or a place where the slope changes from relatively steep to relatively flat. For the present problem, this occurs at 17 parameters, which can be taken as support for the best 17-parameter model. In the scree plot (p.
343 Specification Search For C – df, AIC, BCC, and BIC, the units and the origin of the vertical axis are different than for C, but the graphs are otherwise identical. This means that the final model selected by the scree test is independent of which measure of fit is used (unless C / df is used). This is the advantage of the scree plot over the best-fit plot demonstrated earlier in this example (see “Viewing the Best-Fit Graph for C” on p. 338, and “Viewing the Best-Fit Graph for Other Fit Measures” on p.
344 Example 22 Specification Search with Many Optional Arrows The previous specification search was largely confirmatory in that there were only three optional arrows. You can take a much more exploratory approach to constructing a model for the Felson and Bohrnstedt data. Suppose that your only hypothesis about the six measured variables is that academic depends on the other five variables, and attract depends on the other five variables.
Specifying the Model

E Open Ex22b.amw. If you performed a typical installation, the path will be C:\Program Files\IBM\SPSS\Amos\21\Examples\\Ex22b.amw.

Tip: If the last file you opened was in the Examples folder, you can open the file by double-clicking it in the Files list to the left of the drawing area.

[Path diagram: Felson and Bohrnstedt model for girls]

Making Some Arrows Optional

E From the menus, choose Analyze > Specification Search.
346 Example 22 This restores the default setting we altered earlier in this example. With the default setting, the program displays only the 10 best models according to whichever criterion you use for sorting the columns of the model list. This limitation is desirable now because of the large number of models that will be generated for this specification search. E Click the Current results tab. E In the BCC, AIC, BIC group, select Zero-based (min = 0).
347 Specification Search Using BIC to Compare Models E In the Specification Search window, click the BIC0 column heading. This sorts the table according to BIC0. Figure 22-8: The 10 best models according to BIC0 The sorted table shows that Model 22 is the best model according to BIC0. (Model numbers depend in part on the order in which the objects in the path diagram were drawn; therefore, if you draw your own path diagram, your model numbers may differ from the model numbers here.
348 Example 22 Viewing the Scree Plot E Click on the Specification Search toolbar. E In the Plot window, select Scree in the Plot type group. The scree plot strongly suggests that models with 15 parameters provide an optimum trade-off of model fit and parsimony. E Click the point with the horizontal coordinate 15. A pop-up appears that indicates the point represents Model 22, for which the change in chi-square is 46.22. E Click 22 (46.22) to display Model 22 in the drawing area.
Example 23 Exploratory Factor Analysis by Specification Search Introduction This example demonstrates exploratory factor analysis by means of a specification search. In this approach to exploratory factor analysis, any measured variable can (optionally) depend on any factor. A specification search is performed to find the subset of single-headed arrows that provides the optimum combination of simplicity and fit.
[Path diagram: Figure 23-1: Exploratory factor analysis model with two factors]

Specifying the Model

E Open the file Ex23.amw. If you performed a typical installation, the path will be C:\Program Files\IBM\SPSS\Amos\21\Examples\\Ex23.amw.

Initially, the path diagram appears as in Figure 23-1.
351 Exploratory Factor Analysis by Specification Search Making All Regression Weights Optional E Click on the Specification Search toolbar, and then click all the single-headed arrows in the path diagram.
352 Example 23 E Now click the Next search tab. Notice that the default value for Retain only the best ___ models is 10.
353 Exploratory Factor Analysis by Specification Search With this setting, the program will display only the 10 best models according to whichever criterion you use for sorting the columns of the model list. For example, if you click the column heading C / df, the table will show the 10 models with the smallest values of C / df, sorted according to C / df. Scatterplots will display only the 10 best 1-parameter models, the 10 best 2-parameter models, and so on.
354 Example 23 Using BCC to Compare Models E In the Specification Search window, click the column heading BCC0. The table sorts according to BCC so that the best model according to BCC (that is, the model with the smallest BCC) is at the top of the list. Figure 23-3: The 10 best models according to BCC0 The two best models according to BCC0 (Models 52 and 53) have identical fit measures (out to three decimal places anyway). The explanation for this can be seen from the path diagrams for the two models.
355 Exploratory Factor Analysis by Specification Search E To see the path diagram for Model 53, double-click its row.
356 Example 23 The occurrence of equivalent candidate models makes it unclear how to apply Bayesian calculations to select a model in this example. Similarly, it is unclear how to use Akaike weights. Furthermore, Burnham and Anderson’s guidelines (see p. 326) for the interpretation of BCC0 are based on reasoning about Akaike weights, so it is not clear whether those guidelines apply in the present example.
357 Exploratory Factor Analysis by Specification Search Viewing the Scree Plot E Click on the Specification Search toolbar. E In the Plot window, select Scree in the Plot type group. The scree plot strongly suggests the use of 13 parameters because of the way the graph drops abruptly and then levels off immediately after the 13th parameter. Click the point with coordinate 13 on the horizontal axis. A pop-up shows that the point represents Models 52 and 53, as shown in Figure 23-4 on p. 355.
Heuristic Specification Search

The number of models that must be fitted in an exhaustive specification search grows rapidly with the number of optional arrows. There are 12 optional arrows in Figure 23-2 on p. 351, so an exhaustive specification search requires fitting 2^12 = 4096 models. (The number of models will be somewhat smaller if you specify a small positive number for Retain only the best___models on the Next search tab of the Options dialog box.)
359 Exploratory Factor Analysis by Specification Search and Backward searches are alternated until one Forward or Backward search is completed with no improvement. Performing a Stepwise Search E Click the Options button on the Specification Search toolbar. E In the Options dialog box, click the Next search tab. E Select Stepwise. E On the Specification Search toolbar, click . The results in Figure 23-7 suggest examining the 13-parameter model, Model 7.
360 Example 23 Viewing the Scree Plot E Click on the Specification Search toolbar. E In the Plot window, select Scree in the Plot type group. The scree plot confirms that adding a 13th parameter provides a substantial reduction in discrepancy and that adding additional parameters beyond the 13th provides only slight reductions. Figure 23-8: Scree plot after stepwise specification search E Click the point in the scree plot with horizontal coordinate 13, as in Figure 23-8.
361 Exploratory Factor Analysis by Specification Search Limitations of Heuristic Specification Searches A heuristic specification search can fail to find any of the best models for a given number of parameters. In fact, the stepwise search in the present example did fail to find any of the best 11-parameter models. As Figure 23-7 on p. 359 shows, the best 11-parameter model found by the stepwise search had a discrepancy (C) of 97.475.
Example 24 Multiple-Group Factor Analysis Introduction This example demonstrates a two-group factor analysis with automatic specification of cross-group constraints. About the Data This example uses the Holzinger and Swineford girls’ and boys’ (1939) data from Examples 12 and 15. Model 24a: Modeling Without Means and Intercepts The presence of means and intercepts as explicit model parameters adds to the complexity of a multiple-group analysis.
[Path diagram: Figure 24-1: Two-factor model for girls and boys]

This is the same two-group factor analysis problem that was considered in Example 12. The results obtained in Example 12 will be obtained here automatically.

Specifying the Model

E From the menus, choose File > Open.
E In the Open dialog box, double-click the file Ex24a.amw.
[Figure 24-2: The Multiple-Group Analysis dialog box]

Most of the time, you will simply click OK. This time, however, let's take a look at some parts of the Multiple-Group Analysis dialog box. There are eight columns of check boxes. Check marks appear only in the columns labeled 1, 2, and 3. This means that the program will generate three models, each with a different set of cross-group constraints.
366 Example 24 subsets that appear in a black (that is, not gray) font are mutually exclusive and exhaustive, so that column 3 generates a model in which all parameters are constant across groups. In summary, columns 1 through 3 generate a hierarchy of models in which each model contains all the constraints of its predecessor. First, the factor loadings are held constant across groups. Then, the factor variances and covariances are held constant. Finally, the residual (unique) variances are held constant.
367 Multiple-Group Factor Analysis Viewing the Generated Models E In the Multiple-Group Analysis dialog box, click OK. The path diagram now shows names for all parameters. In the panel at the left of the path diagram, you can see that the program has generated three new models in addition to an Unconstrained model in which there are no cross-group constraints at all. Figure 24-3: Amos Graphics window after automatic constraints E Double-click XX: Measurement weights.
368 Example 24 Fitting All the Models and Viewing the Output E From the menus, choose Analyze > Calculate Estimates to fit all models. E From the menus, choose View > Text Output. E In the navigation tree of the output viewer, click the Model Fit node to expand it, and then click CMIN. The CMIN table shows the likelihood ratio chi-square statistic for each fitted model. The data do not depart significantly from any of the models.
Here is the CMIN table:

Model                       NPAR     CMIN    DF      P    CMIN/DF
Unconstrained                 26    16.48    16   0.42       1.03
Measurement weights           22    18.29    20   0.57       0.91
Structural covariances        19    22.04    23   0.52       0.96
Measurement residuals         13    26.02    29   0.62       0.90
Saturated model               42     0.00     0
Independence model            12   337.55    30   0.00      11.25

E In the navigation tree, click AIC under the Model Fit node.
370 Example 24 Model 24b: Comparing Factor Means Introducing explicit means and intercepts into a model raises additional questions about which cross-group parameter constraints should be tested, and in what order. This example shows how Amos constrains means and intercepts while fitting the factor analysis model in Figure 24-1 on p. 364 to data from separate groups of girls and boys. This is the same two-group factor analysis problem that was considered in Example 15.
371 Multiple-Group Factor Analysis Removing Constraints Initially, the factor means are fixed at 0 for both boys and girls. It is not possible to estimate factor means for both groups. However, Sörbom (1974) showed that, by fixing the factor means of a single group to constant values and placing suitable constraints on the regression weights and intercepts in a factor model, it is possible to obtain meaningful estimates of the factor means for all of the other groups.
Now that the constraints on the girls’ factor means have been removed, the girls’ and boys’ path diagrams look like this:

[Path diagram for the girls’ group, with the factor means now free to be estimated]

[Path diagram for the boys’ group, with the factor means still fixed at 0]

Tip: To switch between the two path diagrams in the drawing area, click Girls or Boys in the panel at the left of the drawing area.
The default settings, as shown above, will generate the following nested hierarchy of five models:

Model                  Constraints
Model 1 (column 1)     Measurement weights (factor loadings) are equal across groups.
Model 2 (column 2)     All of the above, and measurement intercepts (intercepts in the equations for predicting measured variables) are equal across groups.
Model 3 (column 3)
Model 4 (column 4)
Model 5 (column 5)
374 Example 24 cross-group constraints, and the Measurement weights model with factor loadings held equal across groups, are unidentified. Viewing the Output E From the menus, choose View > Text Output. E In the navigation tree of the output viewer, expand the Model Fit node. Some fit measures for the four automatically generated and identified models are shown here, along with fit measures for the saturated and independence models. E Click CMIN under the Model Fit node.
Assuming model Measurement intercepts to be correct, the following table shows that this chi-square difference is significant:

Model                     DF     CMIN      P    NFI Delta-1   IFI Delta-2   RFI rho-1   TLI rho2
Structural means           2    8.030   0.018      0.024         0.026        0.021       0.023
Structural covariances     5   11.787   0.038      0.035         0.038        0.022       0.024
Measurement residuals     11   15.865   0.146      0.047         0.051        0.014       0.
Example 25 Multiple-Group Analysis Introduction This example shows you how to automatically implement Sörbom’s alternative to analysis of covariance. Example 16 demonstrates the benefits of Sörbom’s approach to analysis of covariance with latent variables. Unfortunately, as Example 16 also showed, the Sörbom approach is difficult to apply, involving many steps. This example automatically obtains the same results as Example 16. About the Data The Olsson (1973) data from Example 16 will be used here.
[Path diagram: Figure 25-1: Sörbom model for Olsson data]

Specifying the Model

E Open Ex25.amw. If you performed a typical installation, the path will be C:\Program Files\IBM\SPSS\Amos\21\Examples\\Ex25.amw.

The path diagram is the same for the control and experimental groups and is shown in Figure 25-1. Some regression weights are fixed at 1.
379 Multiple-Group Analysis E In the drawing area, right-click pre_verbal and choose Object Properties from the pop- up menu. E In the Object Properties dialog box, click the Parameters tab. E In the Mean text box, type 0. E With the Object Properties dialog box still open, click post_verbal in the drawing area. E In the Intercept text box of the Object Properties dialog box, type 0. E Close the Object Properties dialog box.
E Click OK to generate the following nested hierarchy of eight models:

Model                  Constraints
Model 1 (column 1)     Measurement weights (factor loadings) are constant across groups.
Model 2 (column 2)     All of the above, and measurement intercepts (intercepts in the equations for predicting measured variables) are constant across groups.
Model 3 (column 3)
Model 4 (column 4)
Model 5 (column 5)
Model 6 (column 6)
Model 7 (column 7)
Model 8 (column 8)
381 Multiple-Group Analysis Fitting the Models E From the menus, choose Analyze > Calculate Estimates. The panel to the left of the path diagram shows that two models could not be fitted to the data. The two models that could not be fitted, the Unconstrained model and the Measurement weights model, are unidentified. Viewing the Text Output E From the menus, choose View > Text Output. E In the navigation tree of the output viewer, expand the Model Fit node, and click CMIN.
382 Example 25 There are many chi-square statistics in this table, but only two of them matter. The Sörbom procedure comes down to two basic questions. First, does the Structural weights model fit? This model specifies that the regression weight for predicting post_verbal from pre_verbal be constant across groups. If the Structural weights model is accepted, one follows up by asking whether the next model up the hierarchy, the Structural intercepts model, fits significantly worse.
E Now click experimental in the panel on the left.

As you can see in the following covariance table for the experimental group, there are four modification indices greater than 4:

                     M.I.   Par Change
eps2 <--> eps4      9.314        4.417
eps2 <--> eps3      9.393       -4.117
eps1 <--> eps4      8.513       -3.947
eps1 <--> eps3      6.192        3.110

Of these, only two modifications have an obvious theoretical justification: allowing eps2 to correlate with eps4, and allowing eps1 to correlate with eps3.
Model                       NPAR     CMIN    DF      P    CMIN/DF
Measurement intercepts        24    2.797     4   0.59      0.699
Structural weights            23    3.976     5   0.55      0.795
Structural intercepts         22   55.094     6   0.00      9.182
Structural means              21   63.792     7   0.00      9.113
Structural covariances        20   69.494     8   0.00      8.687
Structural residuals          19   83.194     9   0.00      9.244
Measurement residuals         14   93.197    14   0.00      6.657
Saturated model               28    0.000     0
Independence model            16  682.638    12   0.00     56.
Example 26 Bayesian Estimation Introduction This example demonstrates Bayesian estimation using Amos. Bayesian Estimation In maximum likelihood estimation and hypothesis testing, the true values of the model parameters are viewed as fixed but unknown, and the estimates of those parameters from a given sample are viewed as random but known.
386 Example 26 interpret. A good way to start is to plot the marginal posterior density for each parameter, one at a time. Often, especially with large data samples, the marginal posterior distributions for parameters tend to resemble normal distributions. The mean of a marginal posterior distribution, called a posterior mean, can be reported as a parameter estimate.
387 Bayesian Estimation Selecting Priors A prior distribution quantifies the researcher’s belief concerning where the unknown parameter may lie. Knowledge of how a variable is distributed in the population can sometimes be used to help researchers select reasonable priors for parameters of interest. Hox (2002) cites the example of a normed intelligence test with a mean of 100 units and a standard deviation of 15 units in the general population.
388 Example 26 Performing Bayesian Estimation Using Amos Graphics To illustrate Bayesian estimation using Amos Graphics, we revisit Example 3, which shows how to test the null hypothesis that the covariance between two variables is 0 by fixing the value of the covariance between age and vocabulary to 0. Estimating the Covariance The first thing we need to do for the present example is to remove the zero constraint on the covariance so that the covariance can be estimated. E Open Ex03.amw.
389 Bayesian Estimation This is the resulting path diagram (you can also find it in Ex26.amw): Results of Maximum Likelihood Analysis Before performing a Bayesian analysis of this model, we perform a maximum likelihood analysis for comparison purposes. E From the menus, choose Analyze > Calculate Estimates to display the following parameter estimates and standard errors:

Covariances: (Group number 1 - Default model)
                        Estimate    S.E.     C.R.      P      Label
age <--> vocabulary       –5.014   8.560   –0.586   0.558
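As a check on how these columns relate, the critical ratio is the estimate divided by its standard error: C.R. = –5.014 / 8.560 ≈ –0.586. The associated two-sided p value of 0.558 indicates that, by the maximum likelihood test, the covariance is not significantly different from 0.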
390 Example 26 Bayesian Analysis Bayesian analysis requires estimation of explicit means and intercepts. Before performing any Bayesian analysis in Amos, you must first tell Amos to estimate means and intercepts. E From the menus, choose View > Analysis Properties. E Select Estimate means and intercepts. (A check mark will appear next to it.) E To perform a Bayesian analysis, from the menus, choose Analyze > Bayesian Estimation, or press the keyboard combination Ctrl+B.
391 Bayesian Estimation The Bayesian SEM window appears, and the MCMC algorithm immediately begins generating samples. The Bayesian SEM window has a toolbar near the top of the window and has a results summary table below. Each row of the summary table describes the marginal posterior distribution of a single model parameter. The first column, labeled Mean, contains the posterior mean, which is the center or average of the posterior distribution.
392 Example 26 Replicating Bayesian Analysis and Data Imputation Results The multiple imputation and Bayesian estimation algorithms implemented in Amos make extensive use of a stream of random numbers that depends on an initial random number seed. The default behavior of Amos is to change the random number seed every time you perform Bayesian estimation, Bayesian data imputation, or stochastic regression data imputation.
393 Bayesian Estimation maintains a log of previous seeds used, so it is possible to match the file creation dates of previously generated analysis results or imputed datasets with the dates reported in the Seed Manager. Changing the Current Seed E Click Change and enter a previously used seed before performing an analysis. Amos will use the same stream of random numbers that it used the last time it started out with that seed.
394 Example 26 Record the value of this seed in a safe place so that you can replicate the results of your analysis at a later date. Tip: We use the same seed value of 14942405 for all examples in this guide so that you can reproduce our results. We mentioned earlier that the MCMC algorithm used by Amos draws random values of parameters from high-dimensional joint posterior distributions via Monte Carlo simulation of the posterior distribution of parameters.
395 Bayesian Estimation The likely distance between the posterior mean and the unknown true parameter is reported in the third column, labeled S.D., and that number is analogous to the standard error in maximum likelihood estimation. Additional columns contain the convergence statistic (C.S.), the median value of each parameter, the lower and upper 50% boundaries of the distribution of each parameter, and the skewness, kurtosis, minimum value, and maximum value of each parameter.
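Stated compactly, if $\theta^{(1)}, \dots, \theta^{(N)}$ denote the retained analysis samples for a parameter, the reported posterior mean is $\bar{\theta} = \frac{1}{N}\sum_{t=1}^{N}\theta^{(t)}$, and the S.D. column is the corresponding standard deviation of those draws, $\sqrt{\frac{1}{N}\sum_{t=1}^{N}\bigl(\theta^{(t)} - \bar{\theta}\bigr)^{2}}$; the median, percentile boundaries, skewness, kurtosis, minimum, and maximum are computed from the same draws. (Whether Amos divides by N or N – 1 in the S.D. is not documented here; this is a sketch of the idea rather than the exact implementation.)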
396 Example 26 You can change the refresh interval to something other than the default of 1,000 observations. Alternatively, you can refresh the display at a regular time interval that you specify. If you select Refresh the display manually, the display will never be updated automatically. Regardless of what you select on the Refresh tab, you can refresh the display manually at any time by clicking the Refresh button on the Bayesian SEM toolbar.
397 Bayesian Estimation samples, one may ask whether there are enough of these samples to accurately estimate the summary statistics, such as the posterior mean. That question pertains to the second type of convergence, which we may call convergence of posterior summaries. Convergence of posterior summaries is complicated by the fact that the analysis samples are not independent but are actually an autocorrelated time series.
398 Example 26 At this point, we have 22,501 analysis samples, although the display was most recently updated at the 22,500th sample. The largest C.S. is 1.0012, which is below the 1.002 criterion that indicates acceptable convergence. Reflecting the satisfactory convergence, Amos now displays a happy face . Gelman et al. (2004) suggest that, for many analyses, values of 1.10 or smaller are sufficient. The default criterion of 1.002 is conservative.
399 Bayesian Estimation E Select the age <--> vocabulary parameter from the Bayesian SEM window.
400 Example 26 The Posterior dialog box now displays a frequency polygon of the distribution of the age-vocabulary covariance across the 22,500 samples. One visual aid you can use to judge whether it is likely that Amos has converged to the posterior distribution is a simultaneous display of two estimates of the distribution, one obtained from the first third of the accumulated samples and another obtained from the last third.
401 Bayesian Estimation In this example, the distributions of the first and last thirds of the analysis samples are almost identical, which suggests that Amos has successfully identified the important features of the posterior distribution of the age-vocabulary covariance. Note that this posterior distribution appears to be centered at some value near –6, which agrees with the Mean value for this parameter.
402 Example 26 E To view the trace plot, select Trace. The plot shown here is quite ideal. It exhibits rapid up-and-down variation with no long-term trends or drifts. If we were to mentally break up this plot into a few horizontal sections, the trace within any section would not look much different from the trace in any other section. This indicates that the convergence in distribution takes place rapidly. Long-term trends or drifts in the plot indicate slower convergence.
403 Bayesian Estimation E To display this plot, select Autocorrelation. Lag, along the horizontal axis, refers to the spacing at which the correlation is estimated. In ordinary situations, we expect the autocorrelation coefficients to die down and become close to 0, and remain near 0, beyond a certain lag. In the autocorrelation plot shown above, the lag-10 correlation—the correlation between any sampled value and the value drawn 10 iterations later—is approximately 0.50.
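In symbols, the value plotted at lag $k$ is the sample correlation between draws $k$ iterations apart, $r_k = \operatorname{corr}\bigl(\theta^{(t)}, \theta^{(t+k)}\bigr)$; the statement above says that $r_{10} \approx 0.50$ for this parameter.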
404 Example 26 when the missing values fall in a peculiar pattern, or in models with some parameters that are poorly estimated. If this should happen, the trace plots for one or more parameters in the model will have long-term drifts or trends that do not diminish as more and more samples are taken. Even as the trace plot gets squeezed together like an accordion, the drifts and trends will not go away.
405 Bayesian Estimation E Select Histogram to display a similar plot using vertical blocks. E Select Contour to display a two-dimensional plot of the bivariate posterior density.
406 Example 26 Ranging from dark to light, the three shades of gray represent 50%, 90%, and 95% credible regions, respectively. A credible region is conceptually similar to a bivariate confidence region that is familiar to most data analysts acquainted with classical statistical inference methods.
407 Bayesian Estimation Credible Intervals Recall that the summary table in the Bayesian SEM window displays the lower and upper endpoints of a Bayesian credible interval for each estimand. By default, Amos presents a 50% interval, which is similar to a conventional 50% confidence interval. Researchers often report 95% confidence intervals, so you may want to change the boundaries to correspond to a posterior probability content of 95%.
408 Example 26 Learning More about Bayesian Estimation Gill (2004) provides a readable overview of Bayesian estimation and its advantages in a special issue of Political Analysis. Jackman (2000) offers a more technical treatment of the topic, with examples, in a journal article format. The book by Gelman, Carlin, Stern, and Rubin (2004) addresses a multitude of practical issues with numerous examples.
Example 27 Bayesian Estimation Using a Non-Diffuse Prior Distribution Introduction This example demonstrates using a non-diffuse prior distribution. About the Example Example 26 showed how to perform Bayesian estimation for a simple model with the uniform prior distribution that Amos uses by default. In the present example, we consider a more complex model and make use of a non-diffuse prior distribution.
410 Example 27 As the sample size increases, the likelihood function becomes more and more tightly concentrated about the ML estimate. In that case, a diffuse prior tends to be nearly flat or constant over the region where the likelihood is high; the shape of the posterior distribution is largely determined by the likelihood, that is, by the data themselves. Under a uniform prior distribution for θ, p(θ) is completely flat, and the posterior distribution is simply a re-normalized version of the likelihood.
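In symbols, Bayes' theorem states that the posterior is proportional to the likelihood times the prior, $p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta)$. When $p(\theta)$ is essentially flat wherever the likelihood $p(y \mid \theta)$ is appreciable, the posterior therefore has essentially the same shape as the likelihood.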
411 Bayesian Estimation Using a Non-Diffuse Prior Distribution experimental condition, measured their levels of depression, treated the experimental group, and then re-measured participants’ depression. The researchers did not rely on a single measure of depression. Instead, they used two well-known depression scales, the Beck Depression Inventory (Beck, 1967) and the Hamilton Rating Scale for Depression (Hamilton, 1960). We will call them BDI and HRSD for short. The data are in the file feelinggood.sav.
412 Example 27 Bayesian Estimation with a Non-Informative (Diffuse) Prior Does a Bayesian analysis with a diffuse prior distribution yield results similar to those of the maximum likelihood solution? To find out, we will do a Bayesian analysis of the same model. First, we will show how to increase the number of burn-in observations. This is just to show you how to do it. Nothing suggests that the default of 500 burn-in observations needs to be changed.
413 Bayesian Estimation Using a Non-Diffuse Prior Distribution The summary table should look something like this:
414 Example 27 In this analysis, we allowed Amos to reach its default limit of 100,000 MCMC samples. When Amos reaches this limit, it begins a process known as thinning. Thinning involves retaining an equally-spaced subset of samples rather than all samples. Amos begins the MCMC sampling process by retaining all samples until the limit of 100,000 samples is reached.
415 Bayesian Estimation Using a Non-Diffuse Prior Distribution Fortunately, there is a remedy for this problem: Assign a prior density of 0 to any parameter vector for which the variance of e5 is negative. To change the prior distribution of the variance of e5: E From the menus, choose View > Prior. Alternatively, click the Prior button on the Bayesian SEM toolbar, or enter the keyboard combination Ctrl+R. Amos displays the Prior dialog box.
416 Example 27 E Select the variance of e5 in the Bayesian SEM window to display the default prior distribution for e5. E Replace the default lower bound of –3.4 × 10^38 with 0.
417 Bayesian Estimation Using a Non-Diffuse Prior Distribution E Click Apply to save this change. Amos immediately discards the accumulated MCMC samples and begins sampling all over again.
418 Example 27 The posterior mean of the variance of e5 is now positive. Examining its posterior distribution confirms that no sampled values fall below 0.
419 Bayesian Estimation Using a Non-Diffuse Prior Distribution Is this solution proper? The posterior mean of each variance is positive, but a glance at the Min column shows that some of the sampled values for the variance of e2 and the variance of e3 are negative. To avoid negative variances for e2 and e3, we can modify their prior distributions just as we did for e5. It is not too difficult to impose such constraints on a parameter-by-parameter basis in small models like this one.
420 Example 27 E From the menus, choose View > Options. E In the Options dialog box, click the Prior tab. E Select Admissibility test. (A check mark will appear next to it.) Selecting Admissibility test sets the prior density to 0 for parameter values that result in a model where any covariance matrix fails to be positive definite. In particular, the prior density is set to 0 for non-positive variances. Amos also provides a stability test option that works much like the admissibility test option.
421 Bayesian Estimation Using a Non-Diffuse Prior Distribution Notice that the analysis took only 73,000 observations to meet the convergence criterion for all estimands. Minimum values for all estimated variances are now positive.
Example 28 Bayesian Estimation of Values Other Than Model Parameters Introduction This example shows how to estimate other quantities besides model parameters in a Bayesian analysis. About the Example Examples 26 and 27 demonstrated Bayesian analysis. In both of those examples, we were concerned exclusively with estimating model parameters. We may also be interested in estimating other quantities that are functions of the model parameters.
424 Example 28 Indirect Effects Suppose we are interested in the indirect effect of ses on 71_alienation through the mediation of 67_alienation. In other words, we suspect that socioeconomic status exerts an impact on alienation in 1967, which in turn influences alienation in 1971.
425 Bayesian Estimation of Values Other Than Model Parameters Estimating Indirect Effects E Before starting the Bayesian analysis, from the menus in Amos Graphics, choose View > Analysis Properties. E In the Analysis Properties dialog box, click the Output tab. E Select Indirect, direct & total effects and Standardized estimates to estimate standardized indirect effects. (A check mark will appear next to these options.) E Close the Analysis Properties dialog box.
426 Example 28 E From the menus, choose Analyze > Calculate Estimates to obtain the maximum likelihood chi-square test of model fit and the parameter estimates. The results are identical to those shown in Example 6, Model C. The standardized direct effect of ses on 71_alienation is –0.19. The standardized indirect effect of ses on 71_alienation is defined as the product of two standardized direct effects: the standardized direct effect of ses on 67_alienation (–0.
427 Bayesian Estimation of Values Other Than Model Parameters You do not have to work the standardized indirect effect out by hand. To view all the standardized indirect effects: E From the menus, choose View > Text Output. E In the upper left corner of the Amos Output window, select Estimates, then Matrices, and then Standardized Indirect Effects. Bayesian Analysis of Model C To begin Bayesian estimation for Model C: E From the menus, choose Analyze > Bayesian Estimation.
428 Example 28 The MCMC algorithm converges quite rapidly within 22,000 MCMC samples. Additional Estimands The summary table displays results for model parameters only. To estimate the posterior of quantities derived from the model parameters, such as indirect effects: E From the menus, choose View > Additional Estimands.
429 Bayesian Estimation of Values Other Than Model Parameters Estimating the marginal posterior distribution of the additional estimands may take a while. A status window keeps you informed of progress. Results are displayed in the Additional Estimands window. To display the posterior mean for each standardized indirect effect: E Select Standardized Indirect Effects and Mean in the panel at the left side of the window.
430 Example 28 E To print the results, select the items you want to print. (A check mark will appear next to them). E From the menus, choose File > Print. Be careful because it is possible to generate a lot of printed output. If you put a check mark in every check box in this example, the program will print 1 × 8 × 11 = 88 matrices. E To view the posterior means of the standardized direct effects, select Standardized Direct Effects and Mean in the panel at the left.
431 Bayesian Estimation of Values Other Than Model Parameters Inferences about Indirect Effects There are two methods for finding a confidence interval for an indirect effect or for testing an indirect effect for significance. Sobel (1982, 1986) gives a method that assumes that the indirect effect is normally distributed.
432 Example 28 The lower boundary of the 95% credible interval for the indirect effect of socioeconomic status on alienation in 1971 is –0.382. The corresponding upper boundary value is –0.270, as shown below: We are now 95% certain that the true value of this standardized indirect effect lies between –0.382 and –0.270. To view the posterior distribution: E From the menus in the Additional Estimands window, choose View > Posterior. At first, Amos displays an empty posterior window.
433 Bayesian Estimation of Values Other Than Model Parameters E Select Mean and Standardized Indirect Effects in the Additional Estimands window.
434 Example 28 Amos then displays the posterior distribution of the indirect effect of socioeconomic status on alienation in 1971. The distribution of the indirect effect is approximately, but not exactly, normal.
435 Bayesian Estimation of Values Other Than Model Parameters The skewness of the posterior distribution of the indirect effect is –0.13, and its kurtosis is 0.02. These values indicate only very mild non-normality in the distribution of the indirect effect.
Example 29 Estimating a User-Defined Quantity in Bayesian SEM Introduction This example shows how to estimate a user-defined quantity: in this case, the difference between a direct effect and an indirect effect. About the Example In the previous example, we showed how to use the Additional Estimands feature of Amos Bayesian analysis to estimate an indirect effect.
438 Example 29 (“c”) and the two components of the indirect effect (“a” and “b”). Although not required, parameter labels make it easier to specify custom estimands. To begin a Bayesian analysis of this model: E From the menus, choose Analyze > Bayesian Estimation.
439 Estimating a User-Defined Quantity in Bayesian SEM
440 Example 29 E From the menus, choose View > Additional Estimands. E In the Additional Estimands window, select Standardized Direct Effects and Mean. The posterior mean for the direct effect of ses on 71_alienation is –0.195.
441 Estimating a User-Defined Quantity in Bayesian SEM E Select Standardized Indirect Effects and Mean. The indirect effect of socioeconomic status on alienation in 1971 is –0.320.
442 Example 29 The posterior distribution of the indirect effect lies entirely to the left of 0, so we are practically certain that the indirect effect is less than 0. You can also display the posterior distribution of the direct effect. The program does not, however, have any built-in way to examine the posterior distribution of the difference between the indirect effect and the direct effect (or perhaps their ratio).
443 Estimating a User-Defined Quantity in Bayesian SEM Numeric Custom Estimands In this section, we show how to write a Visual Basic program for estimating the numeric difference between a direct effect and an indirect effect. (You can use C# instead of Visual Basic.) The final Visual Basic program is in the file Ex29.vb. The first step in writing a program to define a custom estimand is to open the custom estimands window. E From the menus on the Bayesian SEM window, choose View > Custom estimands.
444 Example 29 The skeleton program contains a subroutine and a function. You have no control over when the subroutine and the function are called. They are called by Amos. Amos calls your DeclareEstimands subroutine once to find out how many new quantities (estimands) you want to estimate and what you want to call them. Amos calls your CalculateEstimands function repeatedly.
445 Estimating a User-Defined Quantity in Bayesian SEM computing the direct effect and the indirect individually, but this is only to show how to do it. The direct effect and the indirect effect individually can be estimated without defining them as custom estimands. To define each estimand, we use the keyword newestimand, as shown below: The words “direct”, “indirect”, and “difference” are estimand labels. You can use different labels.
446 Example 29 The placeholder 'TODO: Your code goes here needs to be replaced with lines for evaluating the estimands called "direct", "indirect", and "difference". We start by writing Visual Basic code for computing the direct effect. In the following figure, we have already typed part of a Visual Basic statement: estimand("direct").value =
447 Estimating a User-Defined Quantity in Bayesian SEM We need to finish the statement by adding additional code to the right of the equals (=) sign, describing how to compute the direct effect. The direct effect is to be calculated for a set of parameter values that are accessible through the AmosEngine object that is supplied as an argument to the CalculateEstimands function.
448 Example 29 Tip: When you get the mouse pointer on the right spot, a plus (+) symbol will appear next to the mouse pointer. E Hold down the left mouse button, drag the mouse pointer into the Visual Basic window to the spot where you want the expression for the direct effect to go, and release the mouse button.
449 Estimating a User-Defined Quantity in Bayesian SEM The parameter on the right side of the equation is identified by the label (“c”) that was used in the path diagram shown earlier. We next turn our attention to calculating the indirect effect of socioeconomic status on alienation in 1971.
450 Example 29 Using the same drag-and-drop process as previously described, start dragging things from the Bayesian SEM window to the Unnamed.vb window. E First, drag the direct effect of socioeconomic status on alienation in 1967 to the right side of the equals sign in the unfinished statement.
451 Estimating a User-Defined Quantity in Bayesian SEM E Next, drag and drop the direct effect of 1967 alienation on 1971 alienation. This second direct effect appears in the Unnamed.vb window as sem.ParameterValue("b").
452 Example 29 E Finally, use the keyboard to insert an asterisk (*) between the two parameter values.
453 Estimating a User-Defined Quantity in Bayesian SEM Hint: For complicated custom estimands, you can also drag and drop from the Additional Estimands window to the Custom Estimands window.
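Putting the pieces together, the two routines now look something like the following sketch. Only the statements discussed above are shown; the surrounding skeleton (the class declaration, imports, and the exact signatures that Amos generates) is not reproduced here, and the final subtraction line for "difference", along with the return value, is our assumption about the natural way to complete the example. The complete program is in Ex29.vb.

    Sub DeclareEstimands()
        ' Declare the three quantities to be estimated and give each a label.
        newestimand("direct")
        newestimand("indirect")
        newestimand("difference")
    End Sub

    Function CalculateEstimands(sem As AmosEngine) As String
        ' Evaluate each estimand from the parameter values of the current MCMC sample.
        estimand("direct").value = sem.ParameterValue("c")
        estimand("indirect").value = sem.ParameterValue("a") * sem.ParameterValue("b")
        estimand("difference").value = estimand("indirect").value - estimand("direct").value
        Return ""   ' returning an empty string is assumed to signal success
    End Function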
454 Example 29 E To find the posterior distribution of all three custom estimands, click Run. The results will take a few seconds. A status window keeps you informed of progress.
455 Estimating a User-Defined Quantity in Bayesian SEM The marginal posterior distributions of the three custom estimands are summarized in the following table: The results for direct can also be found in the Bayesian SEM summary table, and the results for indirect can be found in the Additional Estimands table. We are really interested in difference. Its posterior mean is –0.132. Its minimum is –0.412, and its maximum is 0.111. E To see its marginal posterior, from the menus, choose View > Posterior.
456 Example 29 Most of the area lies to the left of 0, meaning that the difference is almost sure to be negative. In other words, it is almost certain that the indirect effect is more negative than the direct effect. Eyeballing the posterior, perhaps 95% or so of the area lies to the left of 0, so there is about a 95% chance that the indirect effect is more negative than the direct effect. It is not necessary to rely on eyeballing the posterior, however.
457 Estimating a User-Defined Quantity in Bayesian SEM Dichotomous Custom Estimands Visual inspection of the frequency polygon reveals that the majority of difference values are negative, but it does not tell us exactly what proportion of values are negative. That proportion is our estimate of the probability that the indirect effect exceeds the direct effect in magnitude (that is, that the indirect effect is the more negative of the two). To estimate probabilities like these, we can use dichotomous estimands.
458 Example 29 E Add lines to the CalculateEstimands function specifying how to compute them. In this example, the first dichotomous custom estimand is true when the value of the indirect effect is less than 0. The second dichotomous custom estimand is true when the indirect effect is smaller than the direct effect. E Click the Run button. Amos evaluates the truth of each logical expression for each MCMC sample drawn.
459 Estimating a User-Defined Quantity in Bayesian SEM The P column shows the proportion of times that each evaluated expression was true in the whole series of MCMC samples. In this example, the number of MCMC samples was 29,501, so P is based on approximately 30,000 samples. The P1, P2, and P3 columns show the proportion of times each logical expression was true in the first third, the middle third, and the final third of the MCMC samples.
Example 30 Data Imputation Introduction This example demonstrates multiple imputation in a factor analysis model. About the Example Example 17 showed how to fit a model using maximum likelihood when the data contain missing values. Amos can also impute values for those that are missing. In data imputation, each missing value is replaced by some numeric guess.
462 Example 30 Bayesian imputation is like stochastic regression imputation except that it takes into account the fact that the parameter values are only estimated and not known. Multiple Imputation In multiple imputation (Schafer, 1997), a nondeterministic imputation method (either stochastic regression imputation or Bayesian imputation) is used to create multiple completed datasets. While the observed values never change, the imputed values vary from one completed dataset to the next.
463 Data Imputation Step 1: Use the Data Imputation feature of Amos to create m complete data files. Step 2: Perform an analysis of each of the m completed data files separately. Performing this analysis is up to you. You can perform the analysis in Amos but, typically, you would use some other program. For purposes of this example and the next, we will use SPSS Statistics to carry out a regression analysis in which one variable (sentence) is used to predict another variable (wordmean).
464 Example 30 E From the menus, choose Analyze > Data Imputation. Amos displays the Amos Data Imputation window. E Make sure that Bayesian imputation is selected. E Set Number of completed datasets to 10. (This sets m = 10.) You might suppose that a large number of completed data files are needed. It turns out that, in most applications, very few completed data files are needed.
465 Data Imputation SPSS Statistics to analyze the completed datasets, the simplest thing would be to select Single output file. Then, the split file capability of SPSS Statistics could be used in Step 2 to analyze each completed dataset separately. However, to make it easy to replicate this example using any regression program: E Select Multiple output files. You can save imputed data in two file formats: plain text or SPSS Statistics format. E Click File Names to display a Save As dialog box.
466 Example 30 E Click Save. E Click Options in the Data Imputation window to display the available imputation options. The online help explains these options. To get an explanation of an option, place your mouse pointer over the option in question and press the F1 key. The figure below shows how the number of observations can be changed from 10,000 (the default) to 30,000. E Close the Options dialog box and click the Impute button in the Data Imputation window.
467 Data Imputation Each completed data file contains 73 complete cases. Here is a view of the first few records of the first completed data file, Grant_Imp1.sav:
468 Example 30 Here is an identical view of the second completed data file, Grant_Imp2.sav: The values in the first two cases for visperc were observed in the original data file and therefore do not change across the imputed data files. By contrast, the values for these cases for cubes were missing in the original data file, Grant_x.sav, so Amos has imputed different values across the imputed data files for cubes for these two cases.
Example 31 Analyzing Multiply Imputed Datasets Introduction This example demonstrates the analysis of multiply (pronounced multiplee) imputed datasets. Analyzing the Imputed Data Files Using SPSS Statistics Ten completed datasets were created in Example 30. That was Step 1 in a three-step process: Use the Data Imputation feature of Amos to impute m complete data files. (Here, m = 10.) The next two steps are: Step 2: Perform an analysis of each of the m completed data files separately.
470 Example 31 Step 2: Ten Separate Analyses For each of the 10 completed datasets from Example 30, we need to perform a regression analysis in which sentence is used to predict wordmean. We start by opening the first completed dataset, Grant_Imp1.sav, in SPSS Statistics. E From the SPSS Statistics menus, choose Analyze > Regression > Linear and perform the regression analysis. (We assume you do not need detailed instructions for this step.
471 Analyzing Multiply Imputed Datasets

Imputation    ML Estimate    ML Standard Error
 1               1.106            0.160
 2               1.080            0.160
 3               1.118            0.151
 4               1.273            0.155
 5               1.102            0.154
 6               1.286            0.152
 7               1.121            0.139
 8               1.283            0.140
 9               1.270            0.156
10               1.081            0.157

Step 3: Combining Results of Multiply Imputed Data Files The standard errors from an analysis of any single completed dataset are not accurate because they do not take into account the uncertainty arising from imputing missing data values.
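For reference, the combined point estimate (written $\bar{Q}$ in the formulas that follow) is simply the average of the 10 ML estimates in the table:

$\bar{Q} = \frac{1.106 + 1.080 + 1.118 + 1.273 + 1.102 + 1.286 + 1.121 + 1.283 + 1.270 + 1.081}{10} = 1.172$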
472 Example 31 To obtain a standard error for the combined parameter estimate, go through the following steps:

E Compute the average within-imputation variance.

$\bar{U} = \frac{1}{m}\sum_{t=1}^{m} U^{(t)} = 0.0233$

E Compute the between-imputation variance.

$B = \frac{1}{m-1}\sum_{t=1}^{m} \left(\hat{Q}^{(t)} - \bar{Q}\right)^{2} = 0.0085$

E Compute the total variance.

$T = \bar{U} + \left(1 + \frac{1}{m}\right)B = 0.0233 + \left(1 + \frac{1}{10}\right)0.0085 = 0.0326$

The multiple-imputation standard error is then $\sqrt{T} = \sqrt{0.0326} \approx 0.18$.
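If you prefer to script this combination step rather than work it out by hand, the following stand-alone VB.NET sketch applies the same three formulas to the ten estimates and standard errors listed in the table above. It is not part of Amos; the module name and the output formatting are our own.

    Imports System
    Imports System.Linq

    Module CombineImputations
        Sub Main()
            ' ML estimates and standard errors from the 10 completed datasets
            Dim est() As Double = {1.106, 1.08, 1.118, 1.273, 1.102, 1.286, 1.121, 1.283, 1.27, 1.081}
            Dim se() As Double = {0.16, 0.16, 0.151, 0.155, 0.154, 0.152, 0.139, 0.14, 0.156, 0.157}
            Dim m As Integer = est.Length

            ' Combined point estimate: the average of the m estimates
            Dim qBar As Double = est.Average()

            ' Average within-imputation variance: the mean of the squared standard errors
            Dim u As Double = se.Select(Function(s) s * s).Average()

            ' Between-imputation variance of the m estimates
            Dim b As Double = est.Select(Function(q) (q - qBar) ^ 2).Sum() / (m - 1)

            ' Total variance and combined standard error
            Dim t As Double = u + (1.0 + 1.0 / m) * b
            Console.WriteLine("Estimate = {0:F3}  SE = {1:F4}", qBar, Math.Sqrt(t))
        End Sub
    End Module

Running the sketch prints an estimate of 1.172 and a standard error of 0.1806, in line with the hand computation above.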
473 Analyzing Multiply Imputed Datasets Further Reading Amos provides several advanced methods of handling missing data, including FIML (described in Example 17), multiple imputation, and Bayesian estimation. To learn more about each method, consult Schafer and Graham (2002) for an overview of the strengths of FIML and multiple imputation.
Example 32 Censored Data Introduction This example demonstrates parameter estimation, estimation of posterior predictive distributions, and data imputation with censored data. About the Data For this example, we use the censored data from 103 patients who were accepted into the Stanford Heart Transplantation Program during the years 1967 through 1974. The data were collected by Crowley and Hu (1977) and have been reanalyzed by Kalbfleisch and Prentice (2002), among others.
476 Example 32 Reading across the first visible row in the figure above, Patient 17 was accepted into the program in 1968. The patient at that time was 20.33 years old. The patient died 35 days later. The next number, 5.916, is the square root of 35. Amos assumes that censored variables are normally distributed. The square root of survival time will be used in this example in the belief that it is probably more nearly normally distributed than is survival time itself.
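In symbols, the coding used throughout this example is timesqr $= \sqrt{\text{survival time in days}}$, so for Patient 17, timesqr $= \sqrt{35} \approx 5.916$; squaring a timesqr value converts it back to days.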
477 Censored Data known more precisely than it is. Of course, wherever the data provide an exact numeric value, as in the case of Patient 24 who is known to have survived for 218 days, that exact numeric value is used. Recoding the Data The data file needs to be recoded before Amos reads it. The next figure shows a portion of the dataset after recoding. (This complete dataset is in the file transplant-b.sav.)
478 Example 32 E Then in the Data Files dialog box, click the File Name button. E Select the data file transplant-b.sav. E Select Allow non-numeric data (a check mark appears next to it). Recoding the data as shown above and selecting Allow non-numeric data are the only extra steps that are required for analyzing censored data. In all other respects, fitting a model with censored data and interpreting the results is exactly the same as if the data were purely numeric.
479 Censored Data To fit the model: E Click on the toolbar. or E From the menus, choose Analyze > Bayesian Estimation. Note: The button is disabled because, with non-numeric data, you can perform only Bayesian estimation. After the Bayesian SEM window opens, wait until the unhappy face changes into a happy face . The table of estimates in the Bayesian SEM window should look something like this: (Only a portion of the table is shown in the figure.
480 Example 32 The Posterior dialog box opens, displaying the posterior distribution of the regression weight. The posterior distribution of the regression weight is indeed centered around –0.29. The distribution lies almost entirely between –0.75 and 0.25, so it is practically guaranteed that the regression weight lies in that range. Most of the distribution lies between –0.5 and 0, so we are pretty sure that the regression weight lies between –0.5 and 0.
481 Censored Data Posterior Predictive Distributions Recall that the dataset contains some censored values like Patient 25’s survival time. All we really know about Patient 25’s survival time is that it is longer than 1,799 days or, equivalently, that the square root of survival time exceeds 42.415. Even though we do not know the amount by which this patient’s timesqr exceeds 42.415, we can ask for its posterior distribution. Taking into account the fact that timesqr exceeds 42.
482 Example 32 The posterior distribution for Patient 25’s timesqr lies entirely to the right of 42.415. Of course, we knew from the data alone that timesqr exceeds 42.415, but now we also know that there is practically no chance that Patient 25’s timesqr exceeds 70. For that matter, there is only a slim chance that timesqr exceeds even 55.
483 Censored Data Patient 100 was still alive when last observed on the 38th day after acceptance into the program, so that his timesqr is known to exceed 6.164. The posterior distribution of that patient's timesqr shows that it is practically guaranteed to be between 6.164 and 70, and almost certain to be between 6.164 and 50. The mean is 27.36, providing a point estimate of timesqr if one is needed. Squaring 27.36 gives 748, an estimate of Patient 100's survival time in days.
484 Example 32 Imputation You can use this model to impute values for the censored values. E Close the Bayesian SEM window if it is open. E From the Amos Graphics menu, choose Analyze > Data Imputation. Notice that Regression imputation and Stochastic regression imputation are disabled. When you have non-numeric data such as censored data, Bayesian imputation is the only choice.
485 Censored Data E Wait until the Data Imputation dialog box displays a happy face to indicate that each of the 10 completed datasets is effectively uncorrelated with the others. Note: After you see a happy face but before you click OK, you may optionally choose to right-click a parameter in the Bayesian SEM window and choose Show Posterior from the pop-up menu. This will allow you to examine the Trace and Autocorrelation plots. E Click OK in the Data Imputation dialog box.
486 Example 32 E Double-click the file name to display the contents of the single completed data file, which contains 10 completed datasets. The file contains 1,030 cases because each of the 10 completed datasets contains 103 cases. The first 103 rows of the new data file contain the first completed dataset. The Imputation_ variable is equal to 1 for each row in the first completed dataset, and the CaseNo variable runs from 1 through 103.
487 Censored Data The first row of the completed data file contains a timesqr value of 7. Because that was not a censored value, 7 is not an imputed value. It is just an ordinary numeric value that was present in the original data file. On the other hand, Patient 25’s timesqr was censored, so that patient has an imputed timesqr (in this case, 49.66.) The value of 49.66 is a value drawn randomly from the posterior predictive distribution in the figure on p. 482.
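As a quick check of scale, applying the same square-root coding in reverse: an imputed timesqr of 49.66 corresponds to an imputed survival time of roughly $49.66^{2} \approx 2{,}466$ days.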
488 Example 32 Normally, the next step would be to use the 10 completed datasets in transplant-b_C.sav as input to some other program that cannot accept censored data. You would use that other program to perform 10 separate analyses, using each one of the 10 completed datasets in turn. Then you would do further computations to combine the results of those 10 separate analyses into a single set of results, as was done in Example 31. Those steps will not be carried out here.
Example 33 Ordered-Categorical Data Introduction This example shows how to fit a factor analysis model to ordered-categorical data. It also shows how to find the posterior predictive distribution for the numeric variable that underlies a categorical response and how to impute a numeric value for a categorical response. About the Data This example uses data on attitudes toward environment issues obtained from a questionnaire administered to 1,017 respondents in the Netherlands.
490 Example 33 One way to analyze these data is to assign numbers to the four categorical responses; for example, using the assignment 1 = SD, 2 = D, 3 = A, 4 = SA. If you assign numbers to categories in that way, you get the dataset in environment-nl-numeric.sav. In an Amos analysis, it is not necessary to assign numbers to categories in the way just shown. It is possible to use only the ordinal properties of the four categorical responses.
491 Ordered-Categorical Data It may be slightly easier to use environment-nl-numeric.sav because Amos will assume by default that the numbered categories go in the order 1, 2, 3, 4, with 1 being the lowest category. That happens to be the correct order. With environment-nl-string.sav, by contrast, Amos will assume by default that the categories are arranged alphabetically in the order A, D, SA, SD, with A being the lowest category.
492 Example 33 Recoding the Data within Amos The ordinal properties of the data cannot be inferred from the data file alone. To give Amos the additional information it needs so that it can interpret the data values SD, D, A, and SA: E From the Amos Graphics menus, choose Tools > Data Recode. E Select item1 in the list of variables in the upper-left corner of the Data Recode window. This displays a frequency distribution of the responses to item1 at the bottom of the window.
493 Ordered-Categorical Data In the box labeled Recoding rule, the notation No recoding means that Amos will read the responses to item1 as is. In other words, it will read either SD, D, A, SA, or an empty string. We can’t leave things that way because Amos doesn’t know what to do with SD, D, and so on. E Click No recoding and select Ordered-categorical from the drop-down list.
494 Example 33 The frequency table at the bottom of the window now has a New Value column that shows how the item1 values in the data file will be recoded before Amos reads the data. The first row of the frequency table shows that empty strings in the original data file will be treated as missing values. The second row shows that the A response will be translated into the string <0.0783345405060296.
495 Ordered-Categorical Data column, based on the assumption that scores on the underlying numeric variable are normally distributed with a mean of 0 and a standard deviation of 1. The ordering of the categories in the Original Value column needs to be changed. To change the ordering: E Click the Details button. The Ordered-Categorical Details dialog box opens.
496 Example 33 You can rearrange the categories and the boundaries. To do this: E Drag and drop with the mouse. or E Select a category or boundary with the mouse and then click the Up or Down button. After putting the categories and boundaries in the correct order, the Ordered-Categorical Details dialog box looks like this: The Unordered categories list box contains a list of values that Amos will treat as missing.
497 Ordered-Categorical Data Note: You can’t drag and drop between the Ordered categories list box and the Unordered categories list box. You have to use the Up and Down buttons to move a category from one box to the other. We could stop here and close the Ordered-Categorical Details dialog box because we have the right number of boundaries and categories and we have the categories going in the right order.
498 Example 33 distributed with a mean of 0 and a standard deviation of 1. Alternatively, you can assign a value to a boundary instead of letting Amos estimate it. To assign a value: E Select the boundary with the mouse. E Type a numeric value in the text box. The following figure shows the result of assigning values 0 and 1 to the two boundaries.
499 Ordered-Categorical Data The changes that were just made to the categories and the interval boundaries are now reflected in the frequency table at the bottom of the Data Recode window. The frequency table shows how the values that appear in the data file will be recoded before Amos reads them. Reading the frequency table from top to bottom: An empty string will be treated as a missing value. The strings SD and D will be recoded as <0, meaning that the underlying numeric score is less than 0.
500 Example 33 That takes care of item1. What was just done for item1 has to be repeated for each of the five remaining observed variables. After specifying the recoding for all six observed variables, you can view the original dataset along with the recoded variables. To do this: E Click the View Data button. The table on the left shows the contents of the original data file before recoding. The table on the right shows the recoded variables after recoding.
501 Ordered-Categorical Data environment. The other three items were designed to be measures of awareness of environmental issues. This design of the questionnaire is reflected in the following factor analysis model, which is saved in the file Ex33-a.amw.

[Path diagram: a two-factor model in which the latent variables WILLING and AWARE each predict three of the six items, with error terms e1 through e6 attached to the items.]

The path diagram is drawn exactly as it would be drawn for numeric data.
502 Example 33 After the Bayesian SEM window opens, wait until the unhappy face changes into a happy face. The Bayesian SEM window should then look something like this: (The figure above shows some, but not all, of the parameter estimates.) The Mean column provides a point estimate for each parameter. For example, the regression weight for using WILLING to predict item1 is estimated to be 0.59. The skewness (0.09) and kurtosis (–0.
503 Ordered-Categorical Data The Posterior window displays the posterior distribution. The appearance of the distribution confirms what was concluded above from the mean, standard deviation, skewness, and kurtosis of the distribution. The shape of the distribution is nearly normal, and it looks like roughly 95% of the area lies between 0.53 and 0.65 (that is, within 0.06 of 0.59).
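This is just the familiar normal-approximation interval. Assuming the posterior S.D. shown in the summary table is about 0.03 (our reading of "within 0.06 of 0.59" as two standard deviations), the interval is roughly $0.59 \pm 2(0.03)$, that is, from 0.53 to 0.65.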
504 Example 33 MCMC Diagnostics If you know how to interpret the diagnostic output from MCMC algorithms (for example, see Gelman et al., 2004), you might want to view the Trace plot and the Autocorrelation plot.
505 Ordered-Categorical Data The First and last plot provides another diagnostic. It shows two estimates of the posterior distribution (two superimposed plots), one estimate from the first third of the MCMC sample and another estimate from the last third of the MCMC sample.
506 Example 33 Posterior Predictive Distributions When you think of estimation, you normally think of estimating model parameters or some function of the model parameters such as a standardized regression weight or an indirect effect. However, there are other unknown quantities in the present analysis. Each entry in the data table on p. 490 represents a numeric value that is either unknown or partially known.
507 Ordered-Categorical Data We are in an even better position to guess at Person 1’s score on the numeric variable that underlies item1 because Person 1 gave a response to item1. This person’s response places his or her score in the middle interval, between the two boundaries.
508 Example 33 The Posterior Predictive Distributions window contains a table with a row for every person and a column for every observed variable in the model. An asterisk (*) indicates a missing value, while << indicates a response that places inequality constraints on the underlying numeric variable. To display the posterior distribution for an item: E Click on the table entry in the upper-left corner (Person 1’s response to item1).
509 Ordered-Categorical Data That is because the program is building up an estimate of the posterior distribution as MCMC sampling proceeds. The longer you wait, the better the estimate of the posterior distribution will be.
510 Example 33 E Next, click the table entry in the first column of the 22nd row to estimate Person 22’s score on the numeric variable that underlies his or her response to item1.
511 Ordered-Categorical Data The mean of the posterior distribution (0.52) can be taken as an estimate of Person 1’s score on the underlying variable if a point estimate is required. Looking at the plot of the posterior distribution, we can be nearly 100% sure that the score is between –1 and 2. The score is probably between 0 and 1 because most of the area under the posterior distribution lies between 0 and 1.
512 Example 33

[Path diagram: the same model, now with WILLING treated as an observed variable.]

That takes care of the path diagram. It is also necessary to make a change to the data because if WILLING is an observed variable, then there has to be a WILLING column in the data file. You can directly modify the data file.
513 Ordered-Categorical Data E Change V1 to WILLING. (If necessary, click the Rename Variable button.
514 Example 33 E You can optionally view the recoded dataset that includes the new WILLING variable by clicking the View Data button.
515 Ordered-Categorical Data The table on the left shows the original dataset. The table on the right shows the recoded dataset as read by Amos. It includes item1 through item6 after recoding, and also the new WILLING variable. E Close the Data Recode window. E Start the Bayesian analysis by clicking on the Amos Graphics toolbar. E In the Bayesian SEM window, wait until the unhappy face changes into a happy face, and then click the Posterior Predictive button.
516 Example 33 Imputation Data imputation works the same way for ordered-categorical data as it does for numeric data. With ordered-categorical data, you can impute numeric values for missing values, for scores on latent variables, and for scores on the unobserved numeric variables that underlie observed ordered-categorical measurements. You need a model in order to perform imputation. You could use the factor analysis model that was used earlier.
517 Ordered-Categorical Data

[Path diagram: the saturated model contains only the six observed variables, item1 through item6.]

After drawing the path diagram for the saturated model, you can begin the imputation. E From the Amos Graphics menu, choose Analyze > Data Imputation.
518 Example 33 In the Amos Data Imputation window, notice that Regression imputation and Stochastic regression imputation are disabled. When you have non-numeric data, Bayesian imputation is the only choice. We will accept the options shown in the preceding figure, creating 10 completed datasets and saving all 10 in a single SPSS Statistics data file called environment-nl-string_C.sav. To start the imputation: E Click the Impute button.
519 Ordered-Categorical Data E Click OK in the Data Imputation dialog box. The Summary window shows a list of the completed data files that were created. In this case, only one completed data file was created. E Double-click the file name in the Summary window to display the contents of the single completed data file, which contains 10 completed data sets. The file contains 10,170 cases because each of the 10 completed datasets contains 1,017 cases.
520 Example 33 Normally, the next step would be to use the 10 completed datasets in environment-nl-string_C.sav as input to some other program that requires numeric (not ordered-categorical) data. You would use that other program to perform 10 separate analyses using each one of the 10 completed data sets in turn. Then, you would do further computations to combine the results of those 10 separate analyses into a single set of results, as was done in Example 31. Those steps will not be carried out here.
Example 34 Mixture Modeling with Training Data Introduction Mixture modeling is appropriate when you have a model that is incorrect for an entire population, but where the population can be divided into subgroups in such a way that the model is correct in each subgroup.
522 Example 34 The dataset contains four measurements on flowers from 150 different plants. The first 50 flowers were irises of the species setosa. The next 50 were irises of the species versicolor. The last 50 were of the species virginica. A scatterplot of two of the numeric measurements, PetalLength and PetalWidth, suggests that those two measurements alone will be useful in classifying the flowers according to species.
523 Mixture Modeling with Training Data

[Scatterplot of PetalLength against PetalWidth, with markers distinguishing the three species: setosa, versicolor, and virginica.]

The setosa flowers are all by themselves in the lower left corner of the scatterplot.
524 Example 34 Species information is available for 10 of the setosa flowers, 10 of the versicolor flowers, and 10 of the virginica flowers. Species is unknown for the remaining 120 flowers. When Amos analyzes these data, it will have 10 examples of each kind of flower to assist in classifying the rest of the flowers. Performing the Analysis E From the menus, choose File > New to start a new path diagram. E From the menus, choose Analyze > Manage Groups.
525 Mixture Modeling with Training Data E Click New to create a second group. E Change the name in the Group Name text box from Group number 2 to PossiblyVersicolor. E Click New to create a third group. E Change the name in the Group Name text box from Group number 3 to PossiblyVirginica. E Click Close.
526 Example 34 Specifying the Data File E From the menus, choose File > Data Files. E Click PossiblySetosa to select that row. E Click File Name, select the iris3.sav file that is in the Amos Examples directory, and click Open. E Click Grouping Variable and double-click Species in the Choose a Grouping Variable dialog box. This tells the program that the Species variable will be used for classifying flowers.
527 Mixture Modeling with Training Data E In the Data Files dialog box, click Group Value and then double-click setosa in the Choose Value for Group dialog box.
528 Example 34 The Data Files dialog box should now look like this:
529 Mixture Modeling with Training Data E Repeat the preceding steps for the PossiblyVersicolor group, but this time double-click versicolor in the Choose Value for Group dialog box. E Repeat the preceding steps once more for the PossiblyVirginica group, but this time double-click virginica in the Choose Value for Group dialog box.
530 Example 34 E Click OK to close the Data Files dialog box. Specifying the Model We will use a saturated model for the variables PetalLength and PetalWidth. The scatterplot that was shown earlier suggests that these two variables will allow the program to do a good job of classifying the flowers according to species. Note that you are not limited to saturated models when doing mixture modeling. You can use a factor analysis model or a regression model or any other kind of model.
531 Mixture Modeling with Training Data E From the menus, choose View > Analysis Properties. E Select Estimate means and intercepts (a check mark will appear next to it).
532 Example 34 Fitting the Model E Click on the toolbar. or E From the menus, choose Analyze > Bayesian Estimation. Note: The button is disabled because, in mixture modeling, you can perform only Bayesian estimation.
533 Mixture Modeling with Training Data After the Bayesian SEM window opens, wait until the unhappy face changes into a happy face . The table of estimates in the Bayesian SEM window should look something like this: The Bayesian SEM window displays all of the parameter estimates that you would get in an ordinary three-group analysis. The table displays the results for one group at a time. You can switch from one group to another by clicking the tabs at the top of the table.
534 Example 34 setosa flowers in the population is estimated to be 0.333. (It should be pointed out that it was by design that the sample contained equal numbers of setosa, versicolor, and virginica flowers. It is therefore not meaningful in this example to draw inferences about population proportions from the sample. Nevertheless, we will treat species here as a random variable in order to demonstrate how such inferences can be made.
535 Mixture Modeling with Training Data The Posterior window shows that the proportion of flowers that belong to the setosa species is almost certainly between 0.25 and 0.45. It looks like there is about a 50–50 chance that the proportion is somewhere between 0.3 and 0.35. Classifying Individual Cases To obtain probabilities of group membership for each individual flower: E Click the Posterior Predictive button . or E From the menus, choose View > Posterior Predictive.
536 Example 34 For each flower, the Posterior Predictive Distributions window shows the probability that that flower is setosa, versicolor, or virginica. For the first 50 flowers (the ones that actually are setosa), the probability of membership in the setosa group is nearly 1. We expected that result because the setosa flowers were clearly separated from flowers of other species in the scatterplot shown earlier. Most of the versicolor flowers (starting with case number 51) were also correctly classified.
537 Mixture Modeling with Training Data Latent Structure Analysis It was mentioned earlier that you are not limited to saturated models when doing mixture modeling. You can use a factor analysis model, a regression model, or any model at all. You may want to become familiar with an important variation of the saturated model. Latent structure analysis (Lazarsfeld and Henry, 1968) is a variation of mixture modeling in which the measured variables are required to be independent within each group.
Example 35 Mixture Modeling without Training Data Introduction Mixture modeling is appropriate when you have a model that is incorrect for an entire population, but where the population can be divided into subgroups in such a way that the model is correct in each subgroup. When Amos performs mixture modeling, it allows you to assign some cases to groups before the analysis starts. Example 34 shows how to do that.
540 Example 35 Notice that the dataset contains a Species column, even though that column is empty. It is important that the Species column be present even if it contains no values. This is because Amos allows for the possibility that you might already know the species of some cases (as in Example 34). The variable that is used for classifying cases does not actually have to be named Species. Any variable name will do. The variable does, however, have to be a string (non-numeric) variable.
541 Mixture Modeling without Training Data E Click New to create a second group. E Click New once more to create a third group. E Click Close. This example fits a three-group mixture model. When you aren’t sure how many groups there are, you can run the program multiple times. Run the program once to fit a two-group model, then again to fit a three-group model, and so on.
542 Example 35 Specifying the Data File E From the menus, choose File > Data Files. E Click Group number 1 to select the first row. E Click File Name, select the iris2.sav file that is in the Amos Examples directory, and click Open. E Click Grouping Variable and double-click Species in the Choose a Grouping Variable dialog box. This tells the program that the Species variable will be used to distinguish one group from another.
543 Mixture Modeling without Training Data E Repeat the preceding steps for Group number 2, specifying the same data file (iris2.sav) and the same grouping variable (Species). E Repeat the preceding steps once more for Group number 3, specifying the same data file (iris2.sav) and the same grouping variable (Species).
544 Example 35 E Select Assign cases to groups (a check mark will appear next to it). So far, this has been just like any ordinary multiple-group analysis except for the check mark next to Assign cases to groups. That check mark turns this into a mixture modeling analysis. The check mark tells Amos to assign a flower to a group if the grouping variable in the data file does not already assign it to a group.
545 Mixture Modeling without Training Data Notice that it was not necessary to click Group Value to specify a value for the grouping variable. The data file contains no values for the grouping variable (Species), so the program automatically constructed the following Species values for the three groups: Cluster1, Cluster2, and Cluster3. E Click OK to close the Data Files dialog box. Specifying the Model We will use a saturated model for the variables PetalLength and PetalWidth.
546 Example 35 Constraining the Parameters In this example, variances and covariances will be required to be invariant across groups. This is the assumption of homogeneity of variances and covariances that is often made in discriminant analysis and some kinds of clustering. In principle, the assumption of homogeneity of variances and covariances is not necessary in mixture modeling. The reason we will make the assumption here is that, for this example, the algorithm in Amos fails without that assumption.
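To make the homogeneity assumption concrete, here is a sketch (our notation, not Amos output) of the model being fitted. The two measurements on each flower are treated as a draw from a mixture of three bivariate normal distributions that share a single covariance matrix:

f(x) = \sum_{g=1}^{3} \pi_g\, N\!\left(x;\, \mu^{(g)},\, \Sigma\right)

The group means μ^(g) are free to differ, while the common Σ is built from the constrained parameters v1, v2, and c12 that are named in the steps that follow.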
547 Mixture Modeling without Training Data E Right-click PetalLength in the path diagram, choose Object Properties from the pop-up menu, and enter the parameter name, v1, in the Variance text box. E While the Object Properties dialog box is still open, click PetalWidth in the path diagram. E In the Object Properties dialog box, enter the parameter name, v2, in the Variance text box.
548 Example 35 E In the Object Properties dialog box, enter the parameter name, c12, in the Covariance text box. The path diagram should now look like the following figure. (This path diagram is saved as Ex35-a.amw.) Fitting the Model E Click on the toolbar. or E From the menus, choose Analyze > Bayesian Estimation. Note: The button is disabled because, in mixture modeling, you can perform only Bayesian estimation.
549 Mixture Modeling without Training Data After the Bayesian SEM window opens, wait until the unhappy face changes into a happy face . The table of estimates in the Bayesian SEM window should then look something like this: The Bayesian SEM window displays all of the parameter estimates that you would get in an ordinary three-group analysis. The table displays the estimates for one group at a time. You can switch from one group to another by clicking the tabs at the top of the table.
550 Example 35 In a mixture modeling analysis, you also get an estimate of the proportion of the population that lies in each group. In the preceding figure, the proportion of setosa flowers in the population is estimated to be 0.306. E To view the posterior distribution of a population proportion, right-click the row that contains the proportion and choose Show Posterior from the pop-up menu.
551 Mixture Modeling without Training Data Classifying Individual Cases To obtain probabilities of group membership for each individual flower: E Click the Posterior Predictive button . or E From the menus, choose View > Posterior Predictive. For each flower, the Posterior Predictive Distributions window shows the probability that the value of the Species variable is Cluster1, Cluster2, or Cluster3.
552 Example 35 The first 50 cases, which we know to be examples of setosa, are placed in Group number 3 with a probability of 1, so Group number 3 clearly contains setosa flowers. Cases 51 through 100 fall mainly into Group number 2, so Group number 2 clearly contains versicolor flowers. Similarly, although the preceding figure does not show it, cases 101 through 150 are assigned mainly to Group number 1, so Group number 1 clearly contains virginica flowers.
553 Mixture Modeling without Training Data Latent Structure Analysis There is a variation of mixture modeling called latent structure analysis in which observed variables are required to be independent within each group. E To require that PetalLength and PetalWidth be uncorrelated and therefore (because they are multivariate normally distributed) independent, remove the double-headed arrow that connects them in the path diagram. The resulting path diagram is shown here.
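In other words (a sketch in our notation), removing the double-headed arrow fixes the within-group covariance at 0, so the within-group covariance matrix is diagonal and the joint density of the two measurements factors into the product of their marginals in each group:

\Sigma^{(g)} = \begin{pmatrix} v_1 & 0 \\ 0 & v_2 \end{pmatrix}, \qquad f_g(x_1, x_2) = f_g(x_1)\, f_g(x_2)

Here v_1 and v_2 are the variance parameters named earlier.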
554 Example 35 Label Switching If you attempt to replicate the analysis in this example, it is possible that you will get the results that are reported here but with the group names permuted. The results reported here for Group number 1 might correspond to the results you get for Group number 2 or Group number 3. This is sometimes called label switching (Chung, Loken, and Schafer, 2004). Label switching is not really a problem unless it occurs during the course of a single analysis.
555 Mixture Modeling without Training Data Label switching can be revealed by a multimodal posterior distribution for one or more parameters. The preceding trace plot corresponds to the following posterior distribution estimate. The preceding graph shows that the mean of a parameter’s posterior distribution may not be a meaningful estimate in a mixture modeling analysis when label switching occurs.
Example 36 Mixture Regression Modeling Introduction Mixture regression modeling (Ding, 2006) is appropriate when you have a regression model that is incorrect for an entire population, but where the population can be divided into subgroups in such a way that the regression model is correct in each subgroup. About the Data Two artificial datasets will be used to explain mixture regression. First Dataset The following dataset is in the file DosageAndPerformance1.sav.
558 Example 36 A scatterplot of dosage and performance shows two distinct groups of people in the sample. In one group, performance improves as dosage goes up. In the other group, performance gets worse as dosage goes up. [Scatterplot of performance against dosage for the first dataset.]
559 Mixture Regression Modeling It would be a mistake to try to fit a single regression line to the whole sample. On the other hand, two straight lines, one for each group, would fit the data well. This is a job for mixture regression modeling. A mixture regression analysis would attempt to divide the sample up into groups and to fit a separate regression line to each group. Second Dataset The following dataset is in the file DosageAndPerformance2.sav.
560 Example 36 [Scatterplot of performance against dosage for the second dataset.]
561 Mixture Regression Modeling The program will then use the five cases that have been pre-classified to assist in classifying the remaining cases. Pre-assigning selected individual cases to groups is mentioned here only as a possibility. In the present example, no cases will be preassigned to groups. Performing the Analysis Only the DosageAndPerformance2.sav dataset will be analyzed in this example. E From the menus, choose File > New to start a new path diagram.
562 Example 36 E Click New to create a second group. E Click Close. This example fits a two-group mixture regression model. When you aren’t sure how many groups there are, you can run the program multiple times. Run the program once to fit a two-group model, then again to fit a three-group model, and so on.
563 Mixture Regression Modeling Specifying the Data File E From the menus, choose File > Data Files. E Click Group number 1 to select that row. E Click File Name, select the DosageAndPerformance2.sav file that is in the Amos Examples directory, and click Open. E Click Grouping Variable and double-click group in the Choose a Grouping Variable dialog box. This tells the program that the variable called group will be used to distinguish one group from another.
564 Example 36 E Repeat the preceding steps for Group number 2, specifying the same data file (DosageAndPerformance2.sav) and the same grouping variable (group).
565 Mixture Regression Modeling E Select Assign cases to groups (a check mark will appear next to it). So far, this has been just like any ordinary multiple-group analysis except for the check mark next to Assign cases to groups.
566 Example 36 That check mark turns this into a mixture modeling analysis. The check mark tells Amos to assign a case to a group if the grouping variable in the data file does not already assign it to a group. Notice that it was not necessary to click Group Value to specify a value for the grouping variable. The data file contains no values for the grouping variable (group), so the program automatically constructed values for the group variable: Cluster1 for cases in Group number 1, and Cluster2 for cases in Group number 2.
567 Mixture Regression Modeling Fitting the Model E Click on the toolbar. or E From the menus, choose Analyze > Bayesian Estimation. Note: The button is disabled because, in mixture modeling, you can perform only Bayesian estimation.
568 Example 36 After the Bayesian SEM window opens, wait until the unhappy face changes into a happy face . The table of estimates in the Bayesian SEM window should then look something like this: The Bayesian SEM window contains all of the parameter estimates that you would get in an ordinary multiple-group regression analysis. There is a separate table of estimates for each group. You can switch from group to group by clicking the tabs just above the table of estimates.
569 Mixture Regression Modeling The bottom row of the table contains an estimate of the proportion of the population that lies in an individual group. The preceding figure, which displays estimates for Group number 1, shows that the proportion of the population in Group number 1 is estimated to be 0.247. To see the estimated posterior distribution of that population proportion, right-click the proportion’s row in the table and choose Show Posterior from the pop-up menu.
570 Example 36 The graph in the Posterior window shows that the proportion of the population in Group number 1 is practically guaranteed to be somewhere between 0.15 and 0.35. Let’s compare the regression weight and the intercept in Group number 1 with the corresponding estimates in Group number 2. In Group number 1, the regression weight estimate is 2.082 and the intercept estimate is 5.399. In Group number 2, the regression weight estimate (1.
572 Example 36 Classifying Individual Cases To obtain probabilities of group membership for each individual case: E Click the Posterior Predictive button . or E From the menus, choose View > Posterior Predictive. For each case, the Posterior Predictive Distributions window shows the probability that the group variable takes on the value Cluster1 or Cluster2. Case 1 is estimated to have a 0.88 probability of being in Group number 1 and a 0.12 probability of being in Group number 2.
573 Mixture Regression Modeling Improving Parameter Estimates You can improve the parameter estimates (and also improve Amos’s ability to form clusters) by reducing the number of parameters that need to be estimated. As we have seen, the slope of the regression line is about the same for the two groups. Also, the variability about each regression line appears to be about the same for the two groups.
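As a sketch (our notation, not Amos syntax), the constrained model for case i in group g is

y_i = a^{(g)} + b\, x_i + \varepsilon_i, \qquad \varepsilon_i \sim N\!\left(0, \sigma^2\right)

where the intercept a^(g) is free to differ between the two groups, while the slope b and the error variance σ² are required to be the same in both groups. Requiring those parameters to be equal reduces the number of distinct parameters that have to be estimated.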
574 Example 36 The path diagram should now look like the following figure. (This path diagram is saved as Ex36-b.amw.) After constraining the slope and error variance to be the same for the two groups, you can repeat the mixture modeling analysis by clicking the Bayesian button . The results of that analysis will not be presented here.
575 Mixture Regression Modeling Prior Distribution of Group Proportions For the prior distribution of group proportions, Amos uses a Dirichlet distribution with parameters that you can specify. By default, the Dirichlet parameters are 4, 4, …. E To specify the Dirichlet parameters, right-click on a group proportion’s estimate in the Bayesian SEM window and choose Show Prior from the pop-up menu.
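As a sketch of what this prior looks like (our notation), a Dirichlet distribution with parameters a_1, …, a_G places the density

p(\pi_1, \ldots, \pi_G) \propto \prod_{g=1}^{G} \pi_g^{\,a_g - 1}, \qquad \pi_g \ge 0,\quad \sum_{g} \pi_g = 1

over the group proportions. With the default values a_g = 4, the prior is symmetric across groups and has only a modest influence relative to the data in samples of the size used here.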
576 Example 36 Label Switching It is possible that the results reported here for Group number 1 will match the results that you get for Group number 2, and that the results reported here for Group number 2 will match those that you get for Group number 1. In other words, your results may match the results reported here, but with the group names reversed. This is sometimes called label switching (Chung, Loken, and Schafer, 2004). Label switching is discussed further at the end of Example 35.
Example 37 Using Amos Graphics without Drawing a Path Diagram Introduction People usually specify models in Amos Graphics by drawing path diagrams; however, Amos Graphics also provides a non-graphical method for model specification. If you don't want to draw a path diagram, you can specify a model by entering text in the form of a Visual Basic or C# program.
578 Example 37 About the Data The Holzinger and Swineford (1939) dataset from Example 8 is used for this example. A Common Factor Model The factor analysis model from Example 8 is used for this example. Whereas the model was specified in Example 8 by drawing its path diagram, the same model will be specified in the current example by writing a Visual Basic program. Creating a Plugin to Specify the Model E From the menus, choose Plugins > Plugins. E In the Plugins dialog box, click Create.
579 Using Amos Graphics without Drawing a Path Diagram The Program Editor window opens. E In the Program Editor window, change the Name and Description functions so that they return meaningful strings. You may find it helpful at this point to refer to the first path diagram in Example 8. We are going to add one line to the Mainsub function for each rectangle, ellipse and arrow in the path diagram.
580 Example 37 E In the Program Editor, enter the line pd.Observed("visperc") as the first line in the Mainsub function. If you save the plugin now, you can use it later on to draw a rectangle representing a variable called visperc. The rectangle will be drawn with arbitrary height and width at a random location in the path diagram. You can specify its height, width and location. For example, pd.Observed("visperc", 400, 300, 200, 100) draws a rectangle for a variable called visperc.
581 Using Amos Graphics without Drawing a Path Diagram E Enter the following additional lines in the Mainsub function so that the plugin will draw five more rectangles for the five remaining observed variables:
pd.Observed("cubes")
pd.Observed("lozenges")
pd.Observed("paragrap")
pd.Observed("sentence")
pd.Observed("wordmean")
E Enter the following lines so that the plugin will draw eight ellipses for the eight unobserved variables:
pd.Unobserved("err_v")
pd.Unobserved("err_c")
pd.Unobserved("err_l")
pd.Unobserved("err_p")
pd.Unobserved("err_s")
pd.Unobserved("err_w")
pd.Unobserved("spatial")
pd.Unobserved("verbal")
582 Example 37 E Enter the following lines so that the plugin will draw the 12 single-headed arrows:
pd.Path("visperc", "spatial", 1)
pd.Path("cubes", "spatial")
pd.Path("lozenges", "spatial")
pd.Path("paragrap", "verbal", 1)
pd.Path("sentence", "verbal")
pd.Path("wordmean", "verbal")
pd.Path("visperc", "err_v", 1)
pd.Path("cubes", "err_c", 1)
pd.Path("lozenges", "err_l", 1)
pd.Path("paragrap", "err_p", 1)
pd.Path("sentence", "err_s", 1)
pd.Path("wordmean", "err_w", 1)
583 Using Amos Graphics without Drawing a Path Diagram E Specify a height, width and location each time you use the Observed, Unobserved and Caption methods of the pd class. (See the online help for the Observed, Unobserved and Caption methods.) or E In your plugin, use the Reposition method to improve the positioning of objects. After running the plugin, use the drawing tools in the Amos Graphics toolbox to interactively move and resize the objects in the path diagram.
584 Example 37 The Mainsub function now looks like this in the Program Editor:
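The screenshot of the completed function is not reproduced here. The following sketch simply collects the lines entered in the preceding steps into one place; the pd.Cov call for the double-headed arrow between spatial and verbal and the trailing pd.Reposition call are assumptions based on methods mentioned in this example, so check the online help for the pd class before relying on them.

Sub Mainsub()
    ' Rectangles for the six observed variables
    pd.Observed("visperc")
    pd.Observed("cubes")
    pd.Observed("lozenges")
    pd.Observed("paragrap")
    pd.Observed("sentence")
    pd.Observed("wordmean")
    ' Ellipses for the eight unobserved variables
    pd.Unobserved("err_v")
    pd.Unobserved("err_c")
    pd.Unobserved("err_l")
    pd.Unobserved("err_p")
    pd.Unobserved("err_s")
    pd.Unobserved("err_w")
    pd.Unobserved("spatial")
    pd.Unobserved("verbal")
    ' The 12 single-headed arrows; a third argument of 1 fixes that regression weight at 1
    pd.Path("visperc", "spatial", 1)
    pd.Path("cubes", "spatial")
    pd.Path("lozenges", "spatial")
    pd.Path("paragrap", "verbal", 1)
    pd.Path("sentence", "verbal")
    pd.Path("wordmean", "verbal")
    pd.Path("visperc", "err_v", 1)
    pd.Path("cubes", "err_c", 1)
    pd.Path("lozenges", "err_l", 1)
    pd.Path("paragrap", "err_p", 1)
    pd.Path("sentence", "err_s", 1)
    pd.Path("wordmean", "err_w", 1)
    ' Double-headed arrow between the two factors
    ' (the Cov method name is an assumption; check the pd class in the online help)
    pd.Cov("spatial", "verbal")
    ' Optional: tidy the automatically generated layout (see the Reposition method described above)
    pd.Reposition()
End Sub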
585 Using Amos Graphics without Drawing a Path Diagram This completes the plugin for specifying the factor analysis model from Example 8. You can find a pre-written copy of the plugin in a file called Ex37a-plugin.vb located in a subfolder of Amos’s plugins folder. If you performed a typical installation of Amos, Ex37a-plugin.vb is in the location: C:\Program Files\IBM\SPSS\Amos\21\Plugins\. Compiling and Saving the Plugin E Click Compile in the Program Editor window.
586 Example 37 After you have saved your plugin, its name, Example 37a, appears on the list of plugins in the Plugins window. (Recall that Example 37a is the string returned by the plugin’s Name function.) E Close the Plugins window. Using the Plugin E From the menus, choose File > New to start with an empty path diagram. If you are asked whether you want to save your work, choose either Yes or No: E From the menus, choose Plugins > Example 37a.
587 Using Amos Graphics without Drawing a Path Diagram Running the plugin draws the path diagram. If you run it yourself, you will almost certainly get a different path diagram because a random number generator plays a role in positioning the elements in the path diagram.
588 Example 37 Other Aspects of the Analysis in Addition to Model Specification In Example 8, the data file Grnt_fem.sav was specified interactively (by choosing File > Data Files on the menus). You can do the same thing here as well. As an alternative, you can specify the Grnt_fem.sav data file within the plugin by adding the following lines to the Mainsub function:
pd.SetDataFile(1, MiscAmosTypes.cDatabaseFormat.mmSPSS, _
    "C:\Program Files\IBM\SPSS\Amos\21\Examples\English\grnt_fem.sav")
589 Using Amos Graphics without Drawing a Path Diagram Then you can use the program variable wordmean to refer to the model variable called wordmean, and use the program variable verbal to refer to the model variable called verbal. If you want to draw a single-headed arrow from the verbal variable to the wordmean variable, you can write either pd.Path(wordmean, verbal) or pd.Path("wordmean", "verbal").
Appendix A Notation
q = the number of parameters
\gamma = the vector of parameters (of order q)
G = the number of groups
N^{(g)} = the number of observations in group g
N = \sum_{g=1}^{G} N^{(g)} = the total number of observations in all groups combined
p^{(g)} = the number of observed variables in group g
p^{*(g)} = the number of sample moments in group g.
592 Appendix A
\Sigma^{(g)}(\gamma) = the covariance matrix for group g, according to the model
\mu^{(g)}(\gamma) = the mean vector for group g, according to the model
\Sigma_0^{(g)} = the population covariance matrix for group g
\mu_0^{(g)} = the population mean vector for group g
s^{(g)} = \operatorname{vec}\!\left(S^{(g)}\right) = the p^{*(g)} distinct elements of S^{(g)} arranged in a single column
\sigma^{(g)}(\gamma) = \operatorname{vec}\!\left(\Sigma^{(g)}(\gamma)\right)
r = the non-negative integer specified by the ChiCorrect method. By default, r = G.
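For concreteness, the count of sample moments is implied by, rather than stated in, the definitions above: when means and intercepts are explicit model parameters, the moments for a group are its p^{(g)} sample means plus the distinct elements of its sample covariance matrix, and otherwise only the covariance elements count, so that

p^{*(g)} = \frac{p^{(g)}\left(p^{(g)}+3\right)}{2} \quad \text{(means and intercepts explicit)}, \qquad p^{*(g)} = \frac{p^{(g)}\left(p^{(g)}+1\right)}{2} \quad \text{(covariance structure only)}.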
Appendix B Discrepancy Functions Amos minimizes discrepancy functions (Browne, 1982, 1984) of the form:

(D1)    C(\alpha, a) = [N - r]\left(\frac{\sum_{g=1}^{G} N^{(g)}\, f\!\left(\mu^{(g)}, \Sigma^{(g)};\, x^{(g)}, S^{(g)}\right)}{N}\right) = [N - r]\, F(\alpha, a)

Different discrepancy functions are obtained by changing the way f is defined. If means and intercepts are unconstrained and do not appear as explicit model parameters, x^{(g)} and \mu^{(g)} will be omitted and f will be written f\!\left(\Sigma^{(g)}; S^{(g)}\right).
594 Appendix B

(D2)    f_{ML}\!\left(\mu^{(g)}, \Sigma^{(g)};\, x^{(g)}, S^{(g)}\right) = f_{KL}\!\left(\mu^{(g)}, \Sigma^{(g)};\, x^{(g)}, S^{(g)}\right) - f_{KL}\!\left(x^{(g)}, S^{(g)};\, x^{(g)}, S^{(g)}\right)
        = \log\left|\Sigma^{(g)}\right| + \operatorname{tr}\!\left(S^{(g)} \Sigma^{(g)-1}\right) - \log\left|S^{(g)}\right| - p^{(g)} + \left(x^{(g)} - \mu^{(g)}\right)' \Sigma^{(g)-1} \left(x^{(g)} - \mu^{(g)}\right)
595 Discrepancy Functions

\left[U^{(g)}\right]_{ij,kl} = w^{(g)}_{ij,kl} - w^{(g)}_{ij}\, w^{(g)}_{kl}

For scale-free least squares estimation (SLS), C_{SLS} and F_{SLS} are obtained by taking f to be:

(D5)    f_{SLS}\!\left(\Sigma^{(g)}; S^{(g)}\right) = \tfrac{1}{2} \operatorname{tr}\!\left[D^{(g)-1}\left(S^{(g)} - \Sigma^{(g)}\right)\right]^{2}

where D^{(g)} = \operatorname{diag}\!\left(S^{(g)}\right).
596 Appendix B Suppose you have two independent samples and a model for each. Furthermore, suppose that you analyze the two samples simultaneously, but that, in doing so, you impose no constraints requiring any parameter in one model to equal any parameter in the other model. Then, if you minimize (D1a), the parameter estimates obtained from the simultaneous analysis of both groups will be the same as from separate analyses of each group alone.
Appendix C Measures of Fit Model evaluation is one of the most unsettled and difficult issues connected with structural modeling. Bollen and Long (1993), MacCallum (1990), Mulaik, et al. (1989), and Steiger (1990) present a variety of viewpoints and recommendations on this topic. Dozens of statistics, besides the value of the discrepancy function at its minimum, have been proposed as measures of the merit of a model. Amos calculates most of them.
598 Appendix C Measures of Parsimony Models with relatively few parameters (and relatively many degrees of freedom) are sometimes said to be high in parsimony, or simplicity. Models with many parameters (and few degrees of freedom) are said to be complex, or lacking in parsimony. This use of the terms simplicity and complexity does not always conform to everyday usage.
599 Measures of Fit where p is the number of sample moments and q is the number of distinct parameters. Rigdon (1994a) gives a detailed explanation of the calculation and interpretation of degrees of freedom. Note: Use the \df text macro to display the degrees of freedom in the output path diagram. PRATIO The parsimony ratio (James, Mulaik, and Brett, 1982; Mulaik, et al.
600 Appendix C specified model). That is, P is a “p value” for testing the hypothesis that the model fits perfectly in the population. One approach to model selection employs statistical hypothesis testing to eliminate from consideration those models that are inconsistent with the available data. Hypothesis testing is a widely accepted procedure, and there is a lot of experience in its use.
601 Measures of Fit Our opinion...is that this null hypothesis [of perfect fit] is implausible and that it does not help much to know whether or not the statistical test has been able to detect that it is false. (Browne and Mels, 1992, p. 78). See also “PCLOSE” on p. 605. Note: Use the \p text macro for displaying this p value in the output path diagram. CMIN/DF CMIN/DF is the minimum discrepancy, Ĉ , (see Appendix B) divided by its degrees of freedom.
602 Appendix C FMIN FMIN is the minimum value, F̂ , of the discrepancy, F (see Appendix B). Note: Use the \fmin text macro to display the minimum value F̂ of the discrepancy function F in the output path diagram. Measures Based On the Population Discrepancy Steiger and Lind (1980) introduced the use of the population discrepancy function as a measure of model adequacy.
603 Measures of Fit for δ, and δ_U is obtained by solving

\Phi\!\left(\hat{C} \mid \delta, d\right) = .05

for δ, where \Phi(x \mid \delta, d) is the distribution function of the noncentral chi-squared distribution with noncentrality parameter δ and d degrees of freedom. Note: Use the \ncp text macro to display the value of the noncentrality parameter estimate in the path diagram, \ncplo to display the lower 90% confidence limit, and \ncphi for the upper 90% confidence limit.
604 Appendix C error of approximation, called RMS by Steiger and Lind, and RMSEA by Browne and Cudeck (1993).

\text{population RMSEA} = \sqrt{\frac{F_0}{d}} \qquad \text{estimated RMSEA} = \sqrt{\frac{\hat{F}_0}{d}}

The columns labeled LO 90 and HI 90 contain the lower limit and upper limit of a 90% confidence interval on the population value of RMSEA. The limits are given by

\text{LO 90} = \sqrt{\frac{\delta_L}{n\, d}} \qquad \text{HI 90} = \sqrt{\frac{\delta_U}{n\, d}}

Rule of Thumb Practical experience has made us feel that a value of the RMSEA of about 0.
605 Measures of Fit PCLOSE

\text{PCLOSE} = 1 - \Phi\!\left(\hat{C} \mid .05^{2}\, n\, d,\; d\right)

is a p value for testing the null hypothesis that the population RMSEA is no greater than 0.05.

H_0\colon \text{RMSEA} \le .05

By contrast, the p value in the P column (see “P” on p. 599) is for testing the hypothesis that the population RMSEA is 0.

H_0\colon \text{RMSEA} = 0

Based on their experience with RMSEA, Browne and Cudeck (1993) suggest that a RMSEA of 0.05 or less indicates a close fit.
606 Appendix C See also “ECVI” on p. 607. Note: Use the \aic text macro to display the value of the Akaike information criterion in the output path diagram. BCC The Browne-Cudeck (1989) criterion is given by

\text{BCC} = \hat{C} + 2q\, \frac{\displaystyle\sum_{g=1}^{G} b^{(g)}\, \frac{p^{(g)}\left(p^{(g)}+3\right)}{N^{(g)} - p^{(g)} - 2}}{\displaystyle\sum_{g=1}^{G} p^{(g)}\left(p^{(g)}+3\right)}

where b^{(g)} = N^{(g)} - 1 if the Emulisrel6 command has been used, or b^{(g)} = n\,\frac{N^{(g)}}{N} if it has not. BCC imposes a slightly greater penalty for model complexity than does AIC.
607 Measures of Fit Note: Use the \bic text macro to display the value of the Bayes information criterion in the output path diagram. CAIC Bozdogan’s (1987) CAIC (consistent AIC) is given by the formula

\text{CAIC} = \hat{C} + q\left(\ln N^{(1)} + 1\right)

CAIC assigns a greater penalty to model complexity than either AIC or BCC but not as great a penalty as does BIC. CAIC is reported only for the case of a single group where means and intercepts are not explicit model parameters.
608 Appendix C MECVI Except for a scale factor, MECVI is identical to BCC.

\text{MECVI} = \frac{1}{n}\,\text{BCC} = \hat{F} + 2q\, \frac{\displaystyle\sum_{g=1}^{G} a^{(g)}\, \frac{p^{(g)}\left(p^{(g)}+3\right)}{N^{(g)} - p^{(g)} - 2}}{\displaystyle\sum_{g=1}^{G} p^{(g)}\left(p^{(g)}+3\right)}

where a^{(g)} = \frac{N^{(g)} - 1}{N - G} if the Emulisrel6 command has been used, or a^{(g)} = \frac{N^{(g)}}{N} if it has not. See also “BCC” on p. 606. Note: Use the \mecvi text macro to display the modified ECVI statistic in the output path diagram.
609 Measures of Fit

Model                           NPAR   CMIN       DF   P       CMIN/DF
Model A: No Autocorrelation     15     71.544     6    0.000    11.924
Model B: Most General           16     6.383      5    0.271     1.277
Model C: Time-Invariance        13     7.501      8    0.484     0.938
Model D: A and C Combined       12     73.077     9    0.000     8.120
Saturated model                 21     0.000      0
Independence model              6      2131.790   15   0.000   142.119

This things-could-be-much-worse philosophy of model evaluation is incorporated into a number of fit measures.
610 Appendix C

Model                           NPAR   CMIN       DF   P       CMIN/DF
Model A: No Autocorrelation     15     71.544     6    0.000    11.924
Model B: Most General           16     6.383      5    0.271     1.277
Model C: Time-Invariance        13     7.501      8    0.484     0.938
Model D: A and C Combined       12     73.077     9    0.000     8.120
Saturated model                 21     0.000      0
Independence model              6      2131.790   15   0.000   142.119

Looked at in this way, the fit of Model A is a lot closer to the fit of the saturated model than it is to the fit of the independence model.
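As an arithmetic illustration (recalling the standard definition of the normed fit index, NFI = 1 − Ĉ/Ĉ_b, whose symbols the next paragraph describes), the NFI for Model A can be computed directly from the table above:

\text{NFI}_{A} = 1 - \frac{\hat{C}_A}{\hat{C}_b} = 1 - \frac{71.544}{2131.790} \approx 0.966

so on this scale Model A sits roughly 97% of the way from the independence model to the saturated model.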
611 Measures of Fit where Ĉ and d are the discrepancy and the degrees of freedom for the model being evaluated, and Ĉ b and d b are the discrepancy and the degrees of freedom for the baseline model. The RFI is obtained from the NFI by substituting F / d for F. RFI values close to 1 indicate a very good fit. Note: Use the \rfi text macro to display the relative fit index value in the output path diagram.
612 Appendix C CFI The comparative fit index (CFI; Bentler, 1990) is given by

\text{CFI} = 1 - \frac{\text{NCP}}{\text{NCP}_b} = 1 - \frac{\max\!\left(\hat{C} - d,\, 0\right)}{\max\!\left(\hat{C}_b - d_b,\, 0\right)}

where Ĉ, d, and NCP are the discrepancy, the degrees of freedom, and the noncentrality parameter estimate for the model being evaluated, and Ĉ_b, d_b, and NCP_b are the discrepancy, the degrees of freedom, and the noncentrality parameter estimate for the baseline model.
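As a quick numerical check (our arithmetic using the Example 6 table above, not additional Amos output), the CFI for Model A is

\text{CFI}_{A} = 1 - \frac{\max(71.544 - 6,\, 0)}{\max(2131.790 - 15,\, 0)} = 1 - \frac{65.544}{2116.790} \approx 0.969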
613 Measures of Fit PNFI The PNFI is the result of applying James, et al.’s (1982) parsimony adjustment to the NFI

\text{PNFI} = (\text{NFI})(\text{PRATIO}) = \text{NFI}\,\frac{d}{d_b}

where d is the degrees of freedom for the model being evaluated, and d_b is the degrees of freedom for the baseline model. Note: Use the \pnfi text macro to display the value of the parsimonious normed fit index in the output path diagram. PCFI The PCFI is the result of applying James, et al.
614 Appendix C The GFI is given by

\text{GFI} = 1 - \frac{\hat{F}}{\hat{F}_b}

where F̂ is the minimum value of the discrepancy function defined in Appendix B and F̂_b is obtained by evaluating F with \Sigma^{(g)} = 0, g = 1, 2, ..., G. An exception has to be made for maximum likelihood estimation, since (D2) in Appendix B is not defined for \Sigma^{(g)} = 0.
615 Measures of Fit PGFI The PGFI (parsimony goodness-of-fit index), suggested by Mulaik, et al. (1989), is a modification of the GFI that takes into account the degrees of freedom available for testing the model

\text{PGFI} = \text{GFI}\,\frac{d}{d_b}

where d is the degrees of freedom for the model being evaluated, and

d_b = \sum_{g=1}^{G} p^{*(g)}

is the degrees of freedom for the baseline zero model. Note: Use the \pgfi text macro to display the value of the parsimonious GFI in the output path diagram.
616 Appendix C Here are the critical N’s displayed by Amos for each of the models in Example 6:

Model                           HOELTER 0.05   HOELTER 0.01
Model A: No Autocorrelation     164            219
Model B: Most General           1615           2201
Model C: Time-Invariance        1925           2494
Model D: A and C Combined       216            277
Independence model              11             14

Model A, for instance, would have been accepted at the 0.05 level if the sample moments had been exactly as they were found to be in the Wheaton study but with a sample size of 164.
617 Measures of Fit The smaller the RMR is, the better. An RMR of 0 indicates a perfect fit. The following output from Example 6 shows that, according to the RMR, Model A is the best among the models considered except for the saturated model:

Model                           RMR      GFI     AGFI    PGFI
Model A: No Autocorrelation     0.284    0.975   0.913
Model B: Most General           0.757    0.998   0.990
Model C: Time-Invariance        0.749    0.997   0.993
Model D: A and C Combined       0.263    0.975   0.941
Saturated model                 0.000    1.000
Independence model              12.342   0.494
Appendix D Numeric Diagnosis of Non-Identifiability In order to decide whether a parameter is identified or an entire model is identified, Amos examines the rank of the matrix of approximate second derivatives and of some related matrices. The method used is similar to that of McDonald and Krane (1977). There are objections to this approach in principle (Bentler and Weeks, 1980; McDonald, 1982). There are also practical problems in determining the rank of a matrix in borderline cases.
Appendix E Using Fit Measures to Rank Models In general, it is hard to pick a fit measure because there are so many from which to choose. The choice gets easier when the purpose of the fit measure is to compare models to each other rather than to judge the merit of models by an absolute standard. For example, it turns out that it does not matter whether you use RMSEA, RFI, or TLI when rank ordering a collection of models.
622 Appendix E

\text{NCP} = \max\!\left(\hat{C} - d,\, 0\right)

F_0 = \hat{F}_0 = \max\!\left(\frac{\hat{C} - d}{n},\, 0\right)

\text{CFI} = 1 - \frac{\max\!\left(\hat{C} - d,\, 0\right)}{\max\!\left(\hat{C}_b - d_b,\, \hat{C} - d,\, 0\right)}

\text{RNI} = 1 - \frac{\hat{C} - d}{\hat{C}_b - d_b} \quad \text{(not reported by Amos)}

The following fit measures depend monotonically on Ĉ and not at all on d. The specification search procedure reports only Ĉ as representative of them all.
623 Using Fit Measures to Rank Models Each of the following fit measures is capable of providing a unique rank order of models. The rank order depends on the choice of baseline model as well. The specification search procedure does not report these measures.

IFI = \Delta_2
PNFI
PCFI

The following fit measures are the only ones reported by Amos that are not functions of Ĉ and d in the case of maximum likelihood estimation. The specification search procedure does not report these measures.
Appendix F Baseline Models for Descriptive Fit Measures Seven measures of fit (NFI, RFI, IFI, TLI, CFI, PNFI, and PCFI) require a null or baseline bad model against which other models can be compared. The specification search procedure offers a choice of four null, or baseline, models: Null 1: The observed variables are required to be uncorrelated. Their means and variances are unconstrained.
626 Appendix F To specify which baseline models you want to be fitted during specification searches: E From the menus, choose Analyze > Specification Search. E Click the Options button on the Specification Search toolbar. E In the Options dialog box, click the Next search tab. The four null models and the saturated model are listed in the Benchmark models group.
Appendix G Rescaling of AIC, BCC, and BIC The fit measures, AIC, BCC, and BIC, are defined in Appendix C. Each measure is of the form Ĉ + kq , where k takes on the same value for all models. Small values are good, reflecting a combination of good fit to the data (small Ĉ ) and parsimony (small q). The measures are used for comparing models to each other and not for judging the merit of a single model.
628 Appendix G The rescaled values are either 0 or positive. For example, the best model according to AIC has AIC 0 = 0 , while inferior models have positive AIC 0 values that reflect how much worse they are than the best model. E To display AIC 0 , BCC 0 , and BIC 0 after a specification search, click on the Specification Search toolbar. E On the Current results tab of the Options dialog box, click Zero-based (min = 0).
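In symbols (a restatement of the rescaling just described, using our notation), for each model m in the set being compared the zero-based values are

\text{AIC}_0(m) = \text{AIC}(m) - \min_{k} \text{AIC}(k), \qquad \text{BCC}_0(m) = \text{BCC}(m) - \min_{k} \text{BCC}(k), \qquad \text{BIC}_0(m) = \text{BIC}(m) - \min_{k} \text{BIC}(k)

so the best model in each column receives 0 and every other model receives the amount by which it falls short of the best.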
629 Rescaling of AIC, BCC, and BIC Akaike Weights and Bayes Factors (Max = 1) E To obtain the following rescaling, select Akaike weights and Bayes factors (max = 1) on the Current results tab of the Options dialog box.
Notices This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used.
632 Notices The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
633 Notices Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Bibliography Akaike, H. 1973. Information theory and an extension of the maximum likelihood principle. In: Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csaki, eds. Budapest: Akademiai Kiado. 267–281. ______. 1978. A Bayesian analysis of the minimum AIC procedure. Annals of the Institute of Statistical Mathematics, 30: 9–14. ______. 1987. Factor analysis and AIC. Psychometrika, 52: 317–332. Allison, P. D. 2002. Missing data. Thousand Oaks, CA: Sage Publications.
636 Bibliography Attig, M. S. 1983. The processing of spatial information by adults. Presented at the annual meeting of The Gerontological Society, San Francisco. Beale, E. M. L., and R. J. A. Little. 1975. Missing values in multivariate analysis. Journal of the Royal Statistical Society Series B, 37: 129–145. Beck, A. T. 1967. Depression: causes and treatment. Philadelphia, PA: University of Pennsylvania Press. Bentler, P. M. 1980. Multivariate analysis with latent variables: Causal modeling.
637 Bibliography Bollen, K. A., and J. S. Long, eds. 1993. Testing structural equation models. Newbury Park, CA: Sage Publications. Bollen, K. A., and R. A. Stine. 1992. Bootstrapping goodness-of-fit measures in structural equation models. Sociological Methods and Research, 21: 205–229. Bolstad, W. M. 2004. Introduction to Bayesian Statistics. Hoboken, NJ: John Wiley and Sons. Boomsma, A. 1987. The robustness of maximum likelihood estimation in structural equation models.
638 Bibliography Carmines, E. G., and J. P. McIver. 1981. Analyzing models with unobserved variables. In: Social measurement: Current issues, G. W. Bohrnstedt and E. F. Borgatta, eds. Beverly Hills: Sage Publications. Cattell, R. B. 1966. The scree test for the number of factors. Multivariate Behavioral Research, 1: 245–276. Celeux, G., M. Hurn, and C. P. Robert. 2000. Computational and inferential difficulties with mixture posterior distributions.
639 Bibliography Draper, N. R., and H. Smith. 1981. Applied regression analysis. 2nd ed. New York: John Wiley and Sons. Edgington, E. S. 1987. Randomization Tests. 2nd ed. New York: Marcel Dekker. Efron, B. 1979. Bootstrap methods: Another look at the jackknife. Annals of Statistics, 7: 1–26. ______. 1982. The jackknife, the bootstrap, and other resampling plans. (SIAM Monograph #38) Philadelphia: Society for Industrial and Applied Mathematics. ______. 1987. Better bootstrap confidence intervals.
640 Bibliography Graham, J. W., S. M. Hofer, and D. P. MacKinnon. 1996. Maximizing the usefulness of data obtained with planned missing value patterns: An application of maximum likelihood procedures. Multivariate Behavorial Research, 31: 197–218. Gulliksen, H., and J. W. Tukey. 1958. Reliability for the law of comparative judgment. Psychometrika, 23: 95–110. Hamilton, L. C. 1990. Statistics with Stata. Pacific Grove, CA: Brooks/Cole. Hamilton, M. 1960. A rating scale for depression.
641 Bibliography Jöreskog, K. G. 1967. Some contributions to maximum likelihood factor analysis. Psychometrika, 32: 443–482. ______. 1969. A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34: 183–202. ______. 1971. Simultaneous factor analysis in several populations. Psychometrika, 36: 409–426. ______. 1979. A general approach to confirmatory maximum likelihood factor analysis with addendum. In: Advances in factor analysis and structural equation models, K. G.
642 Bibliography Little, R. J. A., and D. B. Rubin. 1987. Statistical analysis with missing data. New York: John Wiley and Sons. ______. 1989. The analysis of social science data with missing values. Sociological Methods and Research, 18: 292–326. ______. 2002. Statistical analysis with missing data. New York: John Wiley and Sons. Little, R. J. A., and N. Schenker. 1995. Missing data. In: Handbook of statistical modeling for the social and behavioral sciences, G. Arminger, C. C. Clogg, and M. E.
643 Bibliography Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Research, 27: 209–220. Mantel, N., and R. S. Valand. 1970. A technique of nonparametric multivariate analysis. Biometrics, 26: 47–558. Mardia, K. V. 1970. Measures of multivariate skewness and kurtosis with applications. Biometrika, 57: 519–530. ______. 1974. Applications of some measures of multivariate skewness and kurtosis in testing normality and robustness studies.
644 Bibliography Mulaik, S. A. 1990. An analysis of the conditions under which the estimation of parameters inflates goodness of fit indices as measures of model validity. Paper presented at the Annual Meeting, Psychometric Society, Princeton, New Jersey, June 28–30, 1990. Mulaik, S. A., L. R. James, J. Van Alstine, N. Bennett, S. Lind, and C. D. Stilwell. 1989. Evaluation of goodness-of-fit indices for structural equation models. Psychological Bulletin, 105: 430–445. Muthén, B., D. Kaplan, and M. Hollis.
645 Bibliography Schafer, J. L., and M. K. Olsen. 1998. Multiple imputation for multivariate missing-data problems: A data analyst's perspective. Multivariate Behavioral Research, 33:4, 545–571. Schwarz, G. 1978. Estimating the dimension of a model. The Annals of Statistics, 6: 461–464. Scheines, R., H. Hoijtink, and A. Boomsma. 1999. Bayesian estimation and testing of structural equation models. Psychometrika, 64: 37–52. Shrout, P. E., and N. Bolger. 2002.
646 Bibliography Tanaka, J. S., and G. J. Huba. 1985. A fit index for covariance structure models under arbitrary GLS estimation. British Journal of Mathematical and Statistical Psychology, 38: 197–201. ______. 1989. A general coefficient of determination for covariance structure models under arbitrary GLS estimation. British Journal of Mathematical and Statistical Psychology, 42: 233–239. Tucker, L. R., and C. Lewis. 1973. A reliability coefficient for maximum likelihood factor analysis.
Index additive constant (intercept), 221 ADF, asymptotically distribution-free, 594 admissibility test in Bayesian estimation, 420 AGFI, adjusted goodness-of-fit index, 614 AIC Akaike information criterion, 309, 605 Burnham and Anderson’s guidelines for, 326 Akaike weights, 628, 629 interpreting, 328 viewing, 327 alternative to analysis of covariance, 145, 241 Amos Graphics, launching, 9 AmosEngine methods, 57 analysis of covariance, 147 alternative to, 145, 241 comparison of methods, 256 Anderson iris dat
648 Index category boundaries, 495 censored data, 475 CFI, comparative fit index, 612 change default behavior, 243 defaults, 243 fonts, 27 orientation of drawing area, 86 chi-square probability method, 281 chi-square statistic, 53 display in figure caption, 53 classification errors, 536 CMIN minimum discrepancy function C, 120, 599 table, 368 CMIN/DF, minimum discrepancy function divided by degrees of freedom, 601 combining results of multiply imputed data files, 471 common factor analysis model, 139 commo
649 Index draw covariances, 190 drawing area add covariance paths, 90 add unobserved variable, 90 change orientation of, 86 viewing measurement weights, 366 duplicate measurement model, 88 ECVI, expected cross-validation index, 607 endogenous variables, 69, 76 EQS (SEM program), 243 equality constraints, 140 equation format for AStructure method, 78 establishing covariances, 27 estimate means and intercepts option when not selected, 212 when selected, 212 estimating indirect effects, 425 means, 209 varian
650 Index indirect effects, 122 estimating, 425 finding a confidence interval for, 431 viewing standardized, 427 inequality constraints on data, 481, 488 information-theoretic measures of fit, 605 iris data, 521, 539 journals about structural equation modeling, 5 just-identified model, 73 label output, 51 variances and covariances, 191 label switching, 554, 576 latent structure analysis, 537, 553 latent variable posterior predictive distribution, 511 linear dependencies, 69 LISREL (SEM program), 243 list
651 Index move objects, 15 multiple imputation, 462 multiple models in a single analysis, 116 multiple-group analysis, 377 multiple-group factor analysis, 363 multiply imputed data file, combining results, 471 multiply imputed datasets, 469 multivariate analysis of variance, 216 naming groups, 196 variables, 26 NCP, noncentrality parameter, 602 negative variances, 153 nested models, 260 new group, 56, 77, 172 NFI, normed fit index, 609 NNFI, non-normed fit index, 611 non-diffuse prior distribution, 409 no
652 Index PGFI, parsimony goodness-of-fit index, 615 Plot window display best-fit graphs, 339 scree plot, 340 PNFI, parsimonious normed fit index, 613 point of diminishing returns, 332, 339, 342 population discrepancy measure of model adequacy, 602 posterior distribution, 385 mean, 386 standard deviation, 386 posterior predictive distribution, 481, 506, 535, 551, 572 for a latent variable, 511 PRATIO, parsimony ratio, 599 predictive distribution.
653 Index RMSEA, 621 viewing fit measures, 323 with few optional arrows, 320 specify benefits of equal parameters, 44 equal paramaters, 43 group name in figure caption, 176 specifying group differences conventions, 161 squared multiple correlation, 144 stability index, 135 stability test in Bayesian estimation, 420 stable model, 135 standardized estimates, 33, 132 obtain, 142 view, 143 statistical hypothesis testing, 104 stochastic regression imputation, 461 structural covariances, 365 structural equation