Xi, X., Higgins, D., Zechner, K., & Williamson, D. (2012). A comparison of two scoring methods for an automated speech scoring system. Language Testing, 29(3), 371-394.
Abstract: This paper compares two alternative scoring methods -- multiple regression and classification trees -- for an automated speech scoring system used in a practice environment. The two methods were evaluated on two criteria: construct representation and empirical performance in predicting human scores. The empirical performance of the two scoring models is reported in Zechner, Higgins, Xi, & Williamson (2009), which discusses the development of the entire automated speech scoring system; the current paper shifts the focus to the comparison of the two scoring methods, elaborating both technical and substantive considerations and providing a reasoned argument for the trade-off between them. We concluded that a multiple regression model with expert weights was superior to the classification tree model. In addition to comparing the relative performance of the two models, we also evaluated the adequacy of the regression model for the intended use. In particular, the construct representation of the model was sufficiently broad to justify its use in a low-stakes application. The correlation of the model-predicted total test scores with human scores (r = 0.7) was also deemed acceptable for practice purposes. [Reprinted by permission of Sage Publications, Ltd., copyright holder.]
Keywords: applied linguistics, language testing and assessment, English as a Second Language Tests, Automatic Speaker Recognition