Using cross-validation to evaluate model components: The case of visual working memory
Mathematical models are frequently used to formalize and test theories of psychological processes. When there are multiple competing models, the scientific question becomes one of model selection: how to select the model that most likely represents the underlying data-generating process? One common method is to select the model that strikes the best compromise between goodness of fit and complexity, for example by penalizing model fit by the number of parameters (e.g., AIC, BIC). The idea behind such an approach is that a model that fits the data well but is not too complex likely generalizes well to new data. A more direct way of evaluating a model’s ability to generalize to new data is cross-validation: each model is repeatedly fit to a subset of the data, and the results of that fit are used to predict the subset of the data that was not used for fitting. We compared both methods of model selection in the domain of visual working memory. The theoretical debates in this domain are reflected in the components of its formal models: guessing processes, item limits, the stability of memory across trials, etc. We selected a number of common model variations and compared them using both AIC (which is commonly used in the field) and three types of cross-validation. Our results suggest that both methods largely lead to the same theoretical inferences about the nature of memory. However, numerical issues commonly occur when fitting more complex model variants, which complicates model selection and inference.
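To make the contrast between the two selection criteria concrete, here is a minimal sketch in Python. It is a toy illustration only, not the models or procedure used in the talk: the two Gaussian models, the simulated data, and all function names (`fit_fixed_mean`, `fit_free_mean`, `kfold_cv_loglik`) are assumptions introduced for this example. AIC is computed as 2k - 2 ln L on the full data set, while k-fold cross-validation sums the log-likelihood of held-out data under parameters fit to the remaining data.

```python
import math
import random


def gauss_loglik(data, mu, sigma):
    """Summed log-likelihood of data under a Normal(mu, sigma^2) model."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )


def fit_fixed_mean(data):
    """Hypothetical toy model A: mean fixed at 0, one free parameter (sigma)."""
    sigma = math.sqrt(sum(x ** 2 for x in data) / len(data))
    return 0.0, sigma


def fit_free_mean(data):
    """Hypothetical toy model B: two free parameters (mu and sigma)."""
    mu = sum(data) / len(data)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))
    return mu, sigma


def aic(data, fit, n_params):
    """AIC = 2k - 2 ln L at the maximum-likelihood estimate (lower is better)."""
    mu, sigma = fit(data)
    return 2 * n_params - 2 * gauss_loglik(data, mu, sigma)


def kfold_cv_loglik(data, fit, k=5):
    """Summed held-out log-likelihood over k folds (higher is better)."""
    total = 0.0
    for i in range(k):
        test = data[i::k]  # every k-th observation forms the held-out fold
        train = [x for j, x in enumerate(data) if j % k != i]
        mu, sigma = fit(train)  # fit on the training folds only
        total += gauss_loglik(test, mu, sigma)  # score on unseen data
    return total


# Simulated data (an assumption for illustration): true mean 1, true SD 1.
random.seed(0)
data = [random.gauss(1.0, 1.0) for _ in range(200)]

models = {"fixed_mean": (fit_fixed_mean, 1), "free_mean": (fit_free_mean, 2)}
results = {
    name: {"aic": aic(data, fit, n_params), "cv_loglik": kfold_cv_loglik(data, fit)}
    for name, (fit, n_params) in models.items()
}
best_by_aic = min(results, key=lambda m: results[m]["aic"])
best_by_cv = max(results, key=lambda m: results[m]["cv_loglik"])
print(best_by_aic, best_by_cv)
```

In this toy case the data are generated with a nonzero mean, so both criteria should favor the model with a free mean. With real working-memory models the fits would typically need a numerical optimizer rather than closed-form estimates, which is where numerical issues of the kind mentioned in the abstract could arise.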
Very nice talk on an important but rarely brought-up issue in model comparison. I really liked your inclusion of LOSsO-CV, which is an actual PREDictive measure, unlike the many theory-derived & statistically-inferred POSTdictive measures such as AIC, BIC, standard CV, and even MDL. LOSsO-CV is similar, at least in spirit, to Three-way CV or 3CV in...