Statistics & Individual Differences
Dr. Judith ter Schure
Prof. Peter Grünwald
In this talk we introduce the Anytime Live and Leading Interim (ALL-IN) meta-analysis approach, which allows experimenters to adaptively design multi-lab replication experiments in a safe manner. In this context, safe refers to the fact that the procedure comes with explicit guarantees on the tolerable type I error rate *during* data collection. This approach to meta-analysis is based on a so-called meta e-value that combines the currently available evidence across all studies. At any moment in time, experimenters can safely consult this meta e-value to decide whether it is worthwhile to (i) extend the meta-analysis with another (replication) study, (ii) continue recruiting participants in the currently active studies, or (iii) stop the data collection of all studies because the combined evidence is already compelling. It is important to note that, although this data-driven approach leads to interdependent studies, the statistical inferences from ALL-IN meta-analyses remain valid. This is unfortunately not the case for conventional meta-analyses based on p-values and confidence intervals: when used adaptively, they inflate the type I error rate and thus have a high chance of mistaking random noise for structural effects. Statistically reliable use of conventional methods therefore requires experimenters to confine themselves to rigid designs, in which a meta-analysis is only conducted retrospectively once all studies (whether too few or too many) are completed. The ALL-IN procedure frees experimenters from the statistical shackles imposed by conventional methods and empowers them with a more flexible, efficient, and safe approach to conducting meta-analyses.
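As a rough illustration of the evidence-combination rule described above (a minimal sketch, not the authors' implementation), the following assumes each study supplies an e-value, multiplies them into a meta e-value, and checks the running product against 1/alpha; by Ville's inequality this monitoring can happen at any time without pushing the type I error rate above alpha.

```python
# Minimal illustrative sketch of anytime-valid monitoring with e-values.
# NOT the authors' ALL-IN software; it only assumes that each study supplies
# an e-value (expected value <= 1 under the null) and that independent
# e-values may be multiplied into a meta e-value.

def monitor(study_e_values, alpha=0.05):
    """Check the running meta e-value after every study.

    Rejecting the null as soon as the running product exceeds 1/alpha keeps
    the type I error below alpha, no matter when or how often the combined
    evidence is consulted.
    """
    running = 1.0
    for i, e in enumerate(study_e_values, start=1):
        running *= e
        if running >= 1.0 / alpha:
            return f"stop after study {i}: evidence compelling (meta e-value = {running:.1f})"
    return f"continue or extend: meta e-value = {running:.1f}"

# Hypothetical e-values from three replication studies
print(monitor([2.5, 4.0, 3.0]))   # product 30 >= 20, so stop after study 3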
This is an in-person presentation on July 21, 2023 (11:00 ~ 11:20 UTC).
Mr. Max Maier
Publication bias is a well-recognized threat to research synthesis. Although a variety of models have been proposed to adjust for publication bias, no single model performs well across different meta-analytic conditions (Carter et al., 2019; Hong & Reed, 2021). One possible remedy to this problem lies in Bayesian model averaging with Robust Bayesian Meta-Analysis (RoBMA; Maier et al., 2022). RoBMA addresses publication bias by averaging over 36 candidate models of the publication process and was shown to perform well under diverse conditions (Bartoš et al., 2022). In this talk, we extend RoBMA to meta-regression settings. The newly introduced moderator analyses enable testing for the presence as well as the absence of continuous and categorical moderators using Bayes factors. This advances existing frequentist methodologies by allowing researchers to also make claims about evidence for the absence of a moderator (rather than the mere absence of evidence implied by a nonsignificant p-value). Furthermore, RoBMA's meta-regression model-averages not only over the different publication process models but also over the included moderators. Consequently, researchers can draw inferences about each moderator while accounting for the uncertainty in the remaining moderators. We evaluate the performance of the developed methodology in a simulation study and illustrate it with an example.
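The toy sketch below shows the general Bayesian model-averaging logic that such an approach relies on; it is not the RoBMA R package, and the model set, marginal likelihoods, and effect estimates are hypothetical, chosen only to illustrate how posterior model probabilities, model-averaged estimates, and an inclusion Bayes factor for a moderator fit together.

```python
# Toy sketch of Bayesian model averaging over candidate models.
# All numbers below are hypothetical; this is not the RoBMA R package.
import numpy as np

# Hypothetical candidate models: whether a moderator is included,
# the model's marginal likelihood, and its conditional effect estimate.
models = [
    {"moderator": True,  "marg_lik": 0.012, "effect": 0.30},
    {"moderator": True,  "marg_lik": 0.008, "effect": 0.25},
    {"moderator": False, "marg_lik": 0.010, "effect": 0.10},
    {"moderator": False, "marg_lik": 0.006, "effect": 0.05},
]
prior = np.full(len(models), 1.0 / len(models))   # equal prior model probabilities

marg_lik = np.array([m["marg_lik"] for m in models])
post = prior * marg_lik
post /= post.sum()                                # posterior model probabilities

# Model-averaged effect: each model's estimate weighted by its posterior probability.
avg_effect = sum(p * m["effect"] for p, m in zip(post, models))

# Inclusion Bayes factor for the moderator: posterior odds over prior odds of the
# models that include it versus those that do not.
inc = np.array([m["moderator"] for m in models])
inclusion_bf = (post[inc].sum() / post[~inc].sum()) / (prior[inc].sum() / prior[~inc].sum())

print(f"posterior model probabilities: {np.round(post, 3)}")
print(f"model-averaged effect: {avg_effect:.3f}")
print(f"inclusion BF for the moderator: {inclusion_bf:.2f}")
```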
This is an in-person presentation on July 21, 2023 (11:20 ~ 11:40 UTC).
Dr. Marieke Van Vugt
Dr. Craig Hedge
Dr. Aline Bompas
Behavioural performance shows substantial endogenous variability over time, regardless of the task at hand. This intra-individual variability is a reliable marker of individual differences. Of growing interest to psychologists is the realisation that this variability is not fully random but typically exhibits short- and longer-range temporal structures. However, the measurement of these temporal structures comes with several controversies, and their potential benefit for studying individual differences in healthy and clinical populations remains unclear. The interpretation of these structures is also controversial. Behavioural variability is often implicitly associated with fluctuations in attentional focus, which also varies over time between on- and off-task states (e.g., “I was performing poorly because my mind was wandering elsewhere”). However, empirical evidence for this intuition is lacking. In the current research, we analyse the temporal structures in reaction time (RT) series from new and archival datasets, using 11 different sensorimotor and cognitive tasks across 526 participants. We first investigate the intra-individual repeatability of the most common measures of temporal structures. Secondly, we examine how inter-individual differences in these measures relate to: 1) task performance assessed from the same data, 2) meta-cognitive ratings of on-taskness from thought probes occasionally presented throughout the task, and 3) self-assessed attention-deficit related traits. Across all datasets, autocorrelation at lag 1 and the Power Spectral Density slope showed high intra-individual repeatability across sessions and correlated with task performance. The Detrended Fluctuation Analysis slope showed the same pattern, but less reliably. The long-term component (d) of the ARFIMA(1,d,1) model showed poor repeatability and no correlation with performance. These measures failed to show external validity when correlated with either mean subjective attentional state or self-assessed traits between participants. Overall, temporal structures may be stable within individuals over time, but their relationship with subjective state and trait measures of attention remains elusive.
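For readers unfamiliar with the first two measures named above, the sketch below computes lag-1 autocorrelation and a Power Spectral Density slope on a simulated RT series. The simulation and parameter choices are illustrative only and are not the authors' analysis pipeline.

```python
# Minimal sketch of two temporal-structure measures on a simulated RT series.
# The AR(1) simulation and all parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 1024
# Hypothetical RT series (ms) with some temporal dependence.
rt = np.empty(n)
rt[0] = 500.0
for t in range(1, n):
    rt[t] = 500.0 + 0.4 * (rt[t - 1] - 500.0) + rng.normal(0.0, 50.0)

# Lag-1 autocorrelation: correlation of the series with itself shifted by one trial.
x = rt - rt.mean()
ac1 = np.sum(x[1:] * x[:-1]) / np.sum(x * x)

# PSD slope: slope of log power against log frequency, estimated here with a
# simple periodogram and a least-squares line fit.
freqs = np.fft.rfftfreq(n, d=1.0)[1:]          # drop the zero frequency
power = np.abs(np.fft.rfft(x))[1:] ** 2
psd_slope = np.polyfit(np.log(freqs), np.log(power), 1)[0]

print(f"lag-1 autocorrelation: {ac1:.3f}")
print(f"PSD slope (log-log):   {psd_slope:.3f}")
```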
This is an in-person presentation on July 21, 2023 (11:40 ~ 12:00 UTC).
Prof. Andy Wills
This work explores model adequacy as a function of heterogeneity, prediction, and a priori likelihood. Models are often evaluated when their behaviour is at its closest to a single group-averaged empirical result. This evaluation neglects the fact that both models and humans are heterogeneous: their behavioural repertoire is not restricted to a single unit of behaviour but is composed of a range of distinguishable behaviours, i.e., ordinal patterns. In this framework, we develop a measure, g-distance, that takes model adequacy to be the extent to which a model exhibits a similar range of behaviours to the humans it models. We then apply this framework to models of an irrational learning effect, the inverse base-rate effect, comparing six models. In the process of analysing the human data, we show that the canonical averaged group-level empirical result hides theoretically important and robust relationships between pairs of stimuli, which are amongst the most commonly observed ordinal results at the subject level. The models are unable to accommodate these relationships and all perform uniformly poorly in our benchmark. In addition, the model that best accommodates human behaviour also predicts almost all unobserved possible behaviours, and all models predicted many more unobserved behaviours than they accommodated already observed ones. We discuss how well these models can be said to approximate human behaviour in the inverse base-rate paradigm if most of the behaviours they produce are not exhibited by humans. Finally, we propose various new avenues for formal computational modelling by clearly defining a handful of scientific problems.
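As a purely hypothetical illustration of the idea of comparing behavioural ranges (not the paper's formal definition of g-distance), the sketch below tallies which ordinal patterns a model accommodates, misses, and overpredicts relative to those humans exhibit; the pattern sets are made up.

```python
# Hypothetical illustration of comparing a model's range of ordinal patterns
# with the range humans exhibit. Not the paper's g-distance; the pattern sets
# and the summary below are invented to show the kind of bookkeeping involved.

# Each ordinal pattern labels a distinguishable behaviour
# (e.g., an ordering of responses to stimulus pairs).
human_patterns = {"A>B>C", "A>C>B", "B>A>C"}           # observed in participants
model_patterns = {"A>B>C", "C>A>B", "C>B>A", "B>C>A"}  # produced by a model

accommodated  = model_patterns & human_patterns   # human behaviours the model reproduces
missed        = human_patterns - model_patterns   # human behaviours the model cannot show
overpredicted = model_patterns - human_patterns   # model behaviours no human showed

print(f"accommodated:  {sorted(accommodated)}")
print(f"missed:        {sorted(missed)}")
print(f"overpredicted: {sorted(overpredicted)}")

# One simple summary: the fraction of all patterns (human or model) on which the
# two sets disagree; smaller means the model's behavioural range is closer to humans'.
disagreement = len(missed | overpredicted) / len(model_patterns | human_patterns)
print(f"disagreement:  {disagreement:.2f}")
```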
This is an in-person presentation on July 21, 2023 (12:00 ~ 12:20 UTC).