Society for Mathematical Psychology

Dr. Rich Shiffrin

Lord (1967) published a two-page paper with simple data presented in a graph. He wished to show the absurdity of using ANCOVA, without good reason, to reach a scientific conclusion. The use of ‘paradox’ in the title misled people into thinking that the use of ANCOVA might have been valid, so Lord (1969) published another two-page paper to clarify. That did not end the confusion. Every few years since 1969, statisticians and causal theorists have published long articles arguing that Lord was wrong (as would be the very many scientists who agree with him) and that ANCOVA could be justified for the data Lord presented. This history illustrates the divide between scientific inference and statistical inference, which is closely related to the difference between deduction (statistics) and induction/abduction (science). It is telling that not one of the many publications since 1969 has shown a model capable of both generating the data shown in Lord’s original paper and justifying the ANCOVA conclusions. Rather, theoretical arguments have been given that there ought to be one. Scientists, of course, build theories; their theories are approximations to reality, but they attempt to explain, in the simplest way consistent with present and past data, the primary causal mechanisms that are operating to produce the data. The many statisticians and causal theorists analyzing Lord’s paradox since 1969 seem to have missed this point.

This is an in-person presentation on July 19, 2023 (09:00 ~ 09:20 UTC).

Richard Morey

In recent discussions about the replication crisis, statistical inference looms large; claims about the misuse of classical significance testing, lax statistical evidence standards, non-replication (defined in a variety of statistical ways), and meta-analysis (statistical inference from statistical inferences) all involve statistical inference in some way. This is not surprising, since statistical inference has become one of the main tools for scientists since Fisher made it popular in the early 20th century. Arguments over the "right" way of approaching statistical inference give it outsized importance. I argue that, in fact, we cannot make statistical inferences except in trivial cases, and that all meaningful scientific inferences are non-statistical in nature. There is no unique, or obvious, mapping between a statistical "inference" and a scientific one; unfortunately, scientists have largely offloaded responsibility for their scientific inferences onto statistical theories that were not meant for the job. This point is not really new (Fisher made it in attacking Neyman and Pearson in 1955), and researchers often pay lip service to it when convenient (e.g., quoting Box, 1976: “All models are wrong…”). Statistical inference should be regarded as a mechanism for generating useful toys (Hennig, 2020) to introduce scepticism into scientific inferences, and no more. This does not mean inferential statistics are mere descriptive statistics, but the primacy of inferential statistics in the interpretation of scientific data must be questioned (see also Amrhein, Trafimow, & Greenland, 2019).

This is an in-person presentation on July 19, 2023 (09:20 ~ 09:35 UTC).

Eric-Jan Wagenmakers

Preregistration inoculates researchers against the myriad biases that all humans inevitably succumb to: hindsight bias, confirmation bias, motivated reasoning, the bias blind spot, and many more. Without preregistration, researchers are not alerted to the fact that they are cherry-picking among hypotheses or among likelihood functions. Without preregistration, the probability that an ESP researcher reports the absence of ESP is about as low as the probability that a mathematical psychologist reports that the data undercut their pet model and support that of their rival (has it ever happened?). As a Ulysses contract, however, preregistration may tie the researcher to the mast a little too tightly: unexpected patterns in the data may demand a different analysis than was originally foreseen, and the penalty of classifying the new analysis as "exploratory" is overly harsh. There is considerable promise in two alternative Ulysses contracts: analysis blinding and the mini-multi-analysts approach. The feasibility of these contracts will be discussed.

This is an in-person presentation on July 19, 2023 (09:35 ~ 09:50 UTC).

Dr. Dora Matzke

The “crisis of confidence” in psychological research is fueled by concerns about the replicability of key results and the widespread use of questionable research practices, such as the selective reporting of significant results. The controversy has drawn widespread public attention and triggered a broad range of attempts to identify and remedy the factors that contributed to the crisis. Although the proposed recommendations vary considerably in focus, they often aim to restrict researchers’ degrees of freedom and analytic flexibility. In this talk, I argue that psychology’s reform movement cannot succeed in the absence of profound changes in the present academic culture and incentive system. As long as academic journals prefer strong claims and clean stories as opposed to the messy reality, and as long as funding agencies and universities make their decisions based on performance metrics valuing quantity over quality, researchers are unlikely to resist the temptation to take shortcuts, exaggerate claims, and aim for high-impact journals that place more emphasis on novelty than rigor.

This is an in-person presentation on July 19, 2023 (09:50 ~ 10:05 UTC).

Don van Ravenzwaaij

Recently, the argument has been made that “what is called testing may have its place in inference […], but it actually is just one way of describing one’s belief with respect to the possible values of a parameter. Instead, we recommend estimation of the full posterior distribution […].” (Tendeiro & Kiers, 2019). In this talk, I will argue that, in many cases, statistical testing is a necessary precursor to parameter estimation. I will structure my talk around two main arguments: (1) the principle of parsimony; and (2) the dependence of the effect size on the specifics of the experimental set-up.

This is an in-person presentation on July 19, 2023 (10:05 ~ 10:20 UTC).

Plenary Session

Plenary Discussion on Scientific and Statistical Inference

This is an in-person presentation on July 19, 2023 (10:20 ~ 11:00 UTC).

Symposium: Scientific Inference & Statistical Inference