How to create within-subject scatter plots in R with ggplot2

Today, we'll take a look at creating a specific type of visualization for data from a within-subjects experiment. You'll often see within-subject data visualized as bar graphs (condition means, and maybe the mean difference if you're lucky). But alternatives exist, and this post focuses on one of them: within-subjects scatterplots.
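As a rough illustration of the idea (not necessarily the approach taken in the full post), a within-subject scatterplot puts each subject's score in one condition on the x-axis and their score in the other condition on the y-axis, with an identity line for reference. The data below are simulated, and the column names `control` and `treatment` are assumptions made up for this sketch.

```r
library(ggplot2)

set.seed(1)
n <- 30

# Simulated (hypothetical) data: one row per subject, one column per condition
d <- data.frame(
  subject = factor(seq_len(n)),
  control = rnorm(n, mean = 500, sd = 50)
)
d$treatment <- d$control + rnorm(n, mean = 30, sd = 25)

# Use the same limits on both axes so the identity line runs corner to corner
lims <- range(c(d$control, d$treatment))

p <- ggplot(d, aes(x = control, y = treatment)) +
  # Points above the dashed line scored higher in the treatment condition
  geom_abline(intercept = 0, slope = 1, linetype = "dashed") +
  geom_point() +
  coord_fixed(xlim = lims, ylim = lims) +
  labs(x = "Control", y = "Treatment")
p
```

Each point is one subject, so the vertical distance from the dashed line shows that subject's difference between conditions, which a bar graph of condition means would hide.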

How to Compare Two Groups with Robust Bayesian Estimation Using R, Stan and brms

2017 will be the year when social scientists finally decide to diversify their applied statistics toolbox, and stop relying 100% on null hypothesis significance testing (NHST). A very appealing alternative to NHST is Bayesian statistics, which in itself contains many approaches to statistical inference. In this post, I provide an introductory and practical tutorial to Bayesian parameter estimation in the context of comparing two independent groups' data.

How to arrange ggplot2 panel plots

“Panel plot” is a common name for a figure showing every person’s (or whatever your sampling unit is) data in its own little panel. This plot is sometimes also known as “small multiples”, although that more commonly refers to plots that illustrate interactions. Here, I’ll illustrate how to add information to a panel plot by arranging the panels according to some meaningful value. Here’s an example of a panel plot, using the sleepstudy data set from the lme4 package.
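A minimal sketch of the idea, assuming the ggplot2 and lme4 packages are installed: the panels follow the order of the factor levels, so reordering the `Subject` factor by a summary value rearranges the panels. Ordering by each subject's mean reaction time is an assumption chosen for illustration; the full post may order by a different value.

```r
library(ggplot2)

# sleepstudy ships with lme4; only the data set is needed here
data("sleepstudy", package = "lme4")

# Reorder the Subject levels by each subject's mean reaction time (ascending).
# facet_wrap() lays out panels in level order, so this sorts the panels.
sleepstudy$Subject <- reorder(sleepstudy$Subject, sleepstudy$Reaction, FUN = mean)

p <- ggplot(sleepstudy, aes(x = Days, y = Reaction)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ Subject, nrow = 3)
p
```

Any other per-subject summary (for example, a fitted slope) could be passed to `reorder()` in the same way.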

Meta-analysis is a special case of Bayesian multilevel modeling

Hello everybody! Recently, there’s been a lot of talk about meta-analysis, and here I would just like to quickly show that Bayesian multilevel modeling nicely takes care of your meta-analysis needs, and that it is easy to do in R with the rstan and brms packages. As you’ll see, meta-analysis is a special case of Bayesian multilevel modeling when you are unable or unwilling to put a prior distribution on the meta-analytic effect size estimate.

Statistical inference: Prix fixe or à la carte?

Experimental investigations commonly begin with a hypothesis, an expectation of what one might find: “We hypothesize that alcohol leads to slower reactions to events in a driving simulator.” Data is then collected and analyzed to specifically address this hypothesis. Almost always, the support for or against the hypothesis is statistical, not intraocular (Krantz, 1999). However, the prevailing statistical paradigm—null hypothesis significance testing (NHST)—never tests the researcher’s offered hypothesis, but instead the “null hypothesis”: That there is no relationship between alcohol consumption and reaction time.

Plots with subplots in R

Visualizations are great for learning from data, and for communicating the results of a statistical investigation. In this post, I illustrate how to create small multiples from data using R and ggplot2. Small multiples display the same basic plot for many different groups simultaneously. For example, a data set might consist of an X ~ Y correlation measured simultaneously in many countries; small multiples display each country’s correlation in its own panel.

Multilevel Confidence

In this post, I address the following problem: How to obtain regression lines and their associated confidence intervals at the average and individual-specific levels, in a two-level multilevel linear regression. Visualization is perhaps the most effective way of communicating the results of a statistical model. For regression models, two figures are commonly used: The coefficient plot shows the coefficients of a model graphically, and can be used to replace or augment a model summary table.

Some Short Notes on Statistics

Gerd Gigerenzer writes (in a paper 10 years ago): “Most researchers, [a prominent textbook author] argued, are not really interested in statistical thinking, but only in how to get their papers published.” The article offers an idiosyncratic, interesting and, to my mind, agreeable yet discomforting view of the historical development of the NHSTP (Null Hypothesis Significance Testing Procedure) and its perils. It’s full of interesting historical trivia, such as R.

Where are the keys to my F-16?

The average psychologist’s statistical toolkit is expanding. Multilevel (mixed effects) models are now routinely used where 10 years ago repeated measures ANOVA prevailed. Bayesian statistics are coming. Isn’t this fantastic? Well, yes and no. Here is a quote about the use of multilevel models by psycholinguists: “At a recent workshop on mixed-effects models, a prominent psycholinguist memorably quipped that encouraging psycholinguists to use linear mixed-effects models was like giving shotguns to toddlers.”

Replication Language

It is common to describe replication studies as “failed” when they don’t yield results in the same direction as the original study, or don’t have a p-value under the same threshold. Is this fair? What does a “failed replication” mean? Does it matter? The answers are no, it depends, and yes. What does it mean to fail? Failure is often asserted when a replication study doesn’t yield results consistent with the original study.