Big Data Solutions: A/B t test
@drsimonj here to share my code for running Welch’s t-test to compare group means from summary statistics alone.
I’ve just started working with A/B tests that use big data. Where once I’d whimsically run t.test(), now my data won’t fit into memory!
I’m sharing my solution here in the hope that it might help others.
As a baseline, let’s start with an in-memory case by testing whether automatic and manual cars have different Miles Per Gallon ratings on average (using the mtcars data set).
t.test(mpg ~ am, data = mtcars)
#>
#>  Welch Two Sample t-test
#>
#> data:  mpg by am
#> t = -3.7671, df = 18.332, p-value = 0.001374
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -11.280194  -3.209684
#> sample estimates:
#> mean in group 0 mean in group 1
#>        17.14737        24.39231
Well… that was easy!
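For data that won’t fit into memory, the same test can in principle be computed from three numbers per group: the mean, the variance, and the sample size. Here is a minimal sketch of that idea, not necessarily the code from the full post. The helper name `welch_from_summary` and its argument names are my own assumptions for illustration, and the summary statistics are computed from mtcars only so the result can be checked against the baseline; in a real big-data setting they would come from an aggregation query.

```r
# Hypothetical helper (name and arguments are illustrative assumptions):
# Welch's two-sample t-test from per-group summary statistics.
welch_from_summary <- function(m1, v1, n1, m2, v2, n2, conf_level = 0.95) {
  se2 <- v1 / n1 + v2 / n2             # squared standard error of the mean difference
  t   <- (m1 - m2) / sqrt(se2)         # Welch t statistic
  # Welch-Satterthwaite approximation to the degrees of freedom
  df  <- se2^2 / ((v1 / n1)^2 / (n1 - 1) + (v2 / n2)^2 / (n2 - 1))
  p   <- 2 * pt(-abs(t), df)           # two-sided p-value
  # Confidence interval for the difference in means (group 1 minus group 2)
  half_width <- qt(1 - (1 - conf_level) / 2, df) * sqrt(se2)
  list(t = t, df = df, p.value = p,
       conf.int = c((m1 - m2) - half_width, (m1 - m2) + half_width))
}

# Per-group mean, variance, and count; computed in memory here purely
# for comparison with t.test(mpg ~ am, data = mtcars).
s <- lapply(split(mtcars$mpg, mtcars$am),
            function(x) c(mean = mean(x), var = var(x), n = length(x)))

res <- welch_from_summary(s[["0"]][["mean"]], s[["0"]][["var"]], s[["0"]][["n"]],
                          s[["1"]][["mean"]], s[["1"]][["var"]], s[["1"]][["n"]])
res
```

If the sketch is right, `res` should agree with the t.test() output above (t of about -3.7671 on about 18.332 degrees of freedom). The point of the design is that only the six summary values ever need to be held in memory.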