
Interpreting A/B test results: false positives and statistical significance

Summary: This is a summary of an article originally published on the Netflix Tech Blog (netflixtechblog.com).

Subsequent posts will go into more detail on experimentation across Netflix, how Netflix has invested in infrastructure to support and scale experimentation, and the importance of the culture of experimentation within Netflix. In Part 2: What is an A/B Test?, we talked about testing the Top 10 lists on Netflix, where the primary decision metric for that test was a measure of member satisfaction with Netflix.

By convention, the false positive rate is usually set to 5%: for tests where there is no meaningful difference between treatment and control, we’ll falsely conclude that there is a “statistically significant” difference 5% of the time.
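
To make the convention concrete, here is a minimal simulation sketch in Python (my illustration, not the article's code; the normal metric distribution and the sample sizes are assumptions): when treatment and control truly share the same distribution, a t-test run at α = 0.05 rejects in roughly 5% of experiments.

```python
# A minimal sketch (not Netflix's pipeline): simulate many A/B tests where
# the null is true -- both arms draw from the same distribution -- and count
# how often a two-sample t-test at alpha = 0.05 falsely rejects.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
ALPHA = 0.05        # conventional false positive rate
N_TESTS = 10_000    # number of simulated A/B tests
N_PER_ARM = 500     # hypothetical sample size per arm

false_positives = 0
for _ in range(N_TESTS):
    # Null is true: treatment and control have identical distributions.
    control = rng.normal(loc=0.0, scale=1.0, size=N_PER_ARM)
    treatment = rng.normal(loc=0.0, scale=1.0, size=N_PER_ARM)
    _, p_value = stats.ttest_ind(control, treatment)
    if p_value < ALPHA:
        false_positives += 1  # "statistically significant" by chance alone

print(f"false positive rate: {false_positives / N_TESTS:.3f} (expected ~{ALPHA})")
```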

Say we want to know if a coin is unfair, in the sense that the probability of heads is not 0.5 (or 50%).
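
Framed as a hypothesis test (this framing is standard statistics, not a quote from the article): the null hypothesis is that the coin is fair, H0: p = 0.5, and the alternative is that it is not, H1: p ≠ 0.5, where p is the probability of heads. If we flip the coin n times, the number of heads under the null follows a Binomial(n, 0.5) distribution, which is what the rejection region described next is computed from.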

If an observation falls in the rejection region, we conclude that there is statistically significant evidence that the coin is not fair, and “reject” the null.
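
Continuing the coin example, the sketch below (again my illustration, not the article's code; n = 100 flips and the observed head count are hypothetical) computes the two-sided 5% rejection region from the Binomial(n, 0.5) null distribution, and shows the equivalent check of comparing a binomial-test p-value to α.

```python
# A minimal sketch of the coin example: under the null (fair coin, p = 0.5),
# the number of heads in n flips is Binomial(n, 0.5). The two-sided rejection
# region at the 5% level is the set of extreme head counts whose total null
# probability is at most 0.05. n and the observed count are hypothetical.
from scipy import stats

n = 100        # hypothetical number of flips
ALPHA = 0.05

# Symmetric two-sided cutoffs: reject the null if heads <= lo or heads >= hi.
lo = int(stats.binom.ppf(ALPHA / 2, n, 0.5)) - 1  # largest k with P(X <= k) < 2.5%
hi = n - lo                                       # symmetric upper cutoff
size = stats.binom.cdf(lo, n, 0.5) + stats.binom.sf(hi - 1, n, 0.5)
print(f"reject if heads <= {lo} or heads >= {hi} (actual size = {size:.3f})")

# Equivalently: compute a p-value for an observed count and compare to alpha.
observed_heads = 61  # hypothetical observation
p_value = stats.binomtest(observed_heads, n, 0.5).pvalue
verdict = "reject" if p_value < ALPHA else "fail to reject"
print(f"p-value for {observed_heads}/{n} heads: {p_value:.4f} -> {verdict} the null")
```

With these hypothetical numbers, 61 heads in 100 flips lands in the rejection region (p ≈ 0.035 < 0.05), so we would conclude there is statistically significant evidence that the coin is not fair.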
