
For growth marketers, there’s nothing more important than data. Wait, scratch that: there’s nothing more important than analyzing data properly. If you fail to do so, it’s easy to come to the wrong conclusions. Fortunately, there’s a way to take most of the guesswork out of measurement: A/B testing.

What is A/B Testing?

A/B testing, also known as split testing or bucket testing, is one of the most powerful tools in a marketer’s kit. It’s a method of comparing two different user experiences against each other to determine which one drives better results. 

First, you’ll randomly divide your target audience into a test group and a control group, then show a different experience to each group over the same time period.

If, at the end of the experiment period, you have observed a conversion rate of 23% in the control group and 35% in the test group, the designed experience is the only meaningful difference between the groups, and the gap is statistically significant, then you can conclude that your change in experience caused the improvement.
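To make the arithmetic behind those numbers concrete, here is a minimal sketch that turns raw counts into conversion rates and lift. The visitor and conversion counts are hypothetical, chosen only so that the rates match the 23% and 35% in the example above.

```python
def conversion_rate(conversions, visitors):
    """Share of visitors who converted."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def lift(control_rate, test_rate):
    """Return (absolute lift in percentage points, relative lift as a fraction)."""
    absolute_points = (test_rate - control_rate) * 100
    relative = (test_rate - control_rate) / control_rate
    return absolute_points, relative


# Hypothetical counts that reproduce the 23% and 35% rates from the example.
control = conversion_rate(230, 1000)
test = conversion_rate(350, 1000)
absolute_points, relative = lift(control, test)
print(f"{absolute_points:.1f} points absolute, {relative:.1%} relative")
# prints "12.0 points absolute, 52.2% relative"
```

Reporting both figures helps avoid confusion: a jump from 23% to 35% is 12 percentage points, but roughly a 52% relative improvement.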

The elements that typically change for test experiences include: 

  • Calls to action (CTA): There are various ways to test CTA buttons. You can test button color, size, and placement, as well as the copy within the button (something unique versus something generic, such as “Subscribe Now!”).
  • Ad copy: When it comes to copy, you can test a variety of things, including voice, tone, and length. Test product descriptions, copy that lives on your landing page, and more. 
  • Images: Test to see what types of images resonate with your audience the most — for instance, do they prefer images with people or illustrations? 
  • Email subject lines: Test different subject lines to see what makes people open your emails (or send them right to the trash). 
  • Landing pages: Test everything from product descriptions and the layout to videos and headlines. 

Why is A/B Testing Important? 

Imagine that you make a change to your website and your signup rate goes up around the same time. You may be tempted to attribute the results to the change that you made, but without the benefit of a control group, you have no way to tell whether your change caused the result or if they merely happened around the same time. The world that we live in is constantly changing — your signup rate improvement could’ve been caused by a change in your audience makeup, seasonal shifts, an unexpected press hit, or even random chance. Without a control group, you can’t confidently conclude whether your actions had the desired effect.

While A/B testing sounds simple in theory, setting up a proper A/B test can be quite challenging and is something that many people get wrong. If your tests aren’t executed properly, your results will be invalid and you will be relying on misleading data.

What is Statistical Significance? 

In a perfect data world, there would be no uncertainty. However, even A/B tests have limitations — after all, you’re measuring a sample of the infinitely many future visitors to your site, and then predicting how those visitors would behave. Any time we try to glean knowledge about a whole population or predict future behavior, there’s always going to be an element of uncertainty there. 

That’s where statistical significance comes in. In the context of A/B testing, statistical significance is a way to “quantify uncertainty.” In other words, statistical significance is “how likely it is that the difference between your experiment’s control version and test version isn’t due to error or random chance.” More precisely, a result is called statistically significant when a difference at least as large as the one you observed would be unlikely to show up by chance alone if the two versions actually performed the same. It’s an integral component of A/B testing and plays an essential part in conversion optimization and user testing.
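One common way to put a number on that uncertainty is a two-proportion z-test. The sketch below is an illustration under assumptions rather than the only valid method: the function name and the example counts are hypothetical, and the test assumes samples large enough for the normal approximation to hold. The p-value it returns is the probability of seeing a gap at least this large if the two versions truly converted at the same rate.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test for a difference in conversion rates.

    Returns (z, p_value). Group A is the control, group B is the test.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (rate_b - rate_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 230 of 1,000 control visitors vs. 350 of 1,000 test visitors.
z, p_value = two_proportion_z_test(230, 1000, 350, 1000)
print(f"z = {z:.2f}, significant at 5%: {p_value < 0.05}")
```

A small p-value (conventionally below 0.05) is what people usually mean when they say a result "reached significance". The threshold itself is a convention, so pick it before the test starts.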

Three Tips on How to Use A/B Tests Properly

Choose your control groups carefully and objectively

  • Think of the most significant customer segments that you have. Do customers from different geo regions behave differently? How about customers from various industries? Different tiers of product? Different levels of spend?
  • Make sure that your test and control groups do not overlap. 
  • Note that hindsight is 20/20 and severely biased. Don’t ever, ever choose your control group after the test has already run.
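One way to keep groups non-overlapping and fixed before the test starts is to derive each user's group from a hash of their ID. This is a minimal sketch, not a prescribed method: `assign_group`, the experiment name, and the ID format are all hypothetical.

```python
import hashlib


def assign_group(user_id, experiment, test_share=0.5):
    """Deterministically map a user to 'test' or 'control'.

    The same user and experiment always produce the same group, so a
    user can never land in both groups.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Interpret the first 8 hex digits as a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_share else "control"


# Hypothetical user IDs and experiment name.
groups = [assign_group(f"user-{i}", "cta-copy-test") for i in range(10_000)]
print(groups.count("test") / len(groups))  # should land close to 0.5
```

Because the assignment depends only on the ID and the experiment name, nobody can reshuffle the groups after seeing results. It is still worth checking that your key segments (region, industry, product tier, spend) end up roughly balanced across the two groups.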

Wait for significance

  • A good rule of thumb is to wait until at least 100 conversions occur from each group.
  • Another crucial guideline is to run a test for at least seven days due to weekly fluctuations. The entire internet has weekly and even monthly changes in activity. People browse significantly more on desktops in the middle of the week than on the weekends. People buy more things in the week after paychecks hit than in the week leading up to it.
  • As a related tip, make sure not to run a test during major seasonal events that aren’t indicative of regular customer behavior.
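The first two rules of thumb above can be written down as a simple pre-check to run before evaluating a test. This is a sketch with hypothetical names and dates; the thresholds are just the 100-conversion and seven-day guidelines from the list, not a replacement for a proper sample-size calculation.

```python
from datetime import date


def ready_to_evaluate(start, today, control_conversions, test_conversions,
                      min_conversions=100, min_days=7):
    """Return True once both rule-of-thumb thresholds are met."""
    days_run = (today - start).days
    enough_time = days_run >= min_days
    # Both groups need to reach the conversion minimum, not just one.
    enough_data = min(control_conversions, test_conversions) >= min_conversions
    return enough_time and enough_data


# Hypothetical test that started on January 1 and is checked a week later.
print(ready_to_evaluate(date(2024, 1, 1), date(2024, 1, 8), 120, 150))  # True
print(ready_to_evaluate(date(2024, 1, 1), date(2024, 1, 8), 120, 99))   # False
```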

Don’t peek — seriously 

  • The time before you reach statistical significance is the wild west. Anything can happen, and any differences you observe can’t yet be distinguished from random chance. Peeking will tempt you to call the test early, run it for longer than you originally planned, or (worst of all) tweak the test parameters halfway through. If you jump to conclusions, you’ll skew the results, and the test won’t be accurate.
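To see why peeking is risky, it can help to simulate tests where both groups get the exact same experience, so any "significant" result is a false alarm. The sketch below is an illustration under assumptions: the conversion rate, sample sizes, and number of looks are arbitrary, and the significance check is a standard pooled two-proportion z-test. It compares how often a false alarm appears at any interim look versus only at the planned final look.

```python
import math
import random


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_peeking(runs=400, visitors_per_group=1000, looks=10,
                     rate=0.2, alpha=0.05, seed=0):
    """Return (false-alarm rate with peeking, false-alarm rate at final look).

    Both groups share the same true conversion rate, so every
    significant result is a false positive.
    """
    rng = random.Random(seed)
    step = visitors_per_group // looks
    any_look_hits = 0
    final_look_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        significant_at_some_look = False
        p = 1.0
        for look in range(1, looks + 1):
            # Add one more batch of visitors to each group, then peek.
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            n = look * step
            p = p_value(conv_a, n, conv_b, n)
            if p < alpha:
                significant_at_some_look = True
        any_look_hits += significant_at_some_look
        final_look_hits += p < alpha  # p now holds the final look's value
    return any_look_hits / runs, final_look_hits / runs


peeking_rate, final_rate = simulate_peeking()
print(f"false alarms with peeking: {peeking_rate:.1%}, final look only: {final_rate:.1%}")
```

The peeking rate can never be lower than the final-look rate, because the final look is one of the looks. In practice it tends to be noticeably higher than the nominal 5%, which is the cost of stopping a test the first time the numbers look good.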

It’s All About Planning and Patience

A/B testing is an essential part of marketing and, ultimately, of growing your business. That’s why you shouldn’t just run one or two tests and call it a day; A/B testing must be a continuous part of your marketing strategy. And although you’ll likely run into many mediocre results before finding success, it’s all part of the process. With careful planning and patience, you’ll get there.

Now that we’ve covered A/B testing and statistical significance, it’s time to go over other types of testing, including multi-page funnel testing and multivariate testing. Read our post on implementation and testing for D2C brands for more.

Julie Zhou
Author

Julie Zhou is the Senior Director for Growth at AdRoll. Her team manages paid ads, SEO, lifecycle marketing, emails, partnerships, knowledge base, and the website. When not working, she enjoys deadlifting multiples of her bodyweight and chasing after her toddler.