Welcome to Part Three of our series on designing and implementing a successful marketing experiment. In our previous posts, we looked at a few strategies for designing an experiment with a well-formulated hypothesis and a way to control for natural variation between groups. Today, we are going to discuss how to identify the most salient effects of a treatment and draw valid, useful conclusions from experimental results.
Let us return to our hypothetical marketing department. As you will remember, our goal is to determine whether a 20% discount in an email is an effective way to get customers to return to our store. We began by formulating our hypothesis as a falsifiable statement that, if confirmed, can be restated directly as our conclusion. In our case:
Emailing customers a 20% discount increases the likelihood that they will make a purchase in the following week.
Imagine that we send all of our customers a 20% discount and see that many of them return to our store. Thus, we conclude that a 20% discount is an effective way to get customers to return to our store and declare the experiment a success. Our boss congratulates us and we all take the rest of the day off.
Based on the success of our experiment, our company decides to run a similar deal with the same 20% discount, only this time we include the discount as part of a Facebook promotion. Much to our surprise, however, very few customers return to our store. Despite substantial investments of time and money, our promotion seems to have gone belly up. Our boss wants to know how this could happen but we are at a loss to explain why. Was something wrong with our experiment?
Actually, the problem was not with our experiment, but rather with our conclusions. Let us examine our hypothesis again:
Emailing (1) customers a 20% (3) discount (2) increases the likelihood that they will make a purchase in the following week.
Although our hypothesis seems pretty straightforward, a closer look shows that our 20% discount email actually consists of three different variables rolled together: (1) it is an email, (2) it is a discount email, and (3) it is a 20% discount email. Our challenge, then, is to determine which of these variables (or what combination of them) actually brought our customers back to the store. To find our answer, we will need to test each of these variables independently.
We can separate the effects of our different variables by dividing our population into groups. In this case, we randomly assign the members of our customer base to one of four groups. Group I receives no email (readers who have been following our series will recognize this as our control group from Part Two), Group II receives an email with no offer, Group III receives an email with a 5% discount, and Group IV receives an email with the original 20% discount.
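For readers who like to see the mechanics, here is a minimal sketch of random assignment in Python. The customer IDs (simply the numbers 0 to 999), the function name `assign_groups`, and the fixed seed are all assumptions made for illustration, not part of our actual campaign tooling.

```python
import random

# The four groups described above.
GROUPS = [
    "I: no email (control)",
    "II: email, no offer",
    "III: email, 5% discount",
    "IV: email, 20% discount",
]

def assign_groups(customer_ids, groups=GROUPS, seed=0):
    """Shuffle the customers, then deal them round-robin into the groups.

    Shuffling first means group membership depends on chance alone;
    dealing round-robin keeps the group sizes as equal as possible.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)

    assignment = {group: [] for group in groups}
    for position, customer_id in enumerate(shuffled):
        assignment[groups[position % len(groups)]].append(customer_id)
    return assignment

# Hypothetical customer base of 1,000 IDs, for illustration only.
assignment = assign_groups(range(1000))
for group, members in assignment.items():
    print(group, len(members))
```

Shuffling before dealing is one simple way to make sure that nothing about a customer (sign-up date, purchase history, and so on) influences which email they receive.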
After defining our groups, we can compare their responses to evaluate our hypothesis. We cannot, however, take the raw differences between groups at face value. Because we sent each email to only a sample of the population, we cannot say for certain how the entire population would have responded. Using the tools of statistical analysis, we can instead estimate a range within which the population's response rate is likely to fall.
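To make the idea of a range concrete, here is a small sketch that computes a 95% confidence interval for a response rate. It uses the Wilson score interval, which is one common choice and our own pick for this illustration; the counts (300 purchases out of 10,000 recipients) are hypothetical, and the function name `wilson_interval` is ours.

```python
import math
from statistics import NormalDist

def wilson_interval(successes, n, confidence=0.95):
    """Return (low, high) bounds of the Wilson score interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    # z is the two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n  # observed response rate in the sample
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical counts, for illustration only.
low, high = wilson_interval(successes=300, n=10_000)
print(f"Observed rate: {300 / 10_000:.2%}")
print(f"95% interval:  {low:.2%} to {high:.2%}")
```

The larger the group, the narrower this range becomes, which is one reason to keep each group reasonably large.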
Suppose we have a sample of 40,000 users, divided evenly among our four groups. After sending each group the appropriate email (or not, in the case of Group I), we measure their responses.
We can imagine a few different plausible scenarios, for example:
Here we see pretty unambiguously that customers who received an email of any kind were much more likely to return to the store than those who received no email. In this case, the magnitude of the discount (or even the presence of a discount) seems to play little or no role in increasing the response rate. Without the proper controls, however, we could have easily attributed the increase in customer returns to the discount.
Another plausible scenario could look something like this:
Based on these responses we could conclude that the discounts, rather than a simple email, are what drove customers to return to our store. Even a modest discount increased the response rate somewhat, while a larger discount increased the response rate still further.
Using even these basic tools of statistical analysis, companies can draw more, and more reliable, information from their marketing experiments. This information in turn helps them reach their potential customers through more effective, targeted marketing. As we have seen, controlling for different effects by dividing customers into groups can be a powerful means of identifying the most salient effects in any marketing experiment. In our next post, we will discuss some statistical tools that can help us gauge the significance of the effects we have separated here.