4 Lessons Learned About Multivariate Testing
by Corey Oordt • Published 25 Jan 2011
Recently we ran our first major multivariate test on a page using Google Website Optimizer (GWO). Our goal was to increase reader involvement by getting readers to visit any other page on our web site. There ended up being 32 combinations of content. We wrote and open-sourced a Django application, django-gwo, to do much of the work. Here are a few lessons from that first test:
You can test the layout of dynamic pages
Typically, A/B and multivariate testing involves a static test URL and a static goal URL. We needed to test every story URL, and our goal was any other URL on our site. The tricks:
- Use the onclick attribute on links to tell GWO that the test succeeded.
- Don't include any dynamic content within the experiment sections.
- Use <div> tags around dynamic content, with an alternate <div style="display: none"> to make it disappear.
- If you want to test if a dynamic content block is better in position A or position B, put it in both. Use surrounding <div> tags to make the content appear and disappear. You can manually remove combinations where the content would appear twice.
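As a sketch of how those tricks fit together: the markup below is hypothetical (GWO generates the real experiment-section scripts for you, and the `trackConversion()` helper stands in for its conversion snippet), but it shows the div-wrapping pattern.

```html
<!-- The dynamic block appears in BOTH candidate positions, each
     wrapped in a <div> the experiment can show or hide. The dynamic
     content itself stays outside the experiment sections. -->
<div id="story-list-top">
  {% include "story_list.html" %}
</div>

<!-- ... rest of the page ... -->

<div id="story-list-bottom" style="display: none">
  {% include "story_list.html" %}
</div>

<!-- A variation then only needs static content: for example, a tiny
     style rule that flips which wrapper is visible. Combinations that
     would show the block twice get removed by hand in GWO. -->
<style>
  #story-list-top { display: none; }
  #story-list-bottom { display: block; }
</style>

<!-- Any internal link counts as the goal: the onclick handler
     (hypothetical name) reports the conversion before the browser
     follows the link. -->
<a href="/some-other-story/" onclick="trackConversion()">Read more</a>
```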
Test multiple things
GWO will show you the statistical results for each variation of each section of the experiment as if it were the only change on the page. What's interesting is that one variation, by itself, had a statistically significant negative effect on the goal, yet the top-performing combinations all used that variation. If we had tested just that one thing, we would have decided that we shouldn't do it, even though in conjunction with other variations it was much better.
It’s not that much more work to test several variations on the page. So do it and be surprised.
Let the test run, even after you get a statistical winner
We get a fair amount of traffic and we had a significant result after only a few hours. We let the experiment continue running, and by the end of the day, the former top result was at the bottom.
Your site probably has traffic patterns as well: different people from different parts of the world come to it at different times or different days. Let your test run long enough to see what those people think.
Testing leads to more testing
The winning combination led to a 13-18% improvement (depending on the day) in customer interaction. That by no means stopped any discussions; it threw them into overdrive! We want to move away from a site layout dictated by the HiPPO (the Highest Paid Person's Opinion) to a layout that actually works better. But while the results of a test tell you what worked better, they don't tell you why. So everyone has an opinion on why things worked out the way they did, and what would work even better.
So we have to do more testing to see who is correct!