Compared to what, exactly?
There are only bad options. It makes no sense to judge the risks and downsides of running overlapping experiments in isolation; they only mean something in comparison to the other options we have.
This page tries to list out all the available options. Once you understand that we must choose one of these, I hope you will agree with me that overlapping experiments are truly the least of several evils.
1. Make changes without testing them. Effectively shipping blind.
2. Make fewer changes at the same time. In fact, make only one change at any given time, and start the next test only after a decision has been made on the current one. This drastically reduces velocity and usually results in not testing every change (see above).
3. Isolate traffic during the test. This drastically reduces power (fixing that by running longer would be equivalent to option one) AND introduces a challenge when shipping two tests. If we find two winners, we can only ship ONE of them. They have never been tested together, so if we ship both, we have no idea what will happen. With two winning tests, we thus need a third test to see what the combined effect is.
4. Risk interaction effects, but test to see if we can detect any.
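To put a rough number on the power cost of isolating traffic, here is a minimal sketch. It assumes a two-sided two-proportion z-test under the normal approximation; the baseline rate, target rate, daily traffic, and the helper names `required_n_per_arm` and `days_to_finish` are all made up for illustration and do not come from any real test.

```python
from math import ceil
from statistics import NormalDist


def required_n_per_arm(p_base, p_variant, alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect p_base -> p_variant."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_variant - p_base) ** 2)


def days_to_finish(n_per_arm, daily_users, traffic_share):
    """Days until both arms are full, given the share of traffic the test gets."""
    return ceil(2 * n_per_arm / (daily_users * traffic_share))


# Hypothetical numbers: 10% baseline, hoping to detect 11%, 10,000 users/day.
n = required_n_per_arm(0.10, 0.11)

# Overlapping: every test sees all traffic.
days_overlapping = days_to_finish(n, daily_users=10_000, traffic_share=1.0)

# Isolated: two concurrent tests each get half of the traffic.
days_isolated = days_to_finish(n, daily_users=10_000, traffic_share=0.5)

print(n, days_overlapping, days_isolated)
```

Under these assumptions, splitting traffic between two tests roughly doubles the time each one needs to reach the same power, and the gap grows with every additional concurrent test.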
IMHO, all of the other options are worse than risking interaction effects, especially considering:
- Interaction effects are rare.
- Interaction effects can be detected (only if we run overlapping).
- Interaction effects can be informative (only if detected).
Running overlapping experiments is thus not just the least bad of all options in terms of tactical execution; it also carries an advantage, because testing for interaction effects brings new information.
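As a rough illustration of what "testing for interaction effects" can look like, here is a minimal sketch. It assumes two overlapping tests, A and B, each with independent 50/50 assignment, so every user lands in one of four cells. The function name `interaction_z_test` and all counts are invented for the example, and the z-test on a difference of differences is one simple approach, not the only one.

```python
from math import sqrt
from statistics import NormalDist


def interaction_z_test(cells):
    """cells maps (in_a, in_b) -> (conversions, users) for the four cells.

    Returns (interaction, z, p_value), where interaction is the effect of A
    among users who also got B minus the effect of A among users who did not.
    """
    rates = {}
    variance = 0.0
    for key, (conversions, users) in cells.items():
        p = conversions / users
        rates[key] = p
        variance += p * (1 - p) / users  # binomial variance of each cell rate

    effect_a_without_b = rates[(1, 0)] - rates[(0, 0)]
    effect_a_with_b = rates[(1, 1)] - rates[(0, 1)]
    interaction = effect_a_with_b - effect_a_without_b

    z = interaction / sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return interaction, z, p_value


# Made-up counts: A and B simply add up, so there is no interaction.
additive = {
    (0, 0): (1000, 10_000),
    (1, 0): (1100, 10_000),
    (0, 1): (1050, 10_000),
    (1, 1): (1150, 10_000),
}

# Made-up counts: each change wins alone, but together they hurt.
clashing = dict(additive)
clashing[(1, 1)] = (900, 10_000)

print(interaction_z_test(additive))
print(interaction_z_test(clashing))
```

Only the overlapping setup produces the cell where users see both changes, which is why this check is not available when traffic is isolated.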