Confidence summarizes all the checks for an experiment in the spotlight section of the results page. Monitor your experiment to make sure it is set up correctly and that data collection and variant delivery to users work as intended. Three important questions to ask when verifying an experiment are:
  • Is exposure working as intended?
  • Are the control or treatment groups biased?
  • Are users receiving the intended experience?

Sample Ratio Mismatch Check

The experiment setup defines exposure, and the treatment must not impact exposure for results to be trustworthy. To verify that exposure works as intended, the observed proportions of traffic in the treatment groups should match the expected variant allocations. The sample ratio mismatch check tests whether the observed proportion of traffic in each variant matches the expected proportion. If the test indicates a problem, you have a clear signal that there is a systematic difference across treatment groups in the probability that users log assignments. Such a systematic traffic imbalance invalidates the results, as the groups are often no longer comparable.
The analysis of an experiment relies on there being no systematic difference between the treatment groups: correct randomization is what makes it possible to attribute metric movements to the treatment. If there is any sign of a sample ratio mismatch, stop the experiment and investigate the issue.
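A sample ratio mismatch check is typically implemented as a chi-squared goodness-of-fit test on the observed assignment counts. The sketch below illustrates the idea; the function name, counts, and significance threshold are illustrative assumptions, not Confidence's actual implementation.

```python
# Minimal sketch of a sample ratio mismatch (SRM) check using a
# chi-squared goodness-of-fit test. The alpha threshold and the counts
# below are illustrative; the platform's internals may differ.
from scipy.stats import chisquare

def srm_check(observed_counts, expected_ratios, alpha=0.001):
    """Flag an SRM if observed traffic deviates from the expected split."""
    total = sum(observed_counts)
    expected_counts = [total * r for r in expected_ratios]
    _, p_value = chisquare(f_obs=observed_counts, f_exp=expected_counts)
    return p_value < alpha, p_value

# Example: a 50/50 experiment where treatment logged noticeably fewer users.
srm, p = srm_check([10000, 9400], [0.5, 0.5])
print(f"SRM detected: {srm} (p = {p:.2e})")
```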

Deterioration Checks

Confidence always checks the metrics you’ve selected for deterioration. Regardless of the test evaluation frequency you use, Confidence tests your metrics for movements in the wrong direction as often as the metric data supports. If there is evidence that a metric is moving in the wrong direction, Confidence alerts you and recommends aborting the experiment.
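Conceptually, a deterioration check is a one-sided test for movement in the harmful direction. The fixed-horizon z-test below is a simplified sketch of that idea; because Confidence evaluates metrics repeatedly, a real check needs sequential corrections that are not shown here, and all names and numbers are assumptions.

```python
# Simplified sketch of a deterioration check as a one-sided z-test.
# An always-valid check must correct for repeated looks at the data;
# this fixed-horizon version only illustrates the direction-of-harm logic.
from scipy.stats import norm

def deterioration_check(mean_treatment, mean_control, se_diff, alpha=0.01):
    """Flag a metric (where higher is better) that is significantly worse."""
    z = (mean_treatment - mean_control) / se_diff  # negative if worse
    p_value = norm.cdf(z)  # one-sided: probability of a value this low
    return p_value < alpha

# Example: treatment conversion 4.5% vs control 5.0%, SE of the diff 0.15%.
print(deterioration_check(0.045, 0.050, 0.0015))  # True: likely deterioration
```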

Stop the Experiment

Stop the experiment when it reaches its required sample size. At this point, all metrics have the intended amount of data to reach their planned power. Stop the experiment regardless of whether the results show significant improvements.
Stop your experiment when all metrics meet their required sample size and achieve power.
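The required sample size comes from a standard power calculation. As a hedged illustration, the sketch below computes the per-group sample size for a two-sided z-test; the variance and minimum detectable effect values are made-up examples, not values from Confidence.

```python
# Sketch of a standard per-group sample size calculation for a two-sided
# z-test: n = 2 * (z_{1-alpha/2} + z_{power})^2 * sigma^2 / mde^2.
# Parameter values are illustrative assumptions.
import math
from scipy.stats import norm

def required_sample_size(sigma, mde, alpha=0.05, power=0.8):
    """Per-group sample size to detect an absolute effect of `mde`."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_power = norm.ppf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 * sigma ** 2 / mde ** 2)

# Example: detect an absolute change of 0.05 in a metric with sigma = 1.0.
print(required_sample_size(sigma=1.0, mde=0.05))  # ~6280 users per group
```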

Current Powered Effect

If you have to stop your experiment before it reaches the required sample size, make sure to present the current powered effect together with the results to reflect the increased uncertainty. When the sample size is too small to power all metrics, the risk of overestimating effects is higher.
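The current powered effect is the smallest effect the experiment can detect with the intended power given the data collected so far. The sketch below inverts the standard power formula to compute it; the function name and input values are illustrative assumptions.

```python
# Sketch: the smallest effect detectable with the intended power given the
# current per-group sample size, solving the power formula for the effect:
# mde = (z_{1-alpha/2} + z_{power}) * sqrt(2 * sigma^2 / n).
# Parameter values are illustrative assumptions.
import math
from scipy.stats import norm

def current_powered_effect(sigma, n_per_group, alpha=0.05, power=0.8):
    """Smallest absolute effect detectable at the given power right now."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_power = norm.ppf(power)
    return (z_alpha + z_power) * math.sqrt(2 * sigma ** 2 / n_per_group)

# Example: with only 2,000 users per group (sigma = 1.0), effects smaller
# than ~0.089 are not yet reliably detectable.
print(current_powered_effect(sigma=1.0, n_per_group=2000))
```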