You first need to create a metric before you can use it in an experiment.
You can read more about how to create metrics on the metrics page.
Assignments
For Confidence to be able to evaluate the experiment, it needs to know what you are experimenting on. Specify an entity, and where the assignments for these entities exist.
- Click the Edit icon to the right of the Metrics section on the experiment edit page to bring up the metric configuration dialog.
- Select the entity that you want to analyze.
- Select the assignment table that has assignment logs for the experiment (an example table layout is sketched after this list).
- Optional: Select an exposure filter.
- Select how often you want to compute metrics by specifying the metric interval.
- Click Save to save the metric configuration.
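The structure of an assignment table is easiest to see with a small example. The sketch below is an assumption for illustration only (the column names user_id, variant, and assignment_time are not a required schema): each row records which entity was assigned to which experiment group, and when.

```python
import pandas as pd

# Hypothetical assignment log: one row per assignment of an entity (here, a user)
# to an experiment group, with the time the assignment happened.
assignments = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "experiment": ["podcast-ranking"] * 4,
    "variant": ["control", "treatment", "treatment", "control"],
    "assignment_time": pd.to_datetime([
        "2024-05-01 10:00",
        "2024-05-01 10:05",
        "2024-05-01 10:07",
        "2024-05-01 10:12",
    ]),
})

print(assignments)
```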
Exposure Filtering
Exposure filters narrow down which users to include in the exposure definition and in the analysis of your experiment. When you add an exposure filter to your experiment, the analysis only includes exposed users that also match the exposure filter. The time of exposure becomes the first unit of time after default exposure at which the user matches the exposure filter. Use an exposure filter if the default definition of exposure is too broad for what you want to measure in your experiment. Read more about Exposure Filtering.
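To make the exposure-filter mechanics concrete, here is a hedged sketch in pandas. The table and column names (exposures, filter_events, user_id, and so on) are assumptions for illustration, not Confidence's actual schema: the analysis keeps only exposed users that also match a filter event, and moves their time of exposure to the first match at or after default exposure.

```python
import pandas as pd

# Hypothetical default exposure log: one row per user with their first exposure time.
exposures = pd.DataFrame({
    "user_id": [1, 2, 3],
    "exposure_time": pd.to_datetime(["2024-05-01 10:00", "2024-05-01 11:00", "2024-05-01 12:00"]),
})

# Hypothetical exposure-filter events, e.g. "opened the podcast page".
filter_events = pd.DataFrame({
    "user_id": [1, 1, 3],
    "event_time": pd.to_datetime(["2024-05-01 09:00", "2024-05-01 10:30", "2024-05-02 08:00"]),
})

# Keep filter events that happen at or after default exposure ...
merged = exposures.merge(filter_events, on="user_id")
merged = merged[merged["event_time"] >= merged["exposure_time"]]

# ... and use the first matching event as the filtered time of exposure.
filtered_exposures = (
    merged.groupby("user_id", as_index=False)["event_time"]
    .min()
    .rename(columns={"event_time": "exposure_time"})
)

# Users 1 and 3 remain; user 2 never matches the filter and is excluded.
print(filtered_exposures)
```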
Success Metrics
Success metrics aim to prove the hypothesis of the A/B test. For example, if your hypothesis is that “more users stream podcasts if you rank podcasts higher,” then an appropriate success metric is one that measures podcast consumption. In this example, the hypothesis expects the metric to increase. A significant result means that there is evidence of an effect in the desired direction. A success metric tests for the following change, depending on its preferred direction (a minimal test sketch follows this list):
- Preferred Direction: Increase. Tests for a significant increase of the metric, for example an increase in hours spent listening to Spotify.
- Preferred Direction: Decrease. Tests for a significant decrease of the metric.
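Conceptually, testing for a significant increase is a one-sided comparison between treatment and control. The sketch below uses a plain two-sample t-test on made-up listening-hours data as a stand-in; it is an illustration under those assumptions, not a description of the statistical method Confidence applies.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Made-up hours listened per user; the treatment group is shifted slightly upward.
control = rng.normal(loc=10.0, scale=3.0, size=5000)
treatment = rng.normal(loc=10.2, scale=3.0, size=5000)

# One-sided test: is the treatment mean significantly greater than the control mean?
# (Preferred Direction: Increase)
result = stats.ttest_ind(treatment, control, alternative="greater")
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
# A small p-value is evidence of an effect in the desired direction.
```

For a metric whose preferred direction is a decrease, the same sketch applies with alternative="less".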
Guardrail Metrics
A guardrail metric is a metric that ensures that the experiment doesn’t have any unexpected side effects. You can use any metric as a guardrail metric. The meaning of a significant result depends on the type of guardrail metric. Use guardrail metrics in the following situations:
- When you want to ensure that your A/B test doesn’t introduce regressions in performance or product quality.
- When you want to ensure that your A/B test doesn’t have a negative impact on a metric that some other part of the organization cares about.
With Non-Inferiority Margin
Use guardrail metrics with non-inferiority margins to look for evidence that the change doesn’t negatively impact the metric by more than your specified non-inferiority margin (NIM). A significant result means that there is evidence that the guardrail is within acceptable margins. A guardrail metric with a non-inferiority margin tests for the following, depending on its non-desired direction (a minimal sketch follows this list):
- Non-Desired Direction: Increase. Tests whether there is evidence that the metric hasn’t increased by more than the NIM. For example, the number of skipped songs in personalized playlists shouldn’t increase by more than 1%.
- Non-Desired Direction: Decrease. Tests whether there is evidence that the metric hasn’t decreased by more than the NIM.
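One common way to frame a non-inferiority test is a one-sided comparison with the null hypothesis shifted by the margin. The sketch below illustrates that framing under explicit assumptions (made-up skip counts, an absolute margin, and a plain t-test); it is not a description of how Confidence computes guardrail results.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Made-up skipped-song counts per user; the non-desired direction is an increase.
control = rng.poisson(lam=5.0, size=5000).astype(float)
treatment = rng.poisson(lam=5.0, size=5000).astype(float)
nim = 0.1  # allow the mean to increase by at most 0.1 skips per user (absolute margin)

# Shift the treatment sample by the margin and test one-sided:
#   H0: mean(treatment) - mean(control) >= nim
#   H1: mean(treatment) - mean(control) <  nim   (guardrail within the margin)
result = stats.ttest_ind(treatment - nim, control, alternative="less")
print(f"p = {result.pvalue:.4f}")
# A small p-value is evidence that the metric hasn't increased by more than the margin.
```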
Without Non-Inferiority Margin
Use guardrail metrics without non-inferiority margins to look for evidence that the change has a negative impact on the metric. A significant result means that there is evidence that the guardrail deteriorates because of the change. A guardrail metric without a non-inferiority margin tests for the following, depending on its non-desired direction (see the sketch after this list):
- Non-Desired Direction: Increase. Tests whether there is evidence that the metric has increased. For example, test if the number of skipped songs in personalized playlists has increased.
- Non-Desired Direction: Decrease. Tests whether there is evidence that the metric has decreased.
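Without a margin, the same hedged framing reduces to a plain one-sided test in the non-desired direction (again with made-up data, not Confidence's actual method):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# Made-up skipped-song counts per user; the non-desired direction is an increase.
control = rng.poisson(lam=5.0, size=5000).astype(float)
treatment = rng.poisson(lam=5.2, size=5000).astype(float)

# One-sided test: has the metric increased under treatment?
result = stats.ttest_ind(treatment, control, alternative="greater")
print(f"p = {result.pvalue:.4f}")
# A small p-value is evidence that the guardrail metric has deteriorated.
```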

