It’s common for some of your users to have no measurements. For example, if you measure the time it takes to load a particular part of your app, a user who never visits that part of the app has no load time to record. In this case, the user has a missing value. This section describes how to handle missing values in Confidence.

Configure How to Treat Missing Values

In general, there are many ways to handle missing values. For example, you can discard them, replace them with a specific value, or impute them based on the values of similar users. The best approach depends on what you want to measure. If you measure latency, it’s sensible to discard the users with missing values, because a user who never loaded the page has no latency. If you measure the minutes of music played, users with missing values have played zero minutes of music, so zero is the right replacement.

Confidence lets you configure how to handle missing values for users or, more generally, entities. By default, Confidence replaces missing values with zero if you use the SUM, COUNT, or COUNT_DISTINCT aggregations. For all other aggregations, Confidence discards users with missing values by default. You can override this behavior in the Missing Values section of a metric.

Figure: Configuring Missing Values
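The choice between discarding and zero-filling changes the metric value, which a small sketch can make concrete. The user IDs, the values, and the helper names `discard_missing` and `zero_fill` below are hypothetical and only illustrate the two strategies; they are not part of Confidence.

```python
from statistics import mean

# Hypothetical per-user measurements; None marks a user with no measurement.
minutes_played = {"u1": 30.0, "u2": None, "u3": 90.0, "u4": None}

def discard_missing(values):
    """Keep only the users that have a measurement."""
    return [v for v in values if v is not None]

def zero_fill(values):
    """Treat a missing measurement as zero."""
    return [0.0 if v is None else v for v in values]

values = list(minutes_played.values())
mean_discard = mean(discard_missing(values))  # averages over the 2 users with data
mean_zero = mean(zero_fill(values))           # averages over all 4 users

print(mean_discard, mean_zero)
```

Discarding answers "how much did users with a measurement play on average", while zero-filling answers "how much did all users play on average". Which question is the right one depends on the metric.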
Discarding users with missing values can lead to sample ratio mismatch for a specific metric. For example, if the treatment increases the chance that a user visits the part of the app that you want to measure, then the treatment group has fewer users with missing values than the control group. Because of the sample ratio mismatch, there’s a risk that the two groups are no longer comparable. In this situation, a significant difference that you detect can be driven by a bias in which kinds of users contribute to each group’s metric value, rather than by a change in the metric itself.
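To see how discarding can unbalance the groups, consider the following minimal sketch. The data is made up: both groups are assigned the same number of users, but more treatment users visit the measured page. The helper `observed` is a hypothetical name.

```python
# Hypothetical data: one entry per assigned user, None = never visited the page.
control = [120.0, None, None, 95.0, None, 110.0, None, None]
treatment = [130.0, 105.0, None, 98.0, 115.0, None, 125.0, 101.0]

def observed(values):
    """Users that remain in the metric after discarding missing values."""
    return [v for v in values if v is not None]

# Ratio of group sizes as assigned by the experiment.
assigned_ratio = len(treatment) / len(control)

# Ratio of group sizes that actually enter the metric after discarding.
analyzed_ratio = len(observed(treatment)) / len(observed(control))

print(assigned_ratio, analyzed_ratio)
```

The groups were assigned at a 1:1 ratio, but the metric compares six treatment users with three control users. The treatment group now includes users who would not have visited the page without the treatment, so the two sets of analyzed users may differ systematically.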
If you want further control over how to handle missing values, you can replace them directly in the SQL query for the fact table. For example, if you want to replace missing values with 1 instead of zero, you can define the measurement column as IFNULL(measurement, 1) instead of passing measurement to Confidence unchanged.
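The effect of IFNULL(measurement, 1) can be tried locally. The sketch below uses an in-memory SQLite database only because it ships with Python. The table name `raw_events` and its rows are hypothetical, and your data warehouse's SQL dialect may spell the function differently (for example, COALESCE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table; u2 has no measurement (NULL).
conn.execute("CREATE TABLE raw_events (user_id TEXT, measurement REAL)")
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?)",
    [("u1", 4.0), ("u2", None), ("u3", 7.5)],
)

# Fact-table style query: replace missing measurements with 1.
rows = conn.execute(
    """
    SELECT user_id, IFNULL(measurement, 1) AS measurement
    FROM raw_events
    ORDER BY user_id
    """
).fetchall()
conn.close()

for user_id, measurement in rows:
    print(user_id, measurement)
```

Because the replacement happens in the query, the measurement column no longer contains missing values for these rows by the time it reaches Confidence.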

Missing Values and Variance Reduction

Confidence estimates the parameters required for variance reduction on the subset of users that have both a pre-exposure and a post-exposure measurement. If this subset is only a small share of all users, Confidence disables variance reduction.
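The following sketch illustrates the idea of estimating on complete pairs only. It rests on assumptions that this page does not state: it uses a CUPED-style coefficient (covariance of pre and post divided by the variance of pre) as a stand-in for the variance reduction parameters, and the cutoff `MIN_COMPLETE_SHARE` is a made-up value, not the threshold Confidence applies.

```python
from statistics import mean

# Hypothetical (pre_exposure, post_exposure) pairs; None marks a missing value.
users = [
    (10.0, 12.0),
    (None, 8.0),
    (6.0, 7.0),
    (14.0, None),
    (8.0, 11.0),
    (None, None),
]

MIN_COMPLETE_SHARE = 0.3  # hypothetical cutoff, not Confidence's actual value

# Only users with both measurements contribute to the estimate.
complete = [(pre, post) for pre, post in users
            if pre is not None and post is not None]
complete_share = len(complete) / len(users)

def estimate_theta(pairs):
    """CUPED-style coefficient: cov(pre, post) / var(pre) (an assumption)."""
    pre_mean = mean(pre for pre, _ in pairs)
    post_mean = mean(post for _, post in pairs)
    cov = sum((pre - pre_mean) * (post - post_mean) for pre, post in pairs)
    var = sum((pre - pre_mean) ** 2 for pre, _ in pairs)
    return cov / var

# Skip variance reduction when too few users have both measurements.
theta = estimate_theta(complete) if complete_share >= MIN_COMPLETE_SHARE else None

print(complete_share, theta)
```

With few complete pairs the estimate becomes noisy, which is one plausible reason to turn the adjustment off rather than apply it.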

Missing Values and Ratio Metrics

Confidence discards all rows in the fact table that have a missing value for the numerator or denominator to ensure consistency. If you don’t want this behavior, then you can replace the missing values directly in the SQL query for the fact table.
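As a rough illustration of the row filtering, the sketch below drops every row that lacks a numerator or a denominator before computing a ratio. The rows are hypothetical, and computing the ratio as the sum of numerators divided by the sum of denominators is an assumption made for the example, not a statement of how Confidence aggregates.

```python
# Hypothetical fact-table rows: (user_id, numerator, denominator).
rows = [
    ("u1", 3.0, 10.0),
    ("u2", None, 5.0),   # missing numerator: row is discarded
    ("u3", 2.0, None),   # missing denominator: row is discarded
    ("u4", 1.0, 4.0),
]

# Keep only rows where both parts of the ratio are present.
kept = [row for row in rows if row[1] is not None and row[2] is not None]

# Assumed ratio definition for this sketch: sum(numerator) / sum(denominator).
ratio = sum(num for _, num, _ in kept) / sum(den for _, _, den in kept)

print([user_id for user_id, _, _ in kept], ratio)
```

Note that the denominator of u2 is dropped along with its missing numerator, so the numerator and denominator always cover the same rows.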