Missing Values

It’s common for some of your users to have no measurements. For example, if you measure the time it takes to load a particular part of your app, a user who never visits that part has no load time to record. In this case, the user has a missing value. This section describes how to handle missing values in Confidence.

Configure How to Treat Missing Values

There are many ways to handle missing values. For example, you can discard them, replace them with a specific value, or impute them based on the values of similar users. The best approach depends on what you want to measure. If you measure latency, it’s sensible to discard the users with missing values. If you measure the minutes of music played, then users with missing values have played zero minutes of music, so replacing their missing values with zero is the natural choice.

Confidence lets you configure how to handle missing values for users or, more generally, entities. By default, Confidence replaces missing values with zero if you use the SUM, COUNT, or COUNT_DISTINCT aggregations. For all other aggregations, Confidence discards users with missing values by default. You can override this behavior in the Missing Values section of a metric.
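
To make the difference between the two defaults concrete, here is a minimal Python sketch. It is not Confidence code: the function names, the policy labels "zero" and "discard", and the sample values are assumptions made for illustration. Only the three zero-filled aggregation names come from the text above.

```python
from typing import Optional

# Aggregations for which the text says missing values default to zero.
ZERO_FILL_AGGREGATIONS = {"SUM", "COUNT", "COUNT_DISTINCT"}


def default_missing_value_policy(aggregation: str) -> str:
    """Return the default described above: 'zero' or 'discard' (hypothetical labels)."""
    return "zero" if aggregation.upper() in ZERO_FILL_AGGREGATIONS else "discard"


def apply_policy(values: list[Optional[float]], policy: str) -> list[float]:
    """Apply a policy to per-user values, where None marks a missing value."""
    if policy == "zero":
        return [0.0 if v is None else v for v in values]
    if policy == "discard":
        return [v for v in values if v is not None]
    raise ValueError(f"unknown policy: {policy}")


def mean(values: list[float]) -> float:
    return sum(values) / len(values)


# Hypothetical per-user values; the third user has no measurement.
per_user = [10.0, 20.0, None, 30.0]

mean_zero_filled = mean(apply_policy(per_user, "zero"))    # 60 spread over 4 users
mean_discarded = mean(apply_policy(per_user, "discard"))   # 60 spread over 3 users
```

With zero-filling the missing user still counts toward the denominator, while discarding removes that user entirely, so the same data yields two different per-user averages.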

Discarding users with missing values can lead to a sample ratio mismatch for a specific metric. For example, if the treatment increases the chance that a user visits the part of the app that you want to measure, then the treatment group has fewer users with missing values than the control group. Because of the sample ratio mismatch, there’s a risk that the two groups are no longer comparable. In this situation, a significant difference that you detect can be driven by a bias in which kinds of users each group’s metric value includes, rather than by the treatment’s effect on the metric itself.
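
The following Python sketch illustrates the mechanism with made-up numbers. It uses a generic chi-square goodness-of-fit statistic. The text does not say which test Confidence applies, so treat the function name, the counts, and the significance level as assumptions.

```python
def srm_chi_square(observed: list[int], expected_shares: list[float]) -> float:
    """Chi-square goodness-of-fit statistic for group sizes versus expected shares."""
    total = sum(observed)
    return sum(
        (obs - total * share) ** 2 / (total * share)
        for obs, share in zip(observed, expected_shares)
    )


# Hypothetical experiment: 10,000 users assigned to each group (a 50/50 split).
assigned = [10_000, 10_000]

# Hypothetical counts of users who have a measurement for this metric.
# The treatment nudges more users into the measured part of the app.
with_measurement = [4_000, 4_600]  # [control, treatment]

chi_square_assigned = srm_chi_square(assigned, [0.5, 0.5])
chi_square_after_discard = srm_chi_square(with_measurement, [0.5, 0.5])

# Critical value of the chi-square distribution with 1 degree of freedom
# at a significance level of 0.001 (the level is an assumption of this sketch).
CRITICAL_VALUE = 10.828
srm_detected = chi_square_after_discard > CRITICAL_VALUE
```

The assignment itself is perfectly balanced, but once users with missing values are discarded, the remaining 4,000 and 4,600 users deviate far more from a 50/50 split than chance would explain.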

If you want further control over how to handle missing values, you can replace them directly in the SQL query for the fact table. For example, if you want to replace missing values with 1 instead of zero, you can define the measurement column as IFNULL(measurement, 1) instead of passing measurement to Confidence unchanged.
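
As a rough illustration of that replacement, the Python sketch below runs an IFNULL query against an in-memory SQLite database. SQLite stands in for your data warehouse here, and the table name raw_events, the column user_id, and the sample rows are hypothetical. If your warehouse does not support IFNULL, the standard SQL function COALESCE(measurement, 1) expresses the same thing.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (user_id TEXT, measurement REAL)")
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?)",
    [("u1", 3.0), ("u2", None), ("u3", 5.0)],  # u2 has a missing value
)

# Fact table query: replace missing measurements with 1 before the
# column reaches the metric definition.
fact_table_query = """
    SELECT user_id, IFNULL(measurement, 1) AS measurement
    FROM raw_events
    ORDER BY user_id
"""
rows = conn.execute(fact_table_query).fetchall()
conn.close()
```

After this query, no row has a missing measurement, so the metric-level missing value setting has nothing left to replace or discard for this column.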

Missing Values and Variance Reduction

Confidence estimates the parameters required for variance reduction using only the subset of users that have both a pre-exposure and a post-exposure measurement. If this subset is a small share of all users, Confidence disables variance reduction.
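
The sketch below shows one way such an estimate could work. It assumes a regression-style adjustment coefficient (covariance of pre- and post-exposure values divided by the variance of the pre-exposure values) and a hypothetical minimum share of complete pairs. The text does not specify the parameters Confidence estimates or the threshold it uses, so every name and number here is an assumption.

```python
from typing import Optional

# Hypothetical threshold: minimum share of users with both measurements.
MIN_COMPLETE_SHARE = 0.5


def estimate_theta(
    pre: list[Optional[float]],
    post: list[Optional[float]],
    min_share: float = MIN_COMPLETE_SHARE,
) -> Optional[float]:
    """Estimate cov(pre, post) / var(pre) on users with both values.

    Returns None, meaning "variance reduction disabled", when too few
    users have both a pre-exposure and a post-exposure measurement.
    """
    pairs = [(x, y) for x, y in zip(pre, post) if x is not None and y is not None]
    if len(pairs) < 2 or len(pairs) / len(pre) < min_share:
        return None
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    variance = sum((x - mean_x) ** 2 for x, _ in pairs)
    if variance == 0:
        return None
    return covariance / variance


# Hypothetical data: three of five users have both measurements.
theta = estimate_theta(
    pre=[1.0, 2.0, None, 4.0, 3.0],
    post=[2.0, 4.0, 5.0, None, 6.0],
)

# Hypothetical data: only one of four users has both measurements.
theta_sparse = estimate_theta(
    pre=[1.0, None, None, None],
    post=[2.0, 3.0, None, 1.0],
)
```

In the first call the complete pairs are (1, 2), (2, 4), and (3, 6), so the coefficient is estimated from those three users only. In the second call the share of complete pairs falls below the threshold and the function returns None.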

Missing Values and Ratio Metrics

To ensure consistency, Confidence discards all rows in the fact table that have a missing value for the numerator or the denominator. If you don’t want this behavior, you can replace the missing values directly in the SQL query for the fact table.
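
The effect of this rule can be sketched in Python as follows. The row layout, the sample values, and the choice to compute the metric as a ratio of sums are assumptions for illustration, not a description of how Confidence computes ratio metrics internally.

```python
from typing import Optional

# Hypothetical fact table rows: (user_id, numerator, denominator).
Row = tuple[str, Optional[float], Optional[float]]

rows: list[Row] = [
    ("u1", 2.0, 4.0),
    ("u2", None, 5.0),   # missing numerator
    ("u3", 3.0, None),   # missing denominator
    ("u4", 1.0, 6.0),
]


def ratio_of_sums(table: list[tuple[str, float, float]]) -> float:
    """Sum of numerators divided by sum of denominators (an assumed definition)."""
    return sum(num for _, num, _ in table) / sum(den for _, _, den in table)


# Default described above: drop every row with a missing numerator or denominator.
complete_rows = [
    (user, num, den) for user, num, den in rows if num is not None and den is not None
]
ratio_discarded = ratio_of_sums(complete_rows)

# Alternative: replace missing values with zero first, as you could do
# in the fact table SQL query.
zero_filled_rows = [
    (user, 0.0 if num is None else num, 0.0 if den is None else den)
    for user, num, den in rows
]
ratio_zero_filled = ratio_of_sums(zero_filled_rows)
```

Discarding keeps only u1 and u4, so the numerator and denominator are always computed over the same rows. Zero-filling keeps all four rows and produces a different ratio, which is why the choice should follow from what the metric is meant to measure.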

Related Resources

Fact Tables: Configure fact table SQL queries

Variance Reduction: Improve metric precision

Monitoring: Detect sample ratio mismatches

Metrics Reference: Configure metric aggregations
