When to Use Entity Relation Tables
Entity relation tables serve a specific use case: connecting anonymous users to authenticated users when:
- You randomize on an anonymous entity (like Visitor)
- Users later authenticate and become a different entity (like User)
- You want to measure authenticated user metrics in experiments that started before authentication
Entity relation tables are not for:
- Creating general mappings between entities (like users to orders)
- Reusing metrics across different randomization units
- One-to-many relationships for analysis purposes
Example Use Case
If you randomize on a Visitor entity for unauthenticated users and have a User entity for authenticated users, you can create a relation table between Visitor and User. Confidence can then use this table to calculate results for User metrics, even if the experiment randomizes on the Visitor entity. This makes it possible to experiment on the sign-up process and measure how users behave after they convert to customers.
Create the Entity Relation Table
Open the entity that should own the relation (the entity you randomize on). In the Entity relation tables section, click Create. Enter a SQL query that outputs two columns specifying the mapping between the entities. Select the columns and the target entity, then click Create to create the table.
Confidence doesn’t clean the data coming from this table, so the data must be of high quality to ensure trustworthy experiment results. You can do any required data cleaning either before the data reaches the relation table or inline in the table definition, since the definition can be any SQL query.
Below is a short description of possible error cases and how they affect the results, using the Visitor to User case as an example:
- No mapping exists for a visitor ID: For metrics with padding enabled, the user is included in the calculation of the metrics for the experiment, but gets 0 as the metric value. Otherwise, the user is excluded.
- Multiple visitor IDs map to the same user: The user would be included once, with first exposure set to the earliest assignment among those visitor IDs.
- One visitor ID maps to multiple users: All users mapped to the visitor ID would be considered exposed to the experiment.
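The three cases above can be illustrated with a small simulation. This is only a sketch of the described behavior, not Confidence's implementation: the data, the function names (`exposed_users`, `unmapped_visitors`), and the in-memory representation are all hypothetical.

```python
from datetime import datetime

# Hypothetical data: assignment time per visitor ID, and the rows of a
# Visitor -> User relation table. None of these names come from Confidence.
assignments = {
    "v1": datetime(2024, 1, 1, 9, 0),
    "v2": datetime(2024, 1, 2, 9, 0),
    "v3": datetime(2024, 1, 3, 9, 0),
    "v4": datetime(2024, 1, 4, 9, 0),
}
relation_rows = [
    ("v1", "u1"),  # two visitor IDs map to the same user
    ("v2", "u1"),
    ("v3", "u2"),  # one visitor ID maps to two users
    ("v3", "u3"),
    # "v4" has no row: no mapping exists for this visitor ID
]


def exposed_users(assignments, relation_rows):
    """Return {user_id: first_exposure}, using the earliest mapped assignment."""
    first_exposure = {}
    for visitor_id, user_id in relation_rows:
        assigned_at = assignments.get(visitor_id)
        if assigned_at is None:
            continue  # this visitor was never assigned in the experiment
        current = first_exposure.get(user_id)
        if current is None or assigned_at < current:
            first_exposure[user_id] = assigned_at
    return first_exposure


def unmapped_visitors(assignments, relation_rows):
    """Return assigned visitor IDs that have no row in the relation table."""
    mapped = {visitor_id for visitor_id, _ in relation_rows}
    return sorted(set(assignments) - mapped)


users = exposed_users(assignments, relation_rows)
missing = unmapped_visitors(assignments, relation_rows)
```

In this sketch, `u1` appears once with the assignment time of `v1` (the earlier of its two visitor IDs), both `u2` and `u3` count as exposed through `v3`, and `v4` ends up in `missing`, where padding would decide whether it contributes a 0 or is excluded.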
Create the Experiment
With the entity relation table in place, you can create the experiment. Create an experiment and choose the entity that owns the relation as the entity to randomize on. When you select metrics, the metric picker now shows metrics from both entities, and you can configure them as in any other experiment.
Exposure Time Considerations
Confidence considers the entity exposed at the time the assignment happens for the randomization entity. In the Visitor to User example, if a user visits the site and resolves a flag, Confidence sets the exposure time to the moment the flag was resolved and a variant assigned. Any User metrics with time windows or exposure offsets are calculated relative to that time, so if a long time passes between the visit and the sign-up, the metric might not measure what you intend. In such cases, consider adding an extra exposure offset to the metric to account for the delay.
Common Pitfalls
Wrong Use: General Entity Maps
Problem: trying to use entity relation tables to map users to orders for metric reuse across experiments.
Why it doesn’t work: entity relation tables affect how Confidence calculates statistics. Using them for general mappings produces wrong variance calculations because the system assumes you’re tracking the same entity transitioning between states, not separate entities with one-to-many relationships.
Solution: use ratio metrics for order-level analysis with user randomization. Create separate metrics for different randomization units. See Ratio Metrics for details.
Wrong Use: Metric Reuse Across Different Experiments
Problem: attempting to use entity relation tables to share metrics between experiments with different randomization units (for example, email-based users versus street addresses).
Why it doesn’t work: each experiment needs metrics that match its randomization unit to ensure accurate statistical analysis.
Solution: create separate metrics for each randomization unit. While this requires duplicate metric definitions, it ensures accurate results.
Detailed Example
Scenario: an e-commerce site tests sign-up flow improvements, measuring both pre-sign-up browsing behavior and post-sign-up buying behavior.
Implementation: create an entity relation table mapping Visitor IDs to User IDs when users create accounts. This allows you to:
- Randomize on Visitor entities for all site visitors
- Measure User metrics like buying rate and order value
- Correctly attribute post-authentication behavior to the original experiment exposure
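As a rough illustration of the mapping query for this scenario, the sketch below runs a two-column query with inline cleaning against an in-memory SQLite database. The `signup_events` table, its columns, and the sample rows are assumptions made up for the example, and the SQL dialect of your own data warehouse may differ.

```python
import sqlite3

# Hypothetical source table: one row per sign-up event, recording which
# visitor ID created which user account. All names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE signup_events (visitor_id TEXT, user_id TEXT, signed_up_at TEXT);
    INSERT INTO signup_events VALUES
        ('v1', 'u1', '2024-01-05T10:00:00'),
        ('v1', 'u1', '2024-01-05T10:00:00'),  -- duplicate event
        ('v2', NULL, '2024-01-06T11:00:00'),  -- sign-up never completed
        (NULL, 'u3', '2024-01-07T12:00:00'),  -- visitor ID missing
        ('v4', 'u4', '2024-01-08T13:00:00');
    """
)

# The relation table query outputs exactly two columns (the mapping) and
# cleans the data inline: it drops incomplete rows and removes duplicates.
RELATION_QUERY = """
SELECT DISTINCT visitor_id, user_id
FROM signup_events
WHERE visitor_id IS NOT NULL
  AND user_id IS NOT NULL
ORDER BY visitor_id
"""

rows = conn.execute(RELATION_QUERY).fetchall()
conn.close()
```

Here only the pairs for `v1` and `v4` survive the cleaning. Filtering and deduplicating inline is one option; the same cleaning could equally happen upstream before the data reaches the relation table.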

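The exposure-time behavior described under Exposure Time Considerations matters in this scenario as well, because User metrics are measured from the Visitor's assignment time. The sketch below shows how an extra exposure offset can shift a metric window past the sign-up delay. The helper names and the half-open window convention are assumptions for illustration, not Confidence's actual definitions.

```python
from datetime import datetime, timedelta


def metric_window(exposure_time, window, exposure_offset=timedelta(0)):
    """Return (start, end) of a metric window measured from exposure."""
    start = exposure_time + exposure_offset
    return start, start + window


def in_window(event_time, start, end):
    """Check membership in the half-open interval [start, end)."""
    return start <= event_time < end


exposure = datetime(2024, 1, 1, 9, 0)   # visitor resolves the flag
signup = datetime(2024, 1, 6, 9, 0)     # same person signs up 5 days later
purchase = datetime(2024, 1, 10, 9, 0)  # first purchase as a User

# A 7-day window starting at exposure ends on Jan 8, before the purchase.
start, end = metric_window(exposure, timedelta(days=7))
missed = not in_window(purchase, start, end)

# Adding a 5-day exposure offset moves the window to Jan 6 through Jan 13,
# which covers the week after sign-up and includes the purchase.
start2, end2 = metric_window(exposure, timedelta(days=7), timedelta(days=5))
counted = in_window(purchase, start2, end2)
```

A fixed offset only approximates the sign-up delay, since each visitor signs up at a different time, so the right value depends on how long conversion typically takes in your product.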
