This section provides technical specifications and reference information for analysis plans and statistical testing.
For conceptual explanations of analysis, see Stats Concepts.

Comparison Specifications

Define how to compare groups in an analysis:

All to Baseline

Compare all treatment groups to a designated control:
{
  "comparisonSpec": {
    "allToBaseline": {
      "baseline": "control"
    }
  }
}
Use when: standard A/B test with one control and multiple treatments

All Pairs

Compare every group to every other group:
{
  "comparisonSpec": {
    "allPairs": {}
  }
}
Use when: exploring all possible differences, no clear control group

Specific Pairs

Define exactly which groups to compare:
{
  "comparisonSpec": {
    "pairs": [
      {
        "baseline": "control",
        "treatment": "variant_a"
      },
      {
        "baseline": "control",
        "treatment": "variant_b"
      }
    ]
  }
}
Use when: complex designs with specific comparisons of interest
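
All three forms reduce to a set of (baseline, treatment) pairs. The following Python sketch shows that expansion for illustration only; the function name and the group list are assumptions, not part of any API:
from itertools import combinations

def expand_comparisons(spec, group_ids):
    """Expand a comparisonSpec dict into explicit (baseline, treatment) pairs."""
    if "allToBaseline" in spec:
        baseline = spec["allToBaseline"]["baseline"]
        return [(baseline, g) for g in group_ids if g != baseline]
    if "allPairs" in spec:
        return list(combinations(group_ids, 2))  # every unordered pair of groups
    if "pairs" in spec:
        return [(p["baseline"], p["treatment"]) for p in spec["pairs"]]
    raise ValueError("unrecognized comparison spec")

# One control and two treatments under allToBaseline
print(expand_comparisons({"allToBaseline": {"baseline": "control"}},
                         ["control", "variant_a", "variant_b"]))
# [('control', 'variant_a'), ('control', 'variant_b')]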

Hypothesis Types

Superiority Hypothesis

Test if a treatment improves a metric by a meaningful amount:
{
  "superiority": {
    "preferredDirection": "INCREASE",
    "minimumDetectableEffect": 0.03
  }
}
Fields:
  • preferredDirection: INCREASE or DECREASE
  • minimumDetectableEffect: Relative change considered meaningful (for example, 0.03 = 3%)
Use for: success metrics, primary outcomes
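
To make the relative MDE concrete: with a baseline conversion rate of 10% and a minimumDetectableEffect of 0.03, the smallest change treated as meaningful is a move to 10.3% (for INCREASE). A minimal Python illustration (the helper is hypothetical, not a product API):
def mde_target(baseline_value, minimum_detectable_effect, preferred_direction):
    """Smallest treatment value that counts as a meaningful change."""
    if preferred_direction == "INCREASE":
        return baseline_value * (1 + minimum_detectable_effect)
    return baseline_value * (1 - minimum_detectable_effect)

print(mde_target(0.10, 0.03, "INCREASE"))  # about 0.103: detect a move to at least 10.3%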

Non-Inferiority Hypothesis

Test if a treatment doesn’t harm a metric beyond an acceptable margin:
{
  "nonInferiority": {
    "preferredDirection": "INCREASE",
    "nonInferiorityMargin": 0.01
  }
}
Fields:
  • preferredDirection: INCREASE or DECREASE
  • nonInferiorityMargin: Maximum acceptable degradation (for example, 0.01 = 1%)
Use for: guardrail metrics, cost metrics, performance metrics
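
One common way to operationalize non-inferiority (assumed here for illustration; the product's exact procedure may differ) is to check the confidence interval for the relative lift against the margin: for an INCREASE metric, the treatment passes if the lower bound stays above -nonInferiorityMargin.
def is_non_inferior(lift_ci_low, lift_ci_high, margin, preferred_direction):
    """Check a relative-lift confidence interval against a non-inferiority margin."""
    if preferred_direction == "INCREASE":
        return lift_ci_low > -margin   # worst case must not degrade by more than the margin
    return lift_ci_high < margin       # for DECREASE metrics, degradation is an increase

print(is_non_inferior(-0.004, 0.012, 0.01, "INCREASE"))  # True: worst case -0.4% is above -1%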

Preferred Direction

Value | Meaning | Example Metrics
INCREASE | Higher is better | Revenue, conversion rate, engagement
DECREASE | Lower is better | Load time, error rate, bounce rate

Decision Rules

Combine multiple hypotheses into a single decision:

AND Rule

All hypotheses must be significant:
{
  "operator": "AND",
  "items": ["metric1", "metric2", "metric3"]
}

OR Rule

At least one hypothesis must be significant:
{
  "operator": "OR",
  "items": ["metric1", "metric2", "metric3"]
}

Complex Rule

Combine AND/OR logic:
{
  "operator": "AND",
  "items": [
    {
      "rule": {
        "operator": "AND",
        "items": ["guardrail1", "guardrail2"]
      }
    },
    {
      "rule": {
        "operator": "OR",
        "items": ["success1", "success2", "success3"]
      }
    }
  ]
}
Translates to: (guardrail1 AND guardrail2) AND (success1 OR success2 OR success3)
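
Conceptually, a decision rule is a boolean expression over per-hypothesis outcomes. The sketch below evaluates one recursively; it assumes string items name hypotheses and that results maps each name to whether it was significant in its preferred direction (an illustration, not the platform's evaluator).
def evaluate_rule(rule, results):
    """Recursively evaluate an AND/OR decision rule against per-hypothesis outcomes."""
    values = []
    for item in rule["items"]:
        if isinstance(item, str):
            values.append(results[item])                          # leaf: a named hypothesis
        else:
            values.append(evaluate_rule(item["rule"], results))   # nested rule
    return all(values) if rule["operator"] == "AND" else any(values)

rule = {
    "operator": "AND",
    "items": [
        {"rule": {"operator": "AND", "items": ["guardrail1", "guardrail2"]}},
        {"rule": {"operator": "OR", "items": ["success1", "success2", "success3"]}},
    ],
}
outcomes = {"guardrail1": True, "guardrail2": True,
            "success1": False, "success2": True, "success3": False}
print(evaluate_rule(rule, outcomes))  # True: both guardrails pass and one success metric fires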

Group Structure

Define groups with allocation weights:
{
  "groups": [
    {
      "id": "control",
      "weight": 1
    },
    {
      "id": "treatment",
      "weight": 1
    }
  ]
}
Fields:
  • id: Unique identifier for the group
  • weight: Relative allocation; traffic is split across groups in proportion to their weights
Common patterns:
  • Equal split: All weights = 1
  • 50/25/25: Weights = 2, 1, 1
  • 90/10: Weights = 9, 1
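
Because weights are relative, a group's traffic fraction is its weight divided by the sum of all weights, as in this small illustration:
def allocation(groups):
    """Convert relative weights into traffic fractions."""
    total = sum(g["weight"] for g in groups)
    return {g["id"]: g["weight"] / total for g in groups}

print(allocation([{"id": "control", "weight": 2},
                  {"id": "variant_a", "weight": 1},
                  {"id": "variant_b", "weight": 1}]))
# {'control': 0.5, 'variant_a': 0.25, 'variant_b': 0.25}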

Statistical Parameters

Significance Level (Alpha)

Probability of a false positive:
"alpha": 0.05  // 5% false positive rate
Common values:
  • 0.05: Standard significance level
  • 0.01: Stricter threshold
  • 0.10: More lenient threshold

Statistical Power

Probability of detecting a true effect:
"power": 0.80  // 80% power
Common values:
  • 0.80: Standard power level
  • 0.90: Higher power (larger sample needed)
  • 0.70: Lower power (smaller sample sufficient, but more likely to miss a real effect)
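
Alpha, power, and the minimum detectable effect together determine the required sample size. The sketch below uses the standard two-proportion approximation for a binary metric; it is illustrative only and may differ from the product's power analysis.
from scipy.stats import norm

def sample_size_per_group(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = norm.ppf(1 - alpha / 2)     # critical value for the significance level
    z_beta = norm.ppf(power)              # critical value for the target power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return int((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2) + 1

print(sample_size_per_group(0.10, 0.03))  # roughly 160,000 per group for a 3% lift on a 10% baseline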

Data Types

Binary Data

For conversion-like metrics:
{
  "binaryData": {
    "successes": [100, 110],
    "trials": [1000, 1000]
  }
}
Use for: conversion rates, click-through rates, success/failure outcomes
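
These summary counts are enough for a pooled two-proportion z-test. A fixed-horizon, two-sided sketch of the underlying statistics (not necessarily the product's exact method):
from math import sqrt
from scipy.stats import norm

def two_proportion_ztest(successes, trials):
    """Pooled two-sided z-test from per-group success/trial counts."""
    (s1, s2), (n1, n2) = successes, trials
    p1, p2 = s1 / n1, s2 / n2
    pooled = (s1 + s2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    return z, 2 * norm.sf(abs(z))

z, p_value = two_proportion_ztest(successes=[100, 110], trials=[1000, 1000])
print(round(z, 2), round(p_value, 2))  # about 0.73 and 0.47: not significant at alpha = 0.05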

Continuous Data

For numeric measurements:
{
  "continuousData": {
    "means": [42.5, 43.2],
    "variances": [12.3, 11.8],
    "counts": [1000, 1000]
  }
}
Use for: revenue, duration, ratings, counts
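
Because the payload carries summary statistics rather than raw observations, a test such as Welch's t-test can be run directly from them. A sketch using SciPy (illustrative; not necessarily the product's method):
from math import sqrt
from scipy.stats import ttest_ind_from_stats

means, variances, counts = [42.5, 43.2], [12.3, 11.8], [1000, 1000]

# Welch's t-test (unequal variances) from summary statistics; note SciPy expects
# standard deviations, so the variances are square-rooted first.
result = ttest_ind_from_stats(
    mean1=means[0], std1=sqrt(variances[0]), nobs1=counts[0],
    mean2=means[1], std2=sqrt(variances[1]), nobs2=counts[1],
    equal_var=False,
)
print(result.statistic, result.pvalue)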

Analysis Methods

Different methods have different assumptions and use cases:
Method | Sequential | Data Type | Use Case
Fixed horizon | No | Both | Final analysis only
Sequential | Yes | Both | Continuous monitoring
Bayesian | Yes | Both | Continuous updates with prior knowledge

Method Assumptions

All methods assume:
  • Random assignment: Users randomly assigned to groups
  • Independence: User outcomes are independent
  • Stable variance: Variance doesn’t change over time
  • No spillover: Treatment doesn’t affect control group
Sequential methods additionally assume:
  • Data arrives continuously: New data added over time
  • Stopping rules followed: Don’t peek without accounting for it

Best Practices

Hypothesis Design

  • Set MDE/NIM based on business impact, not statistical convenience
  • Use superiority for metrics you want to improve
  • Use non-inferiority for metrics you want to protect
  • Define hypotheses before looking at data

Decision Rules

  • Require all guardrails to pass (use AND)
  • Allow any success metric to trigger (use OR)
  • Be explicit about what defines success
  • Consider multiple testing adjustments (see the sketch after this list)
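
The simplest multiple-testing adjustment is a Bonferroni correction, which divides alpha by the number of hypotheses in the rule (a conservative illustration; other corrections exist):
def bonferroni_alpha(alpha, num_hypotheses):
    """Per-hypothesis significance threshold under a Bonferroni correction."""
    return alpha / num_hypotheses

print(bonferroni_alpha(0.05, 5))  # 0.01 per hypothesis when a decision rule covers 5 metrics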

Power Analysis

  • Run power analysis before experiment
  • Ensure adequate sample size for MDE
  • Consider seasonal effects on sample collection
  • Account for multiple comparisons in power calculation