AvgSpreadBounds

$$\operatorname{AvgSpreadBounds}(\mathbf{x}, \mathbf{y}, \mathrm{misrate}) = [L_A, U_A]$$

where $\alpha = \frac{\mathrm{misrate}}{2}$, $[L_x, U_x] = \operatorname{SpreadBounds}(\mathbf{x}, \alpha)$, $[L_y, U_y] = \operatorname{SpreadBounds}(\mathbf{y}, \alpha)$, $w_x = \frac{n}{n + m}$, $w_y = \frac{m}{n + m}$, and $[L_A, U_A] = [w_x L_x + w_y L_y, w_x U_x + w_y U_y]$.

Robust bounds on $\operatorname{AvgSpread}(\mathbf{x}, \mathbf{y})$ with specified coverage.
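As a worked illustration of the combination formula, take hypothetical values that are not produced by any real data: $n = 30$, $m = 10$, $\mathrm{misrate} = 0.2$ (so $\alpha = 0.1$), and assumed per-sample bounds $[L_x, U_x] = [4, 12]$ and $[L_y, U_y] = [2, 6]$. Then:

$$
\begin{aligned}
w_x &= \tfrac{30}{30 + 10} = 0.75, \qquad w_y = \tfrac{10}{30 + 10} = 0.25 \\
L_A &= 0.75 \cdot 4 + 0.25 \cdot 2 = 3.5 \\
U_A &= 0.75 \cdot 12 + 0.25 \cdot 6 = 10.5
\end{aligned}
$$

The larger sample pulls the combined interval toward its own bounds, as expected from size-proportional weights.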

Interpretation: $\mathrm{misrate}$ is the probability that the true $\operatorname{AvgSpread}$ falls outside the bounds
Domain: any real numbers, $n \geq 2$, $m \geq 2$, $\alpha \geq 2^{1-\lfloor \frac{n}{2} \rfloor}$, and $\alpha \geq 2^{1-\lfloor \frac{m}{2} \rfloor}$
Assumptions: sparity(x), sparity(y)
Unit: same as measurements
Note: Bonferroni combination of two $\operatorname{SpreadBounds}$ calls with equal split $\alpha = \frac{\mathrm{misrate}}{2}$; no independence assumption needed; randomized pairing and cutoff, conservative with ties

Properties

Symmetry: $\operatorname{AvgSpreadBounds}(\mathbf{x}, \mathbf{y}, \mathrm{misrate}) = \operatorname{AvgSpreadBounds}(\mathbf{y}, \mathbf{x}, \mathrm{misrate})$ (equal split)
Shift invariance: adding constants to $\mathbf{x}$ and/or $\mathbf{y}$ does not change the bounds
Scale equivariance: $\operatorname{AvgSpreadBounds}(k \cdot \mathbf{x}, k \cdot \mathbf{y}, \mathrm{misrate}) = \lvert k \rvert \cdot \operatorname{AvgSpreadBounds}(\mathbf{x}, \mathbf{y}, \mathrm{misrate})$
Non-negativity: bounds are non-negative
Monotonicity in misrate: smaller $\mathrm{misrate}$ produces wider bounds

Example

AvgSpreadBounds([1..30], [21..50], 0.02) returns bounds containing AvgSpread

$\operatorname{AvgSpreadBounds}$ provides distribution-free uncertainty bounds for the pooled spread: the weighted average of the two sample spreads. The algorithm computes separate $\operatorname{SpreadBounds}$ for each sample using an equal Bonferroni split and then combines them linearly with weights $\frac{n}{n+m}$ and $\frac{m}{n+m}$. This guarantees that the probability of missing the true $\operatorname{AvgSpread}$ is at most $\mathrm{misrate}$, without requiring independence between the samples.

Minimum misrate: because $\alpha = \frac{\mathrm{misrate}}{2}$ must satisfy the per-sample minimum, the overall misrate must be large enough for both samples:

$$\mathrm{misrate} \geq 2 \cdot \max(2^{1-\lfloor \frac{n}{2} \rfloor}, 2^{1-\lfloor \frac{m}{2} \rfloor})$$
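The minimum-misrate formula can be evaluated directly. The following Python sketch is only an illustration of the inequality above; the function names `min_one_sample_alpha` and `min_avg_spread_misrate` are hypothetical and are not part of the library.

```python
def min_one_sample_alpha(size: int) -> float:
    """Smallest per-sample alpha for a sample of the given size: 2^(1 - floor(size / 2))."""
    if size < 2:
        raise ValueError("sample size must be at least 2")
    return 2.0 ** (1 - size // 2)


def min_avg_spread_misrate(n: int, m: int) -> float:
    """Smallest overall misrate: twice the larger of the two per-sample minima."""
    return 2.0 * max(min_one_sample_alpha(n), min_one_sample_alpha(m))


# Sample sizes from the example above (n = m = 30)
min_for_example = min_avg_spread_misrate(30, 30)

# Unequal sizes: the smaller sample determines the minimum
min_for_unequal = min_avg_spread_misrate(30, 10)
```

For $n = m = 30$ the minimum is $2 \cdot 2^{-14} = 2^{-13} \approx 1.22 \cdot 10^{-4}$, so the misrate of 0.02 used in the example lies inside the domain. For $n = 30$, $m = 10$ the minimum rises to $2 \cdot 2^{-4} = 0.125$.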

Algorithm

The $\operatorname{AvgSpreadBounds}$ estimator constructs bounds on the pooled spread by combining two separate $\operatorname{SpreadBounds}$ calls, one per sample, through a Bonferroni split. The calls do not need to be statistically independent.

The algorithm proceeds as follows:

1. Equal Bonferroni split: set $\alpha = \frac{\mathrm{misrate}}{2}$. Each per-sample bounds call uses half the total error budget.
2. Per-sample bounds: compute $[L_x, U_x] = \operatorname{SpreadBounds}(\mathbf{x}, \alpha)$ and $[L_y, U_y] = \operatorname{SpreadBounds}(\mathbf{y}, \alpha)$ (see SpreadBounds).
3. Weighted linear combination: with weights $w_x = \frac{n}{n + m}$ and $w_y = \frac{m}{n + m}$, return:

$$[L_A, U_A] = [w_x L_x + w_y L_y, w_x U_x + w_y U_y]$$
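The three steps above can be sketched in Python as follows. This is a minimal illustration, not the library's implementation: the function name `avg_spread_bounds`, the `spread_bounds` callback parameter, and the `toy_spread_bounds` placeholder are all assumptions introduced here. The placeholder merely returns `[0, range]` so that the sketch runs; it is not the $\operatorname{SpreadBounds}$ algorithm and carries no coverage guarantee.

```python
from typing import Callable, Sequence, Tuple

Bounds = Tuple[float, float]


def avg_spread_bounds(
    x: Sequence[float],
    y: Sequence[float],
    misrate: float,
    spread_bounds: Callable[[Sequence[float], float], Bounds],
) -> Bounds:
    """Combine two per-sample bounds with an equal Bonferroni split (sketch)."""
    n, m = len(x), len(y)
    if n < 2 or m < 2:
        raise ValueError("each sample needs at least 2 values")
    if not 0.0 <= misrate <= 1.0:  # also rejects NaN
        raise ValueError("misrate must lie in [0, 1]")

    # Step 1: equal Bonferroni split
    alpha = misrate / 2.0
    min_alpha = max(2.0 ** (1 - n // 2), 2.0 ** (1 - m // 2))
    if alpha < min_alpha:
        raise ValueError("misrate is below the minimum achievable value")

    # Step 2: per-sample bounds (supplied by the caller)
    lower_x, upper_x = spread_bounds(x, alpha)
    lower_y, upper_y = spread_bounds(y, alpha)

    # Step 3: weighted linear combination
    w_x = n / (n + m)
    w_y = m / (n + m)
    return (w_x * lower_x + w_y * lower_y, w_x * upper_x + w_y * upper_y)


def toy_spread_bounds(sample: Sequence[float], alpha: float) -> Bounds:
    """Placeholder only: NOT the real SpreadBounds; ignores alpha."""
    return (0.0, float(max(sample) - min(sample)))


demo = avg_spread_bounds(
    list(range(1, 31)), list(range(21, 51)), 0.02, toy_spread_bounds
)
```

Passing the per-sample routine as a callback keeps the combination logic separate from the randomized $\operatorname{SpreadBounds}$ step; with a real implementation plugged in, only the `spread_bounds` argument changes.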

By Bonferroni's inequality, the probability that both per-sample bounds simultaneously cover their respective true spreads is at least $1 - 2 \alpha = 1 - \mathrm{misrate}$. Since $\operatorname{AvgSpread}$ is a weighted average of the individual spreads, the linear combination of the bounds covers the true $\operatorname{AvgSpread}$ whenever both individual bounds hold.
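The coverage argument can be written out explicitly. The symbols $S_x$ and $S_y$ (the true spreads) and the events $E_x$, $E_y$ are notation introduced here for the derivation; they do not appear elsewhere in this section. Let $E_x = \{L_x \leq S_x \leq U_x\}$ and $E_y = \{L_y \leq S_y \leq U_y\}$, each failing with probability at most $\alpha$:

$$
\begin{aligned}
P(E_x \cap E_y) &= 1 - P(E_x^c \cup E_y^c) \\
&\geq 1 - P(E_x^c) - P(E_y^c) \\
&\geq 1 - 2\alpha = 1 - \mathrm{misrate}
\end{aligned}
$$

The union bound in the second line holds for any joint distribution of the two events, which is why no independence assumption is needed. On $E_x \cap E_y$, the non-negative weights preserve the inequalities:

$$
w_x L_x + w_y L_y \;\leq\; w_x S_x + w_y S_y \;\leq\; w_x U_x + w_y U_y
$$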

using Pragmastat.Exceptions;
using Pragmastat.Functions;
using Pragmastat.Internal;

namespace Pragmastat.Estimators;

/// <summary>
/// Distribution-free bounds for AvgSpread via Bonferroni combination of SpreadBounds.
/// Uses equal split alpha = misrate / 2.
/// </summary>
internal class AvgSpreadBoundsEstimator : ITwoSampleBoundsEstimator
{
  internal static readonly AvgSpreadBoundsEstimator Instance = new();

  public Bounds Estimate(Sample x, Sample y, Probability misrate)
  {
    return Estimate(x, y, misrate, null);
  }

  public Bounds Estimate(Sample x, Sample y, Probability misrate, string? seed)
  {
    Assertion.MatchedUnit(x, y);
    // Check validity for x (priority 0, subject x)
    Assertion.Validity(x, Subject.X);
    // Check validity for y (priority 0, subject y)
    Assertion.Validity(y, Subject.Y);

    if (double.IsNaN(misrate) || misrate < 0 || misrate > 1)
      throw AssumptionException.Domain(Subject.Misrate);

    int n = x.Size;
    int m = y.Size;
    if (n < 2)
      throw AssumptionException.Domain(Subject.X);
    if (m < 2)
      throw AssumptionException.Domain(Subject.Y);

    // Equal Bonferroni split: each SpreadBounds call gets half of the error budget
    double alpha = misrate / 2.0;
    double minX = MinAchievableMisrate.OneSample(n / 2);
    double minY = MinAchievableMisrate.OneSample(m / 2);
    if (alpha < minX || alpha < minY)
      throw AssumptionException.Domain(Subject.Misrate);

    // Check sparity (priority 2)
    Assertion.Sparity(x, Subject.X);
    Assertion.Sparity(y, Subject.Y);

    // When a seed is given, both calls receive the same seed (two identical RNG streams)
    Bounds boundsX = seed == null
      ? SpreadBoundsEstimator.Instance.Estimate(x, alpha)
      : SpreadBoundsEstimator.Instance.Estimate(x, alpha, seed);
    Bounds boundsY = seed == null
      ? SpreadBoundsEstimator.Instance.Estimate(y, alpha)
      : SpreadBoundsEstimator.Instance.Estimate(y, alpha, seed);

    double weightX = (double)n / (n + m);
    double weightY = (double)m / (n + m);

    double lower = weightX * boundsX.Lower + weightY * boundsY.Lower;
    double upper = weightX * boundsX.Upper + weightY * boundsY.Upper;
    return new Bounds(lower, upper, x.Unit);
  }
}

Tests

$$\operatorname{AvgSpreadBounds}(\mathbf{x}, \mathbf{y}, \mathrm{misrate}) = [L_A, U_A]$$

Let $\alpha = \frac{\mathrm{misrate}}{2}$ (equal Bonferroni split). Compute $[L_x, U_x] = \operatorname{SpreadBounds}(\mathbf{x}, \alpha)$ and $[L_y, U_y] = \operatorname{SpreadBounds}(\mathbf{y}, \alpha)$ using disjoint-pair sign-test inversion (see $\operatorname{SpreadBounds}$). Let $w_x = \frac{n}{n + m}$ and $w_y = \frac{m}{n + m}$. Return $[L_A, U_A] = [w_x L_x + w_y L_y, w_x U_x + w_y U_y]$.

The $\operatorname{AvgSpreadBounds}$ test suite validates:

- bounds are well-formed ($L_A \leq U_A$ and non-negative)
- shift invariance and scale equivariance
- monotonicity in $\mathrm{misrate}$
- symmetry under swapping $\mathbf{x}$ and $\mathbf{y}$ (with equal split)
- error cases for invalid inputs and misrate domain violations

Because $\operatorname{SpreadBounds}$ is randomized, tests fix a seed to make outputs deterministic. Both $\operatorname{SpreadBounds}$ calls use the same seed (two identical RNG streams).
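The effect of sharing a seed can be illustrated with Python's standard `random` module, used here only as a stand-in: the library's own generator and seeding scheme are not described in this section, and the helper `draws` and the seed strings are hypothetical.

```python
import random
from typing import List


def draws(seed: str, count: int) -> List[float]:
    """Return the first `count` uniform draws from a generator created with `seed`."""
    rng = random.Random(seed)  # a fresh generator per call, like one per SpreadBounds call
    return [rng.random() for _ in range(count)]


# Two generators created from the same seed yield identical streams,
# so a seeded run is reproducible from one test execution to the next.
stream_for_x = draws("demo-seed", 5)
stream_for_y = draws("demo-seed", 5)
streams_match = stream_for_x == stream_for_y
```

In this sketch `streams_match` is true, mirroring the "two identical RNG streams" behaviour that the tests rely on for deterministic expected values.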

Minimum misrate constraint: the equal split requires

$$\alpha \geq 2^{1-\lfloor \frac{n}{2} \rfloor}$$

and

$$\alpha \geq 2^{1-\lfloor \frac{m}{2} \rfloor}$$

Since $\alpha = \frac{\mathrm{misrate}}{2}$, the two conditions together are equivalent to

$$\mathrm{misrate} \geq 2 \cdot \max(2^{1-\lfloor \frac{n}{2} \rfloor}, 2^{1-\lfloor \frac{m}{2} \rfloor})$$