AvgSpreadBounds
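$$\operatorname{AvgSpreadBounds}(\mathbf{x}, \mathbf{y}, \mathrm{misrate}) = [w_x L_x + w_y L_y,\; w_x U_x + w_y U_y]$$
where $\alpha = \mathrm{misrate}/2$, $[L_x, U_x] = \operatorname{SpreadBounds}(\mathbf{x}, \alpha)$, $[L_y, U_y] = \operatorname{SpreadBounds}(\mathbf{y}, \alpha)$, $w_x = n/(n+m)$, $w_y = m/(n+m)$, and $n$, $m$ are the sizes of $\mathbf{x}$ and $\mathbf{y}$.
Robust bounds on $\operatorname{AvgSpread}(\mathbf{x}, \mathbf{y})$ with specified coverage.
- Interpretation: $\mathrm{misrate}$ is the probability that the true avg spread falls outside the bounds
- Domain: any real numbers, $n \geq 2$, $m \geq 2$, and $\mathrm{misrate} \geq 2 \max(2^{1-\lfloor n/2 \rfloor}, 2^{1-\lfloor m/2 \rfloor})$
- Assumptions: sparity(x), sparity(y)
- Unit: same as measurements
- Note: Bonferroni combination of two SpreadBounds calls with equal split $\alpha = \mathrm{misrate}/2$; no independence assumption needed; randomized pairing and cutoff, conservative with ties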
Properties
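- Symmetry: $\operatorname{AvgSpreadBounds}(\mathbf{x}, \mathbf{y}, \mathrm{misrate}) = \operatorname{AvgSpreadBounds}(\mathbf{y}, \mathbf{x}, \mathrm{misrate})$ (equal split)
- Shift invariance: adding constants to $\mathbf{x}$ and/or $\mathbf{y}$ does not change the bounds
- Scale equivariance: $\operatorname{AvgSpreadBounds}(k \cdot \mathbf{x}, k \cdot \mathbf{y}, \mathrm{misrate}) = |k| \cdot \operatorname{AvgSpreadBounds}(\mathbf{x}, \mathbf{y}, \mathrm{misrate})$
- Non-negativity: bounds are non-negative
- Monotonicity in misrate: smaller $\mathrm{misrate}$ produces wider bounds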
Example
AvgSpreadBounds([1..30], [21..50], 0.02) returns bounds containing AvgSpread
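The quantities fixed by the inputs of this example can be worked out by hand. This is only a sketch: the per-sample bounds, written here as $[L_x, U_x]$ and $[L_y, U_y]$, come from the randomized SpreadBounds procedure and are left symbolic.

```latex
\begin{aligned}
n &= 30, \quad m = 30, \\
\alpha &= \mathrm{misrate}/2 = 0.02/2 = 0.01, \\
w_x &= \frac{30}{30 + 30} = 0.5, \quad w_y = \frac{30}{30 + 30} = 0.5, \\
\operatorname{AvgSpreadBounds} &= [0.5\,L_x + 0.5\,L_y,\; 0.5\,U_x + 0.5\,U_y]
\end{aligned}
```

With equal sample sizes the result is simply the midpoint of the two per-sample intervals, endpoint by endpoint.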
AvgSpreadBounds provides distribution-free uncertainty bounds for the pooled spread: the weighted average of the two sample spreads. The algorithm computes separate SpreadBounds for each sample using an equal Bonferroni split and then combines them linearly with weights $n/(n+m)$ and $m/(n+m)$. This guarantees that the probability of missing the true AvgSpread is at most $\mathrm{misrate}$ without requiring independence between samples.
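Minimum misrate: because $\alpha = \mathrm{misrate}/2$ must satisfy the per-sample minimum, the overall misrate must be large enough for both samples:
$$\mathrm{misrate} \geq 2 \max(2^{1-\lfloor n/2 \rfloor}, 2^{1-\lfloor m/2 \rfloor})$$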
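The constraint can be checked numerically with a small helper. The sketch below is hypothetical and not part of Pragmastat: the class and method names are invented for illustration, and it assumes the per-sample minimum for SpreadBounds is $2^{1-\lfloor \text{size}/2 \rfloor}$ (the two-sided sign-test limit for that many disjoint pairs), which may differ from what the library's `MinAchievableMisrate.OneSample` actually returns.

```csharp
using System;

// Hypothetical helper, not part of Pragmastat.
public static class AvgSpreadMisrateSketch
{
    // Assumed per-sample minimum: 2^(1 - floor(size / 2)).
    // size / 2 is integer division, i.e. the floor for positive sizes.
    public static double MinPerSample(int size) => Math.Pow(2.0, 1 - size / 2);

    // Smallest overall misrate for which misrate / 2 meets both per-sample minima.
    public static double MinOverall(int n, int m) =>
        2.0 * Math.Max(MinPerSample(n), MinPerSample(m));

    // True when the equal split alpha = misrate / 2 is achievable for both samples.
    public static bool IsAchievable(double misrate, int n, int m) =>
        misrate / 2.0 >= MinPerSample(n) && misrate / 2.0 >= MinPerSample(m);
}
```

Under that assumption, `MinOverall(30, 30)` is $2^{-13} \approx 0.000122$, so a misrate of 0.02 with two samples of size 30 is well within the domain, while `MinOverall(10, 4)` is 1: the smaller sample dominates the constraint.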
Algorithm
The estimator constructs bounds on the pooled spread by combining two separate SpreadBounds calls, one per sample, through a Bonferroni split.
The algorithm proceeds as follows:
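- Equal Bonferroni split: Set $\alpha = \mathrm{misrate}/2$. Each per-sample bounds call uses half the total error budget.
- Per-sample bounds: Compute $[L_x, U_x] = \operatorname{SpreadBounds}(\mathbf{x}, \alpha)$ and $[L_y, U_y] = \operatorname{SpreadBounds}(\mathbf{y}, \alpha)$ (see SpreadBounds).
- Weighted linear combination: With weights $w_x = n/(n+m)$ and $w_y = m/(n+m)$, return:
$$[w_x L_x + w_y L_y,\; w_x U_x + w_y U_y]$$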
By Bonferroni's inequality, the probability that both per-sample bounds simultaneously cover their respective true spreads is at least $1 - \mathrm{misrate}$. Since AvgSpread is a weighted average of the individual spreads, the linear combination of the bounds covers the true AvgSpread whenever both individual bounds hold.
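The coverage argument can be written out as a short derivation. The notation is introduced for this sketch only: $S_x$ and $S_y$ denote the true spreads, $[L_x, U_x]$ and $[L_y, U_y]$ the per-sample bounds at level $\alpha = \mathrm{misrate}/2$, and $w_x = n/(n+m)$, $w_y = m/(n+m)$ the weights.

```latex
\begin{aligned}
&P\bigl(S_x \notin [L_x, U_x] \ \text{or}\ S_y \notin [L_y, U_y]\bigr)
  \le P\bigl(S_x \notin [L_x, U_x]\bigr) + P\bigl(S_y \notin [L_y, U_y]\bigr)
  \le \alpha + \alpha = \mathrm{misrate}, \\
&L_x \le S_x \le U_x \ \text{and}\ L_y \le S_y \le U_y
  \;\Longrightarrow\;
  w_x L_x + w_y L_y \le w_x S_x + w_y S_y \le w_x U_x + w_y U_y .
\end{aligned}
```

The first line is the union bound, which holds whatever the dependence between the two events. The second line uses only that the weights are non-negative.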
```csharp
using Pragmastat.Exceptions;
using Pragmastat.Functions;
using Pragmastat.Internal;

namespace Pragmastat.Estimators;

/// <summary>
/// Distribution-free bounds for AvgSpread via Bonferroni combination of SpreadBounds.
/// Uses equal split alpha = misrate / 2.
/// </summary>
internal class AvgSpreadBoundsEstimator : ITwoSampleBoundsEstimator
{
    internal static readonly AvgSpreadBoundsEstimator Instance = new();

    public Bounds Estimate(Sample x, Sample y, Probability misrate)
    {
        return Estimate(x, y, misrate, null);
    }

    public Bounds Estimate(Sample x, Sample y, Probability misrate, string? seed)
    {
        Assertion.MatchedUnit(x, y);

        // Check validity for x (priority 0, subject x)
        Assertion.Validity(x, Subject.X);
        // Check validity for y (priority 0, subject y)
        Assertion.Validity(y, Subject.Y);

        if (double.IsNaN(misrate) || misrate < 0 || misrate > 1)
            throw AssumptionException.Domain(Subject.Misrate);

        int n = x.Size;
        int m = y.Size;
        if (n < 2)
            throw AssumptionException.Domain(Subject.X);
        if (m < 2)
            throw AssumptionException.Domain(Subject.Y);

        // Equal Bonferroni split: each per-sample call gets half the error budget
        double alpha = misrate / 2.0;

        // n / 2 and m / 2 are integer divisions (floor)
        double minX = MinAchievableMisrate.OneSample(n / 2);
        double minY = MinAchievableMisrate.OneSample(m / 2);
        if (alpha < minX || alpha < minY)
            throw AssumptionException.Domain(Subject.Misrate);

        // Check sparity (priority 2)
        Assertion.Sparity(x, Subject.X);
        Assertion.Sparity(y, Subject.Y);

        // Both calls receive the same seed when one is supplied
        Bounds boundsX = seed == null
            ? SpreadBoundsEstimator.Instance.Estimate(x, alpha)
            : SpreadBoundsEstimator.Instance.Estimate(x, alpha, seed);
        Bounds boundsY = seed == null
            ? SpreadBoundsEstimator.Instance.Estimate(y, alpha)
            : SpreadBoundsEstimator.Instance.Estimate(y, alpha, seed);

        // Weighted linear combination of the per-sample bounds
        double weightX = (double)n / (n + m);
        double weightY = (double)m / (n + m);
        double lower = weightX * boundsX.Lower + weightY * boundsY.Lower;
        double upper = weightX * boundsX.Upper + weightY * boundsY.Upper;

        return new Bounds(lower, upper, x.Unit);
    }
}
```
Tests
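Let $\alpha = \mathrm{misrate}/2$ (equal Bonferroni split). Compute $[L_x, U_x] = \operatorname{SpreadBounds}(\mathbf{x}, \alpha)$ and $[L_y, U_y] = \operatorname{SpreadBounds}(\mathbf{y}, \alpha)$ using disjoint-pair sign-test inversion (see SpreadBounds). Let $w_x = n/(n+m)$ and $w_y = m/(n+m)$. Return $[w_x L_x + w_y L_y,\; w_x U_x + w_y U_y]$.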
The test suite validates:
- bounds are well-formed (lower $\leq$ upper and non-negative)
- shift invariance and scale equivariance
- monotonicity in $\mathrm{misrate}$
- symmetry under swapping $\mathbf{x}$ and $\mathbf{y}$ (with equal split)
- error cases for invalid inputs and misrate domain violations
Because SpreadBounds is randomized, tests fix a seed to make outputs deterministic. Both SpreadBounds calls use the same seed (two identical RNG streams).
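Some of the listed properties can be checked on the combination step alone, without running SpreadBounds. The sketch below is not the project's test suite: `Interval`, `AvgSpreadCombineSketch`, and its methods are hypothetical stand-ins invented here, and the per-sample bounds are supplied by hand instead of being computed.

```csharp
using System;

// Hypothetical stand-in for a bounds type; not the Pragmastat Bounds class.
public readonly record struct Interval(double Lower, double Upper);

public static class AvgSpreadCombineSketch
{
    // Weighted linear combination with w_x = n / (n + m) and w_y = m / (n + m).
    public static Interval Combine(Interval boundsX, int n, Interval boundsY, int m)
    {
        double weightX = (double)n / (n + m);
        double weightY = (double)m / (n + m);
        return new Interval(
            weightX * boundsX.Lower + weightY * boundsY.Lower,
            weightX * boundsX.Upper + weightY * boundsY.Upper);
    }

    // Well-formed: non-negative and ordered.
    public static bool IsWellFormed(Interval b) => 0 <= b.Lower && b.Lower <= b.Upper;

    // Symmetry: swapping the samples together with their sizes leaves the result unchanged.
    public static bool IsSymmetric(Interval boundsX, int n, Interval boundsY, int m, double tol = 1e-12)
    {
        Interval a = Combine(boundsX, n, boundsY, m);
        Interval b = Combine(boundsY, m, boundsX, n);
        return Math.Abs(a.Lower - b.Lower) <= tol && Math.Abs(a.Upper - b.Upper) <= tol;
    }
}
```

For instance, per-sample bounds [2, 4] with n = 10 and [6, 10] with m = 30 give weights 0.25 and 0.75 and combine to [5, 8.5]. Since the weights are non-negative and sum to one, well-formed inputs always yield a well-formed result.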
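Minimum misrate constraint: the equal split requires $\mathrm{misrate}/2 \geq 2^{1-\lfloor n/2 \rfloor}$ and $\mathrm{misrate}/2 \geq 2^{1-\lfloor m/2 \rfloor}$, so $\mathrm{misrate} \geq 2 \max(2^{1-\lfloor n/2 \rfloor}, 2^{1-\lfloor m/2 \rfloor})$.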