AvgSpreadBounds
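$$\operatorname{AvgSpreadBounds}(\mathbf{x}, \mathbf{y}, \mathrm{misrate}) = [\,w_x L_x + w_y L_y,\; w_x U_x + w_y U_y\,]$$
where $\alpha = \mathrm{misrate}/2$, $[L_x, U_x] = \operatorname{SpreadBounds}(\mathbf{x}, \alpha)$, $[L_y, U_y] = \operatorname{SpreadBounds}(\mathbf{y}, \alpha)$, $w_x = n/(n+m)$, and $w_y = m/(n+m)$.
Robust bounds on $\operatorname{AvgSpread}(\mathbf{x}, \mathbf{y})$ with specified coverage.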
Input
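- $\mathbf{x} = (x_1, \ldots, x_n)$: first sample of measurements, where $n \geq 2$, requires sparity(x)
- $\mathbf{y} = (y_1, \ldots, y_m)$: second sample of measurements, where $m \geq 2$, requires sparity(y)
- $\mathrm{misrate}$: probability that the true avg spread falls outside the bounds in the long run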
Output
- Value: interval $[L, U]$ bounding $\operatorname{AvgSpread}(\mathbf{x}, \mathbf{y})$
- Unit: same unit as $\mathbf{x}$, $\mathbf{y}$
Notes
- Note: Bonferroni combination of two SpreadBounds calls with equal split $\alpha = \mathrm{misrate}/2$; no independence assumption needed; randomized pairing and cutoff; conservative with ties
Properties
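- Symmetry: $\operatorname{AvgSpreadBounds}(\mathbf{x}, \mathbf{y}, \mathrm{misrate}) = \operatorname{AvgSpreadBounds}(\mathbf{y}, \mathbf{x}, \mathrm{misrate})$ (equal split)
- Shift invariance: adding constants to $\mathbf{x}$ and/or $\mathbf{y}$ does not change the bounds
- Scale equivariance: $\operatorname{AvgSpreadBounds}(k \cdot \mathbf{x}, k \cdot \mathbf{y}, \mathrm{misrate}) = |k| \cdot \operatorname{AvgSpreadBounds}(\mathbf{x}, \mathbf{y}, \mathrm{misrate})$
- Non-negativity: bounds are non-negative
- Monotonicity in misrate: a smaller $\mathrm{misrate}$ produces wider bounds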
Example
AvgSpreadBounds([1..30], [21..50], 10^(-3)) returns bounds containing AvgSpread
AvgSpreadBounds provides distribution-free uncertainty bounds for the pooled spread: the weighted average of the two sample spreads. The algorithm computes separate SpreadBounds for each sample using an equal Bonferroni split and then combines them linearly with weights $n/(n+m)$ and $m/(n+m)$. This guarantees that the probability of missing the true AvgSpread is at most $\mathrm{misrate}$, without requiring independence between samples.
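Minimum misrate: because $\alpha = \mathrm{misrate}/2$ must satisfy the per-sample minimum, the overall misrate must be large enough for both samples:
$$\mathrm{misrate} \geq 2 \cdot \max\bigl(\alpha_{\min}(\lfloor n/2 \rfloor),\; \alpha_{\min}(\lfloor m/2 \rfloor)\bigr)$$
Here $\alpha_{\min}(k)$ denotes the per-sample minimum achievable misrate, which the C# implementation evaluates as MinAchievableMisrate.OneSample(k).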
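The constraint can be sketched in a few lines of C#. This is a minimal illustration, not the library's code: the class `MisrateBudget`, its methods, and the `perSampleMin` delegate are hypothetical names, and the per-sample minimum is passed in by the caller rather than assumed to have any particular closed form.

```csharp
using System;

public static class MisrateBudget
{
    // Smallest overall misrate that an equal Bonferroni split can support.
    // perSampleMin is a caller-supplied stand-in (hypothetical) for the
    // per-sample minimum achievable misrate.
    public static double MinOverallMisrate(int n, int m, Func<int, double> perSampleMin)
    {
        if (n < 2) throw new ArgumentOutOfRangeException(nameof(n));
        if (m < 2) throw new ArgumentOutOfRangeException(nameof(m));

        // Integer division mirrors the n / 2 and m / 2 arguments in the estimator.
        double minX = perSampleMin(n / 2);
        double minY = perSampleMin(m / 2);

        // alpha = misrate / 2 must reach both minima,
        // so misrate >= 2 * max(minX, minY).
        return 2.0 * Math.Max(minX, minY);
    }

    // True when the requested misrate leaves each call enough error budget.
    public static bool IsAchievable(double misrate, int n, int m, Func<int, double> perSampleMin)
        => misrate >= MinOverallMisrate(n, m, perSampleMin);
}
```

Because the smaller sample has the larger per-sample minimum whenever that minimum decreases with sample size, the smaller sample is typically the one that limits how strict the overall misrate can be.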
Algorithm
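The estimator constructs bounds on the pooled spread by combining two separate SpreadBounds calls through a Bonferroni split. The calls are separate computations; no statistical independence between the samples is assumed.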
The algorithm proceeds as follows:
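- Equal Bonferroni split: use $\alpha = \mathrm{misrate}/2$ for each per-sample bounds call. Each call uses half the total error budget.
- Per-sample bounds: compute $[L_x, U_x] = \operatorname{SpreadBounds}(\mathbf{x}, \alpha)$ and $[L_y, U_y] = \operatorname{SpreadBounds}(\mathbf{y}, \alpha)$ (see SpreadBounds).
- Weighted linear combination: with weights $w_x = n/(n+m)$ and $w_y = m/(n+m)$, return $[\,w_x L_x + w_y L_y,\; w_x U_x + w_y U_y\,]$.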
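The final combination step can be illustrated on its own. The sketch below assumes the two per-sample intervals have already been computed elsewhere; `Interval` and `PooledBounds.Combine` are hypothetical names introduced only for this example (the `record struct` syntax needs C# 10 or later).

```csharp
using System;

// Hypothetical value type for an interval; not the library's Bounds type.
public readonly record struct Interval(double Lower, double Upper);

public static class PooledBounds
{
    // Combines two per-sample intervals with weights n/(n+m) and m/(n+m).
    public static Interval Combine(Interval boundsX, int n, Interval boundsY, int m)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));
        if (m <= 0) throw new ArgumentOutOfRangeException(nameof(m));

        double weightX = (double)n / (n + m);
        double weightY = (double)m / (n + m);

        // Lower endpoints combine with lower, upper with upper.
        return new Interval(
            weightX * boundsX.Lower + weightY * boundsY.Lower,
            weightX * boundsX.Upper + weightY * boundsY.Upper);
    }
}
```

For instance, combining $[2, 6]$ from a sample of size 10 with $[4, 12]$ from a sample of size 30 uses weights 0.25 and 0.75 and gives $[3.5, 10.5]$.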
By Bonferroni's inequality, the probability that both per-sample bounds simultaneously cover their respective true spreads is at least $1 - \mathrm{misrate}$. Since AvgSpread is a weighted average of the individual spreads, the linear combination of the bounds covers the true AvgSpread whenever both individual bounds hold.
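The coverage argument can be written out step by step. The event names $A_x$, $A_y$ and the symbols $\sigma_x$, $\sigma_y$ for the true per-sample spreads are notation introduced here for illustration, and each per-sample call is taken to miss its target with probability at most $\alpha = \mathrm{misrate}/2$.

```latex
\begin{aligned}
A_x &= \{\sigma_x \notin [L_x, U_x]\}, \qquad A_y = \{\sigma_y \notin [L_y, U_y]\}, \\
P(A_x \cup A_y) &\le P(A_x) + P(A_y) \le \alpha + \alpha = \mathrm{misrate}, \\
\text{on } (A_x \cup A_y)^{c}:\quad
w_x L_x + w_y L_y &\le w_x \sigma_x + w_y \sigma_y \le w_x U_x + w_y U_y .
\end{aligned}
```

The union bound in the second line holds for any joint distribution of the two samples, which is why no independence assumption is needed. The last line uses only that the weights $w_x$ and $w_y$ are non-negative.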
    using Pragmastat.Algorithms;
    using Pragmastat.Exceptions;
    using Pragmastat.Functions;
    using Pragmastat.Internal;

    namespace Pragmastat.Estimators;

    /// <summary>
    /// Distribution-free bounds for AvgSpread via Bonferroni combination of SpreadBounds.
    /// Uses equal split alpha = misrate / 2.
    /// </summary>
    internal class AvgSpreadBoundsEstimator : ITwoSampleBoundsEstimator
    {
        internal static readonly AvgSpreadBoundsEstimator Instance = new();

        public Bounds Estimate(Sample x, Sample y, Probability misrate)
        {
            return Estimate(x, y, misrate, null);
        }

        public Bounds Estimate(Sample x, Sample y, Probability misrate, string? seed)
        {
            // Input validation: unweighted samples with compatible units.
            Assertion.NonWeighted("x", x);
            Assertion.NonWeighted("y", y);
            Assertion.CompatibleUnits(x, y);
            (x, y) = Assertion.ConvertToFiner(x, y);

            if (double.IsNaN(misrate) || misrate < 0 || misrate > 1)
                throw AssumptionException.Domain(Subject.Misrate);

            int n = x.Size;
            int m = y.Size;
            if (n < 2)
                throw AssumptionException.Domain(Subject.X);
            if (m < 2)
                throw AssumptionException.Domain(Subject.Y);

            // Equal Bonferroni split: each per-sample call gets half the budget.
            double alpha = misrate / 2.0;
            double minX = MinAchievableMisrate.OneSample(n / 2);
            double minY = MinAchievableMisrate.OneSample(m / 2);
            if (alpha < minX || alpha < minY)
                throw AssumptionException.Domain(Subject.Misrate);

            // Sparity: each sample must have a strictly positive spread.
            if (FastSpread.Estimate(x.SortedValues, isSorted: true) <= 0)
                throw AssumptionException.Sparity(Subject.X);
            if (FastSpread.Estimate(y.SortedValues, isSorted: true) <= 0)
                throw AssumptionException.Sparity(Subject.Y);

            // Per-sample bounds; when a seed is given, both calls receive the same seed.
            Bounds boundsX = seed == null
                ? SpreadBoundsEstimator.Instance.Estimate(x, alpha)
                : SpreadBoundsEstimator.Instance.Estimate(x, alpha, seed);
            Bounds boundsY = seed == null
                ? SpreadBoundsEstimator.Instance.Estimate(y, alpha)
                : SpreadBoundsEstimator.Instance.Estimate(y, alpha, seed);

            // Weighted linear combination with weights n/(n+m) and m/(n+m).
            double weightX = (double)n / (n + m);
            double weightY = (double)m / (n + m);
            double lower = weightX * boundsX.Lower + weightY * boundsY.Lower;
            double upper = weightX * boundsX.Upper + weightY * boundsY.Upper;

            return new Bounds(lower, upper, x.Unit);
        }
    }
Tests
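Use an equal Bonferroni split $\alpha = \mathrm{misrate}/2$. Compute $[L_x, U_x] = \operatorname{SpreadBounds}(\mathbf{x}, \alpha)$ and $[L_y, U_y] = \operatorname{SpreadBounds}(\mathbf{y}, \alpha)$ using disjoint-pair sign-test inversion (see SpreadBounds). Let $w_x = n/(n+m)$ and $w_y = m/(n+m)$. Return $[\,w_x L_x + w_y L_y,\; w_x U_x + w_y U_y\,]$.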
The test suite contains 40 test cases (3 demo + 5 natural + 6 property + 6 edge + 2 distro + 5 misrate + 6 unsorted + 7 error). Since AvgSpreadBounds returns bounds rather than a point estimate, tests validate that the bounds are well-formed and satisfy the equivariance properties. Each test case output is a JSON object with lower and upper fields representing the interval bounds. Because the underlying SpreadBounds procedure is randomized, tests fix a seed to make outputs deterministic. Both SpreadBounds calls use the same seed (two identical RNG streams).
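A well-formedness check on one such output object might look like the sketch below. It assumes only the two documented fields, lower and upper; the `BoundsOutput` and `BoundsFixture` names are hypothetical, the rest of the fixture file layout is not modeled, and the non-negativity check comes from the Properties list.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical DTO for the documented output shape: { "lower": ..., "upper": ... }.
public sealed class BoundsOutput
{
    [JsonPropertyName("lower")] public double Lower { get; set; }
    [JsonPropertyName("upper")] public double Upper { get; set; }
}

public static class BoundsFixture
{
    // Parses a single output object; throws if the JSON literal is null.
    public static BoundsOutput Parse(string json)
        => JsonSerializer.Deserialize<BoundsOutput>(json)
           ?? throw new FormatException("Output JSON was null.");

    // Well-formed: finite endpoints, non-negative, and lower <= upper.
    public static bool IsWellFormed(BoundsOutput b)
        => double.IsFinite(b.Lower) && double.IsFinite(b.Upper)
           && 0 <= b.Lower && b.Lower <= b.Upper;
}
```

A check like this covers the structural expectations only; the exact endpoint values still have to be compared against the seeded reference output.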
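Minimum misrate constraint: with $\alpha = \mathrm{misrate}/2$, the equal split requires $\alpha \geq \alpha_{\min}(\lfloor n/2 \rfloor)$ and $\alpha \geq \alpha_{\min}(\lfloor m/2 \rfloor)$, so $\mathrm{misrate} \geq 2 \cdot \max\bigl(\alpha_{\min}(\lfloor n/2 \rfloor),\; \alpha_{\min}(\lfloor m/2 \rfloor)\bigr)$, where $\alpha_{\min}(k)$ is the per-sample minimum achievable misrate (MinAchievableMisrate.OneSample(k) in the C# implementation).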
Demo examples from the manual introduction (3 tests):
- demo-1: baseline fixture misrate
- demo-2: stricter fixture misrate, wider bounds
- demo-3: looser fixture misrate
These cases illustrate how tighter misrates produce wider bounds.
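Natural sequences (misrate varies across achievable fixture levels), 5 tests:
- natural-10-10: $n = 10$, $m = 10$
- natural-10-15: $n = 10$, $m = 15$
- natural-15-10: $n = 15$, $m = 10$
- natural-15-15: $n = 15$, $m = 15$
- natural-20-20: $n = 20$, $m = 20$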
Property validation, 6 tests:
- property-identity: reference case for the comparisons that follow
- property-location-shift: expected output equals the identity bounds (shift invariance)
- property-scale-2x: expected output equals 2× the identity bounds (scale equivariance)
- property-scale-neg: expected output equals the identity bounds (scaling by the absolute value of the factor)
- property-symmetry: reference case for the swap comparison
- property-symmetry-swapped: same output as property-symmetry (swap symmetry with equal Bonferroni split)
Edge cases: boundary conditions and extreme scenarios (6 tests):
- edge-small: minimum non-trivial fixture misrate
- edge-negative: negative values
- edge-mixed-signs: mixed positive/negative values
- edge-wide-range: powers of 10 (extreme value range)
- edge-asymmetric-8-30: $n = 8$, $m = 30$ (unbalanced sizes)
- edge-duplicates-mixed: partial ties
Distribution tests (reference fixture misrates), 2 tests:
- additive-20-20: $n = 20$, $m = 20$
- uniform-20-20: $n = 20$, $m = 20$
Misrate variation: 5 tests spanning progressively stricter fixture misrates.
These tests validate monotonicity: smaller misrates produce wider bounds.
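One way to express such a monotonicity check is sketched below. It reads "wider" as "interval width does not decrease as misrate decreases", which is an interpretation made for this example; `MonotonicityCheck` and its method are hypothetical names, and the results are assumed to come from runs on the same samples with the same seed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MonotonicityCheck
{
    // Returns true when interval width never shrinks as misrate gets smaller.
    public static bool WidthsGrowAsMisrateShrinks(
        IEnumerable<(double Misrate, double Lower, double Upper)> results,
        double tolerance = 1e-12)
    {
        // Order from loosest (largest misrate) to strictest (smallest misrate).
        var ordered = results.OrderByDescending(r => r.Misrate).ToList();

        for (int i = 1; i < ordered.Count; i++)
        {
            double previousWidth = ordered[i - 1].Upper - ordered[i - 1].Lower;
            double currentWidth = ordered[i].Upper - ordered[i].Lower;

            // A stricter misrate must not yield a narrower interval.
            if (currentWidth < previousWidth - tolerance)
                return false;
        }
        return true;
    }
}
```

The small tolerance guards against floating-point noise when two misrate levels happen to produce the same interval.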
Unsorted tests verify independent sorting of $\mathbf{x}$ and $\mathbf{y}$ (6 tests):
- unsorted-reverse-x: X reversed, Y sorted
- unsorted-reverse-y: X sorted, Y reversed
- unsorted-reverse-both: both reversed
- unsorted-shuffle-x: X shuffled, Y sorted
- unsorted-shuffle-y: X sorted, Y shuffled
- unsorted-wide-range: wide value range, both unsorted
These tests validate that AvgSpreadBounds produces identical results regardless of input order.
Error cases: inputs that violate assumptions (7 tests):
- error-empty-x: empty X array (validity error)
- error-empty-y: empty Y array (validity error)
- error-single-element-x: too few elements in X for pairing (domain error)
- error-single-element-y: too few elements in Y for pairing (domain error)
- error-constant-x: constant X violates sparity(x)
- error-constant-y: constant Y violates sparity(y)
- error-misrate-below-min: misrate below minimum achievable (domain error)