SignMargin

SignMargin(n,misrate)\operatorname{SignMargin}(n, \mathrm{misrate})

Randomized exclusion count for disjoint-pair sign-test bounds.

  • Purpose determines extreme pairwise absolute differences to exclude when constructing bounds
  • Based on Binomial(n,12)\text{Binomial}(n, \frac{1}{2}) CDF inversion with randomized cutoff between adjacent grid points
  • Returns total margin split evenly between lower and upper tails
  • Used by SpreadBounds\operatorname{SpreadBounds} to select appropriate order statistics of disjoint-pair differences
  • Complexity exact for all nn
  • Domain n1n \geq 1, misrate21n\mathrm{misrate} \geq 2^{1-n}
  • Unit count
  • Note randomized to match the requested misrate exactly (deterministic rounding is conservative)

Properties

  • Bounds 0SignMargin(n,misrate)2n0 \leq \operatorname{SignMargin}(n, \mathrm{misrate}) \leq 2n
  • Monotonicity lower misrate \rightarrow smaller margin \rightarrow wider bounds

Example

Each call returns one of two adjacent grid points (randomized):

  • SignMargin(10, 0.05) returns 2 or 4
  • SignMargin(15, 0.01) returns 4 or 6
  • SignMargin(30, 1e-4) returns 8 or 10

This is a supporting function that SpreadBounds\operatorname{SpreadBounds} uses internally, so most users do not need to call it directly. It calculates how many extreme disjoint-pair absolute differences should be excluded when constructing bounds, based on the number of pairs and the desired error rate. Because the Binomial(n,12)\text{Binomial}(n, \frac{1}{2}) CDF is discrete, exact matching of an arbitrary misrate requires randomizing the cutoff between two adjacent integer values. A lower misrate (higher confidence) results in a smaller margin, which produces wider bounds.

Algorithm

The SignMargin\operatorname{SignMargin} function determines the exclusion count for disjoint-pair sign-test bounds by inverting the Binomial(n,12)\text{Binomial}(n, \frac{1}{2}) CDF.

Given nn pairs and a desired misrate\mathrm{misrate}, the algorithm finds the number of extreme order statistics to exclude so that the resulting bounds contain the true parameter with probability 1misrate1 - \mathrm{misrate}.

Binomial CDF computation

Under the symmetry assumption, the number of positive signs among nn disjoint-pair differences follows Binomial(n,12)\text{Binomial}(n, \frac{1}{2}). The CDF is computed exactly:

Pr(Wk)=i=0k(ni)2n\Pr(W \leq k) = \sum_{i=0}^k \binom{n}{i} 2^{-n}

The algorithm evaluates this sum incrementally, accumulating probabilities until the cumulative probability reaches misrate2\frac{\mathrm{misrate}}{2}.

Grid point identification

Because the Binomial CDF is a step function, the exact misrate\mathrm{misrate} typically falls between two adjacent grid points kk and k+1k + 1. The algorithm identifies these adjacent values: klok_{\text{lo}} is the largest integer where Pr(Wklo)<misrate2\Pr(W \leq k_{\text{lo}}) < \frac{\mathrm{misrate}}{2} and khi=klo+1k_{\text{hi}} = k_{\text{lo}} + 1.

Randomized cutoff

To match the requested misrate\mathrm{misrate} exactly rather than conservatively, the algorithm interpolates between the two grid points. It computes a probability pp such that using margin 2klo2 k_{\text{lo}} with probability pp and margin 2khi2 k_{\text{hi}} with probability 1p1 - p yields an expected coverage of exactly 1misrate1 - \mathrm{misrate}. A uniform random draw determines which margin to return.

This randomization ensures that the bounds are calibrated exactly at the requested error rate under weak continuity, rather than being conservative due to the discreteness of the Binomial distribution.

using Pragmastat.Exceptions;

namespace Pragmastat.Functions;

/// <summary>
/// SignMargin function for one-sample bounds based on Binomial(n, 0.5).
/// </summary>
internal class SignMargin
{
  public static readonly SignMargin Instance = new();

  public int CalcRandomized(int n, double misrate, Pragmastat.Randomization.Rng rng)
  {
    if (n <= 0)
      throw AssumptionException.Domain(Subject.X);
    if (double.IsNaN(misrate) || misrate < 0 || misrate > 1)
      throw AssumptionException.Domain(Subject.Misrate);

    double minMisrate = MinAchievableMisrate.OneSample(n);
    if (misrate < minMisrate)
      throw AssumptionException.Domain(Subject.Misrate);

    double target = misrate / 2;
    if (target <= 0)
      return 0;
    if (target >= 1)
      return checked(n * 2);

    var split = CalcSplit(n, target);
    int rLow = split.RLow;
    double logCdf = split.LogCdf;
    double logPmfHigh = split.LogPmfHigh;

    double logTarget = Math.Log(target);
    double logNum = logTarget > logCdf ? LogSubExp(logTarget, logCdf) : double.NegativeInfinity;
    double p = (IsFinite(logPmfHigh) && IsFinite(logNum))
      ? Math.Exp(logNum - logPmfHigh)
      : 0.0;
    if (p < 0)
      p = 0;
    else if (p > 1)
      p = 1;

    double u = rng.UniformDouble();
    int r = u < p ? rLow + 1 : rLow;
    return checked(r * 2);
  }

  private readonly struct SplitResult
  {
    public readonly int RLow;
    public readonly double LogCdf;
    public readonly double LogPmfHigh;

    public SplitResult(int rLow, double logCdfLow, double logPmfHigh)
    {
      RLow = rLow;
      LogCdf = logCdfLow;
      LogPmfHigh = logPmfHigh;
    }
  }

  private static SplitResult CalcSplit(int n, double target)
  {
    double logTarget = Math.Log(target);

    double logPmf = -n * Math.Log(2);
    double logCdf = logPmf;

    int rLow = 0;

    if (logCdf > logTarget)
      return new SplitResult(0, logCdf, logPmf);

    for (int k = 1; k <= n; k++)
    {
      double logPmfNext = logPmf + Math.Log(n - k + 1) - Math.Log(k);
      double logCdfNext = LogAddExp(logCdf, logPmfNext);

      if (logCdfNext > logTarget)
        return new SplitResult(rLow, logCdf, logPmfNext);

      rLow = k;
      logPmf = logPmfNext;
      logCdf = logCdfNext;
    }

    return new SplitResult(rLow, logCdf, double.NegativeInfinity);
  }

  private static double LogAddExp(double a, double b)
  {
    if (double.IsNegativeInfinity(a))
      return b;
    if (double.IsNegativeInfinity(b))
      return a;
    double m = Math.Max(a, b);
    return m + Math.Log(Math.Exp(a - m) + Math.Exp(b - m));
  }

  private static double LogSubExp(double a, double b)
  {
    if (double.IsNegativeInfinity(b))
      return a;
    double diff = Math.Exp(b - a);
    if (diff >= 1.0)
      return double.NegativeInfinity;
    return a + Math.Log(1.0 - diff);
  }

  private static bool IsFinite(double value)
  {
    return !double.IsNaN(value) && !double.IsInfinity(value);
  }
}

Tests

SignMargin(n,misrate)\operatorname{SignMargin}(n, \mathrm{misrate})

No reference test data exists for SignMargin\operatorname{SignMargin} yet. The tests/sign-margin/ directory has not been created. Test cases will be added when the SignMargin\operatorname{SignMargin} test generator is implemented.