SignMargin

\operatorname{SignMargin}(n, \mathrm{misrate})

Randomized exclusion count for disjoint-pair sign-test bounds.

Purpose determines extreme pairwise absolute differences to exclude when constructing bounds
Based on $\text{Binomial}(n, \frac{1}{2})$ CDF inversion with randomized cutoff between adjacent grid points
Returns total margin split evenly between lower and upper tails
Used by $\operatorname{SpreadBounds}$ to select appropriate order statistics of disjoint-pair differences
Complexity exact for all $n$
Domain $n \geq 1$ , $\mathrm{misrate} \geq 2^{1-n}$
Unit count
Note randomized to match the requested misrate exactly (deterministic rounding is conservative)

Properties

Bounds $0 \leq \operatorname{SignMargin}(n, \mathrm{misrate}) \leq 2n$
Monotonicity lower misrate $\rightarrow$ smaller margin $\rightarrow$ wider bounds

Example

Each call returns one of two adjacent grid points (randomized):

SignMargin(10, 0.05) returns 2 or 4
SignMargin(15, 0.01) returns 4 or 6
SignMargin(30, 1e-4) returns 8 or 10

This is a supporting function that $\operatorname{SpreadBounds}$ uses internally, so most users do not need to call it directly. It calculates how many extreme disjoint-pair absolute differences should be excluded when constructing bounds, based on the number of pairs and the desired error rate. Because the $\text{Binomial}(n, \frac{1}{2})$ CDF is discrete, exact matching of an arbitrary misrate requires randomizing the cutoff between two adjacent integer values. A lower misrate (higher confidence) results in a smaller margin, which produces wider bounds.

Algorithm

The $\operatorname{SignMargin}$ function determines the exclusion count for disjoint-pair sign-test bounds by inverting the $\text{Binomial}(n, \frac{1}{2})$ CDF.

Given $n$ pairs and a desired $\mathrm{misrate}$ , the algorithm finds the number of extreme order statistics to exclude so that the resulting bounds contain the true parameter with probability $1 - \mathrm{misrate}$ .

Binomial CDF computation

Under the symmetry assumption, the number of positive signs among $n$ disjoint-pair differences follows $\text{Binomial}(n, \frac{1}{2})$ . The CDF is computed exactly:

\Pr(W \leq k) = \sum_{i=0}^k \binom{n}{i} 2^{-n}

The algorithm evaluates this sum incrementally, accumulating probabilities until the cumulative probability reaches $\frac{\mathrm{misrate}}{2}$ .

Grid point identification

Because the Binomial CDF is a step function, the exact $\mathrm{misrate}$ typically falls between two adjacent grid points $k$ and $k + 1$ . The algorithm identifies these adjacent values: $k_{\text{lo}}$ is the largest integer where $\Pr(W \leq k_{\text{lo}}) < \frac{\mathrm{misrate}}{2}$ and $k_{\text{hi}} = k_{\text{lo}} + 1$ .

Randomized cutoff

To match the requested $\mathrm{misrate}$ exactly rather than conservatively, the algorithm interpolates between the two grid points. It computes a probability $p$ such that using margin $2 k_{\text{lo}}$ with probability $p$ and margin $2 k_{\text{hi}}$ with probability $1 - p$ yields an expected coverage of exactly $1 - \mathrm{misrate}$ . A uniform random draw determines which margin to return.

This randomization ensures that the bounds are calibrated exactly at the requested error rate under weak continuity, rather than being conservative due to the discreteness of the Binomial distribution.

using Pragmastat.Exceptions;

namespace Pragmastat.Functions;

/// <summary>
/// SignMargin function for one-sample bounds based on Binomial(n, 0.5).
/// </summary>
internal class SignMargin
{
  public static readonly SignMargin Instance = new();

  public int CalcRandomized(int n, double misrate, Pragmastat.Randomization.Rng rng)
  {
    if (n <= 0)
      throw AssumptionException.Domain(Subject.X);
    if (double.IsNaN(misrate) || misrate < 0 || misrate > 1)
      throw AssumptionException.Domain(Subject.Misrate);

    double minMisrate = MinAchievableMisrate.OneSample(n);
    if (misrate < minMisrate)
      throw AssumptionException.Domain(Subject.Misrate);

    double target = misrate / 2;
    if (target <= 0)
      return 0;
    if (target >= 1)
      return checked(n * 2);

    var split = CalcSplit(n, target);
    int rLow = split.RLow;
    double logCdf = split.LogCdf;
    double logPmfHigh = split.LogPmfHigh;

    double logTarget = Math.Log(target);
    double logNum = logTarget > logCdf ? LogSubExp(logTarget, logCdf) : double.NegativeInfinity;
    double p = (IsFinite(logPmfHigh) && IsFinite(logNum))
      ? Math.Exp(logNum - logPmfHigh)
      : 0.0;
    if (p < 0)
      p = 0;
    else if (p > 1)
      p = 1;

    double u = rng.UniformDouble();
    int r = u < p ? rLow + 1 : rLow;
    return checked(r * 2);
  }

  private readonly struct SplitResult
  {
    public readonly int RLow;
    public readonly double LogCdf;
    public readonly double LogPmfHigh;

    public SplitResult(int rLow, double logCdfLow, double logPmfHigh)
    {
      RLow = rLow;
      LogCdf = logCdfLow;
      LogPmfHigh = logPmfHigh;
    }
  }

  private static SplitResult CalcSplit(int n, double target)
  {
    double logTarget = Math.Log(target);

    double logPmf = -n * Math.Log(2);
    double logCdf = logPmf;

    int rLow = 0;

    if (logCdf > logTarget)
      return new SplitResult(0, logCdf, logPmf);

    for (int k = 1; k <= n; k++)
    {
      double logPmfNext = logPmf + Math.Log(n - k + 1) - Math.Log(k);
      double logCdfNext = LogAddExp(logCdf, logPmfNext);

      if (logCdfNext > logTarget)
        return new SplitResult(rLow, logCdf, logPmfNext);

      rLow = k;
      logPmf = logPmfNext;
      logCdf = logCdfNext;
    }

    return new SplitResult(rLow, logCdf, double.NegativeInfinity);
  }

  private static double LogAddExp(double a, double b)
  {
    if (double.IsNegativeInfinity(a))
      return b;
    if (double.IsNegativeInfinity(b))
      return a;
    double m = Math.Max(a, b);
    return m + Math.Log(Math.Exp(a - m) + Math.Exp(b - m));
  }

  private static double LogSubExp(double a, double b)
  {
    if (double.IsNegativeInfinity(b))
      return a;
    double diff = Math.Exp(b - a);
    if (diff >= 1.0)
      return double.NegativeInfinity;
    return a + Math.Log(1.0 - diff);
  }

  private static bool IsFinite(double value)
  {
    return !double.IsNaN(value) && !double.IsInfinity(value);
  }
}

Tests

\operatorname{SignMargin}(n, \mathrm{misrate})

No reference test data exists for $\operatorname{SignMargin}$ yet. The tests/sign-margin/ directory has not been created. Test cases will be added when the $\operatorname{SignMargin}$ test generator is implemented.