SignMargin
Randomized exclusion count for disjoint-pair sign-test bounds.
- Purpose determines extreme pairwise absolute differences to exclude when constructing bounds
- Based on CDF inversion with randomized cutoff between adjacent grid points
- Returns total margin split evenly between lower and upper tails
- Used by to select appropriate order statistics of disjoint-pair differences
- Complexity exact for all
- Domain ,
- Unit count
- Note randomized to match the requested misrate exactly (deterministic rounding is conservative)
Properties
- Bounds
- Monotonicity lower misrate smaller margin wider bounds
Example
Each call returns one of two adjacent grid points (randomized):
SignMargin(10, 0.05)returns 2 or 4SignMargin(15, 0.01)returns 4 or 6SignMargin(30, 1e-4)returns 8 or 10
This is a supporting function that uses internally, so most users do not need to call it directly. It calculates how many extreme disjoint-pair absolute differences should be excluded when constructing bounds, based on the number of pairs and the desired error rate. Because the CDF is discrete, exact matching of an arbitrary misrate requires randomizing the cutoff between two adjacent integer values. A lower misrate (higher confidence) results in a smaller margin, which produces wider bounds.
Algorithm
The function determines the exclusion count for disjoint-pair sign-test bounds by inverting the CDF.
Given pairs and a desired , the algorithm finds the number of extreme order statistics to exclude so that the resulting bounds contain the true parameter with probability .
Binomial CDF computation
Under the symmetry assumption, the number of positive signs among disjoint-pair differences follows . The CDF is computed exactly:
The algorithm evaluates this sum incrementally, accumulating probabilities until the cumulative probability reaches .
Grid point identification
Because the Binomial CDF is a step function, the exact typically falls between two adjacent grid points and . The algorithm identifies these adjacent values: is the largest integer where and .
Randomized cutoff
To match the requested exactly rather than conservatively, the algorithm interpolates between the two grid points. It computes a probability such that using margin with probability and margin with probability yields an expected coverage of exactly . A uniform random draw determines which margin to return.
This randomization ensures that the bounds are calibrated exactly at the requested error rate under weak continuity, rather than being conservative due to the discreteness of the Binomial distribution.
using Pragmastat.Exceptions;
namespace Pragmastat.Functions;
/// <summary>
/// SignMargin function for one-sample bounds based on Binomial(n, 0.5).
/// </summary>
internal class SignMargin
{
public static readonly SignMargin Instance = new();
public int CalcRandomized(int n, double misrate, Pragmastat.Randomization.Rng rng)
{
if (n <= 0)
throw AssumptionException.Domain(Subject.X);
if (double.IsNaN(misrate) || misrate < 0 || misrate > 1)
throw AssumptionException.Domain(Subject.Misrate);
double minMisrate = MinAchievableMisrate.OneSample(n);
if (misrate < minMisrate)
throw AssumptionException.Domain(Subject.Misrate);
double target = misrate / 2;
if (target <= 0)
return 0;
if (target >= 1)
return checked(n * 2);
var split = CalcSplit(n, target);
int rLow = split.RLow;
double logCdf = split.LogCdf;
double logPmfHigh = split.LogPmfHigh;
double logTarget = Math.Log(target);
double logNum = logTarget > logCdf ? LogSubExp(logTarget, logCdf) : double.NegativeInfinity;
double p = (IsFinite(logPmfHigh) && IsFinite(logNum))
? Math.Exp(logNum - logPmfHigh)
: 0.0;
if (p < 0)
p = 0;
else if (p > 1)
p = 1;
double u = rng.UniformDouble();
int r = u < p ? rLow + 1 : rLow;
return checked(r * 2);
}
private readonly struct SplitResult
{
public readonly int RLow;
public readonly double LogCdf;
public readonly double LogPmfHigh;
public SplitResult(int rLow, double logCdfLow, double logPmfHigh)
{
RLow = rLow;
LogCdf = logCdfLow;
LogPmfHigh = logPmfHigh;
}
}
private static SplitResult CalcSplit(int n, double target)
{
double logTarget = Math.Log(target);
double logPmf = -n * Math.Log(2);
double logCdf = logPmf;
int rLow = 0;
if (logCdf > logTarget)
return new SplitResult(0, logCdf, logPmf);
for (int k = 1; k <= n; k++)
{
double logPmfNext = logPmf + Math.Log(n - k + 1) - Math.Log(k);
double logCdfNext = LogAddExp(logCdf, logPmfNext);
if (logCdfNext > logTarget)
return new SplitResult(rLow, logCdf, logPmfNext);
rLow = k;
logPmf = logPmfNext;
logCdf = logCdfNext;
}
return new SplitResult(rLow, logCdf, double.NegativeInfinity);
}
private static double LogAddExp(double a, double b)
{
if (double.IsNegativeInfinity(a))
return b;
if (double.IsNegativeInfinity(b))
return a;
double m = Math.Max(a, b);
return m + Math.Log(Math.Exp(a - m) + Math.Exp(b - m));
}
private static double LogSubExp(double a, double b)
{
if (double.IsNegativeInfinity(b))
return a;
double diff = Math.Exp(b - a);
if (diff >= 1.0)
return double.NegativeInfinity;
return a + Math.Log(1.0 - diff);
}
private static bool IsFinite(double value)
{
return !double.IsNaN(value) && !double.IsInfinity(value);
}
}
Tests
No reference test data exists for yet. The tests/sign-margin/ directory has not been created. Test cases will be added when the test generator is implemented.