Compare1

$$\operatorname{Compare1}(\mathbf{x}, [T_1, \ldots, T_k]) = [P_1, \ldots, P_k]$$

where $T_i = (M_i, t_i, \mathrm{misrate}_i)$ is a threshold with metric $M_i$ ($\operatorname{Center}$ or $\operatorname{Spread}$), and $P_i = (e_i, [L_i, U_i], v_i)$ is the projection with estimate $e_i$, bounds $[L_i, U_i]$, and verdict $v_i = \mathrm{Greater}$ if $L_i > t_i$; $\mathrm{Less}$ if $U_i < t_i$; $\mathrm{Inconclusive}$ otherwise.
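These shapes can be sketched as plain data types (an illustrative sketch mirroring the notation above, not any binding's actual API):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

class Metric(Enum):
    CENTER = "Center"
    SPREAD = "Spread"

class Verdict(Enum):
    GREATER = "Greater"
    LESS = "Less"
    INCONCLUSIVE = "Inconclusive"

@dataclass(frozen=True)
class Threshold:
    metric: Metric   # M_i: Center or Spread
    value: float     # t_i: the practical threshold value
    misrate: float   # per-threshold error rate

@dataclass(frozen=True)
class Projection:
    estimate: float              # e_i
    bounds: Tuple[float, float]  # [L_i, U_i]
    verdict: Verdict             # Greater / Less / Inconclusive
```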

One-sample confirmatory analysis: compares estimates against practical thresholds.

Input

  • $\mathbf{x} = (x_1, \ldots, x_n)$ — sample of measurements
  • $T_i = (M_i, t_i, \mathrm{misrate}_i)$ — list of $k$ thresholds: $M_i$ is $\operatorname{Center}$ or $\operatorname{Spread}$, $t_i$ is the threshold value, $\mathrm{misrate}_i$ is the per-threshold error rate
  • $\text{seed}$ — optional string for reproducible randomization (passed to $\operatorname{SpreadBounds}$)

Output

  • Value — list of $k$ projections in input order, each $P_i = (e_i, [L_i, U_i], v_i)$
  • Unit — per-projection: same unit as the underlying estimator ($\operatorname{Center}$ or $\operatorname{Spread}$)

Notes

  • Independence — each threshold evaluated independently; no family-wise guarantee
  • Unit compatibility — threshold unit must be compatible with sample unit
  • See also — $\operatorname{Compare2}$ for two-sample metrics ($\operatorname{Shift}$, $\operatorname{Ratio}$, $\operatorname{Disparity}$)

Properties

  • Order preservation — $P_i$ corresponds to input $T_i$
  • Metric deduplication — each distinct metric computed once regardless of threshold count

Example

  • Compare1([1..10], [(Center, 20, 1e-3)]) → [Projection(5.5, [...], Less)]
  • Compare1([1..10], [(Spread, 0.1, 1e-3)]) → [Projection(3, [...], Greater)]

$\operatorname{Compare1}$ automates the pattern of computing an estimate, constructing bounds, and comparing the bounds against a practical threshold. Instead of asking whether $\operatorname{Center}$ is significantly different from zero, it answers whether $\operatorname{Center}$ is reliably above or below a practical threshold. Each threshold produces a ternary verdict that respects both statistical uncertainty and practical relevance. When multiple thresholds are needed (different metrics or different misrates), pass them all in one call to avoid redundant computation.
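The manual pattern being automated can be sketched in a few lines of Python (a hedged illustration: `center` is a simplified Hodges–Lehmann-style stand-in for the library's Center estimator, and `bounds_fn` stands in for CenterBounds):

```python
import statistics

def center(x):
    # Median of pairwise averages (Hodges-Lehmann style) -- a simplified
    # stand-in for the library's Center estimator.
    n = len(x)
    return statistics.median((x[i] + x[j]) / 2
                             for i in range(n) for j in range(i, n))

def verdict(lower, upper, t):
    # Ternary verdict against a practical threshold t.
    if lower > t:
        return "Greater"
    if upper < t:
        return "Less"
    return "Inconclusive"

def compare1_center(x, t, bounds_fn):
    # The pattern Compare1 automates for one Center threshold:
    # 1) compute the estimate, 2) build bounds, 3) compare bounds against t.
    estimate = center(x)
    lower, upper = bounds_fn(x)  # stands in for CenterBounds(x, misrate)
    return estimate, (lower, upper), verdict(lower, upper, t)
```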

Algorithm

$\operatorname{Compare1}$ orchestrates estimation and comparison in two phases: pre-pass validation and the statistical phase.

Pre-pass validation

Before any statistical computation:

  • Reject weighted samples (unsupported).
  • Reject null or empty threshold list.
  • Reject threshold items containing null.
  • Reject thresholds with metrics not in $\{\operatorname{Center}, \operatorname{Spread}\}$ (wrong arity).
  • Reject thresholds with non-finite values.

These checks happen before bounds computation, so no statistical work is done on invalid inputs.
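The fail-fast guards can be sketched as follows (exception types and the tuple-based threshold representation are illustrative, not the library's actual API):

```python
import math

ONE_SAMPLE_METRICS = {"Center", "Spread"}

def prevalidate_compare1(weighted, thresholds):
    # Fail-fast guards, run before any statistical work is attempted.
    if weighted:
        raise ValueError("weighted samples are unsupported")
    if thresholds is None or len(thresholds) == 0:
        raise ValueError("thresholds must be a non-empty list")
    for item in thresholds:
        if item is None:
            raise ValueError("thresholds must not contain null items")
        metric, value, misrate = item
        if metric not in ONE_SAMPLE_METRICS:
            raise ValueError(f"metric {metric} is not supported by Compare1 (wrong arity)")
        if not math.isfinite(value):
            raise ValueError("threshold value must be finite")
```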

Validate-and-normalize pass

For each threshold, in input order:

  • Center: check unit compatibility with $\mathbf{x}$; convert threshold value to $\mathbf{x}$'s unit.
  • Spread: same as Center.

Bindings that support plain numeric shorthand (Python and R) interpret it directly on the comparison scale; explicit measurement thresholds are normalized as above.
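A sketch of the normalization step (the conversion table and helper are illustrative; the library's unit system is richer than this):

```python
# Conversion factors to a common base unit (illustrative table only).
FACTORS = {"us": 0.001, "ms": 1.0, "s": 1000.0}

def normalize_threshold(value, unit, sample_unit):
    # Convert the threshold value into the sample's unit, as the
    # validate-and-normalize pass does before any comparison.
    if unit not in FACTORS or sample_unit not in FACTORS:
        raise ValueError("incompatible units")
    factor = FACTORS[unit] / FACTORS[sample_unit]
    return value * factor, sample_unit
```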

Statistical phase (canonical metric order: Center → Spread)

For each present metric (in canonical order), compute the estimate once and bounds for each threshold entry of that metric:

$$\text{estimate} = \begin{cases} \operatorname{Center}(\mathbf{x}) & \text{if metric} = \operatorname{Center}, \\ \operatorname{Spread}(\mathbf{x}) & \text{if metric} = \operatorname{Spread} \end{cases}$$

$$\text{bounds} = \begin{cases} \operatorname{CenterBounds}(\mathbf{x}, \mathrm{misrate}_i) & \text{if metric} = \operatorname{Center}, \\ \operatorname{SpreadBounds}(\mathbf{x}, \mathrm{misrate}_i, \text{seed}) & \text{if metric} = \operatorname{Spread} \text{ and } \text{seed} \neq \text{null}, \\ \operatorname{SpreadBounds}(\mathbf{x}, \mathrm{misrate}_i) & \text{if metric} = \operatorname{Spread} \text{ and } \text{seed} = \text{null} \end{cases}$$

Verdict computation

$$\text{verdict}_i = \begin{cases} \text{Greater} & \text{if } L_i > t_i, \\ \text{Less} & \text{if } U_i < t_i, \\ \text{Inconclusive} & \text{otherwise} \end{cases}$$

where $[L_i, U_i]$ are the bounds for threshold $i$ and $t_i$ is the normalized threshold value.

Result ordering

Results are stored in input order regardless of canonical processing order. Input [Spread, Center] produces output [spread_projection, center_projection].
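The grouping and ordering logic can be sketched language-agnostically: group thresholds by metric, process metrics in canonical order (computing each estimate once), and write each projection back into its original input slot. A hedged Python sketch, with per-metric callables standing in for the library's estimators:

```python
CANONICAL_ORDER = ["Center", "Spread"]

def execute(thresholds, estimate_fns, bounds_fns):
    # thresholds: list of (metric, t, misrate) tuples in input order.
    # estimate_fns / bounds_fns: per-metric callables (stand-ins for the
    # library's estimators). Results land in input order.
    results = [None] * len(thresholds)
    by_metric = {}
    for i, (metric, t, misrate) in enumerate(thresholds):
        by_metric.setdefault(metric, []).append((i, t, misrate))
    for metric in CANONICAL_ORDER:
        if metric not in by_metric:
            continue
        estimate = estimate_fns[metric]()  # computed once per metric
        for i, t, misrate in by_metric[metric]:
            lower, upper = bounds_fns[metric](misrate)
            if lower > t:
                v = "Greater"
            elif upper < t:
                v = "Less"
            else:
                v = "Inconclusive"
            results[i] = (estimate, (lower, upper), v)  # input-order slot
    return results
```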

using Pragmastat.Estimators;
using Pragmastat.Exceptions;
using Pragmastat.Metrology;

namespace Pragmastat.Internal;

internal static class CompareEngine
{
  private readonly struct MetricSpec
  {
    public Metric Metric { get; }
    public Func<Threshold, Sample, Sample?, Measurement> ValidateAndNormalize { get; }
    public Func<Sample, Sample?, Measurement> Estimate { get; }
    public Func<Sample, Sample?, Probability, Bounds> Bounds { get; }
    public Func<Sample, Sample?, Probability, string, Bounds>? SeededBounds { get; }

    public MetricSpec(
      Metric metric,
      Func<Threshold, Sample, Sample?, Measurement> validateAndNormalize,
      Func<Sample, Sample?, Measurement> estimate,
      Func<Sample, Sample?, Probability, Bounds> bounds,
      Func<Sample, Sample?, Probability, string, Bounds>? seededBounds = null)
    {
      Metric = metric;
      ValidateAndNormalize = validateAndNormalize;
      Estimate = estimate;
      Bounds = bounds;
      SeededBounds = seededBounds;
    }
  }

  private static readonly MetricSpec[] Compare1Specs =
  [
    new MetricSpec(
      Metric.Center,
      ValidateCenter,
      (x, _) => CenterEstimator.Instance.Estimate(x),
      (x, _, alpha) => CenterBoundsEstimator.Instance.Estimate(x, alpha)),
    new MetricSpec(
      Metric.Spread,
      ValidateSpread,
      (x, _) => SpreadEstimator.Instance.Estimate(x),
      (x, _, alpha) => SpreadBoundsEstimator.Instance.Estimate(x, alpha),
      (x, _, alpha, seed) => SpreadBoundsEstimator.Instance.Estimate(x, alpha, seed)),
  ];

  private static readonly MetricSpec[] Compare2Specs =
  [
    new MetricSpec(
      Metric.Shift,
      ValidateShift,
      (x, y) => ShiftEstimator.Instance.Estimate(x, y!),
      (x, y, alpha) => ShiftBoundsEstimator.Instance.Estimate(x, y!, alpha)),
    new MetricSpec(
      Metric.Ratio,
      ValidateRatio,
      (x, y) => RatioEstimator.Instance.Estimate(x, y!),
      (x, y, alpha) => RatioBoundsEstimator.Instance.Estimate(x, y!, alpha)),
    new MetricSpec(
      Metric.Disparity,
      ValidateDisparity,
      (x, y) => DisparityEstimator.Instance.Estimate(x, y!),
      (x, y, alpha) => DisparityBoundsEstimator.Instance.Estimate(x, y!, alpha),
      (x, y, alpha, seed) => DisparityBoundsEstimator.Instance.Estimate(x, y!, alpha, seed)),
  ];

  private static Measurement ValidateCenter(Threshold threshold, Sample x, Sample? _)
  {
    if (!threshold.Value.Unit.IsCompatible(x.Unit))
      throw new UnitMismatchException(threshold.Value.Unit, x.Unit);
    if (!threshold.Value.NominalValue.IsFinite())
      throw new ArgumentOutOfRangeException(nameof(threshold), "threshold.Value must be finite");
    double factor = MeasurementUnit.ConversionFactor(threshold.Value.Unit, x.Unit);
    return new Measurement(threshold.Value.NominalValue * factor, x.Unit);
  }

  private static Measurement ValidateSpread(Threshold threshold, Sample x, Sample? _) =>
    ValidateCenter(threshold, x, null);

  private static Measurement ValidateShift(Threshold threshold, Sample x, Sample? y)
  {
    if (!threshold.Value.Unit.IsCompatible(x.Unit))
      throw new UnitMismatchException(threshold.Value.Unit, x.Unit);
    if (!threshold.Value.NominalValue.IsFinite())
      throw new ArgumentOutOfRangeException(nameof(threshold), "threshold.Value must be finite");
    var finerUnit = MeasurementUnit.Finer(x.Unit, y!.Unit);
    double factor = MeasurementUnit.ConversionFactor(threshold.Value.Unit, finerUnit);
    return new Measurement(threshold.Value.NominalValue * factor, finerUnit);
  }

  private static Measurement ValidateRatio(Threshold threshold, Sample _, Sample? __)
  {
    var unit = threshold.Value.Unit;
    if (unit != MeasurementUnit.Ratio && unit != MeasurementUnit.Number)
      throw new UnitMismatchException(unit, MeasurementUnit.Ratio);
    double value = threshold.Value.NominalValue;
    if (value <= 0 || !value.IsFinite())
      throw new ArgumentOutOfRangeException(nameof(threshold), "Ratio threshold.Value must be finite and positive");
    return new Measurement(value, MeasurementUnit.Ratio);
  }

  private static Measurement ValidateDisparity(Threshold threshold, Sample _, Sample? __)
  {
    var unit = threshold.Value.Unit;
    if (unit != MeasurementUnit.Disparity && unit != MeasurementUnit.Number)
      throw new UnitMismatchException(unit, MeasurementUnit.Disparity);
    double value = threshold.Value.NominalValue;
    if (!value.IsFinite())
      throw new ArgumentOutOfRangeException(nameof(threshold), "Disparity threshold.Value must be finite");
    return new Measurement(value, MeasurementUnit.Disparity);
  }

  public static IReadOnlyList<Projection> Compare1(Sample x, IReadOnlyList<Threshold> thresholds, string? seed)
  {
    Assertion.NonWeighted("x", x);
    Assertion.NotNullOrEmpty("thresholds", thresholds);
    Assertion.ItemNotNull("thresholds", thresholds);

    foreach (var threshold in thresholds)
    {
      if (threshold.Metric is Metric.Shift or Metric.Ratio or Metric.Disparity)
        throw new ArgumentException(
          $"Metric {threshold.Metric} is not supported by Compare1. Use Compare2 instead.",
          nameof(thresholds));
    }

    foreach (var threshold in thresholds)
    {
      if (!threshold.Value.NominalValue.IsFinite())
        throw new ArgumentOutOfRangeException(nameof(thresholds), "threshold.Value must be finite");
    }

    var normalizedValues = new Measurement[thresholds.Count];
    for (int i = 0; i < thresholds.Count; i++)
    {
      var spec = GetSpec(Compare1Specs, thresholds[i].Metric);
      normalizedValues[i] = spec.ValidateAndNormalize(thresholds[i], x, null);
    }

    return Execute(Compare1Specs, x, null, thresholds, normalizedValues, seed);
  }

  public static IReadOnlyList<Projection> Compare2(
    Sample x, Sample y, IReadOnlyList<Threshold> thresholds, string? seed)
  {
    Assertion.NonWeighted("x", x);
    Assertion.NonWeighted("y", y);
    Assertion.CompatibleUnits(x, y);
    Assertion.NotNullOrEmpty("thresholds", thresholds);
    Assertion.ItemNotNull("thresholds", thresholds);

    foreach (var threshold in thresholds)
    {
      if (threshold.Metric is Metric.Center or Metric.Spread)
        throw new ArgumentException(
          $"Metric {threshold.Metric} is not supported by Compare2. Use Compare1 instead.",
          nameof(thresholds));
    }

    foreach (var threshold in thresholds)
    {
      if (!threshold.Value.NominalValue.IsFinite())
        throw new ArgumentOutOfRangeException(nameof(thresholds), "threshold.Value must be finite");
    }

    var normalizedValues = new Measurement[thresholds.Count];
    for (int i = 0; i < thresholds.Count; i++)
    {
      var spec = GetSpec(Compare2Specs, thresholds[i].Metric);
      normalizedValues[i] = spec.ValidateAndNormalize(thresholds[i], x, y);
    }

    return Execute(Compare2Specs, x, y, thresholds, normalizedValues, seed);
  }

  private static MetricSpec GetSpec(MetricSpec[] specs, Metric metric)
  {
    foreach (var spec in specs)
      if (spec.Metric == metric) return spec;
    throw new ArgumentException($"No spec found for metric {metric}");
  }

  private static IReadOnlyList<Projection> Execute(
    MetricSpec[] canonicalSpecs,
    Sample x,
    Sample? y,
    IReadOnlyList<Threshold> thresholds,
    Measurement[] normalizedValues,
    string? seed)
  {
    var results = new Projection[thresholds.Count];

    var byMetric = thresholds
      .Select((t, i) => (t, i, normalizedValues[i]))
      .GroupBy(item => item.t.Metric)
      .ToDictionary(g => g.Key, g => g.ToList());

    foreach (var spec in canonicalSpecs)
    {
      if (!byMetric.TryGetValue(spec.Metric, out var entries)) continue;
      var estimate = spec.Estimate(x, y);
      foreach (var (threshold, inputIndex, normalizedValue) in entries)
      {
        var bounds = (seed != null && spec.SeededBounds != null)
          ? spec.SeededBounds(x, y, threshold.Misrate, seed)
          : spec.Bounds(x, y, threshold.Misrate);
        var verdict = ComputeVerdict(bounds, normalizedValue);
        results[inputIndex] = new Projection(threshold, estimate, bounds, verdict);
      }
    }

    return results;
  }

  private static ComparisonVerdict ComputeVerdict(Bounds bounds, Measurement normalizedThreshold)
  {
    double t = normalizedThreshold.NominalValue;
    if (bounds.Lower > t) return ComparisonVerdict.Greater;
    if (bounds.Upper < t) return ComparisonVerdict.Less;
    return ComparisonVerdict.Inconclusive;
  }
}

Notes

Verdict Boundary Condition

When $L = t$ (bounds lower equals threshold), the verdict is $\mathrm{Inconclusive}$, not $\mathrm{Greater}$. When $U = t$ (bounds upper equals threshold), the verdict is $\mathrm{Inconclusive}$, not $\mathrm{Less}$. The verdict is $\mathrm{Greater}$ only when $L > t$ (strictly), and $\mathrm{Less}$ only when $U < t$ (strictly).

This conservative choice reflects the discrete nature of confidence bounds: the true value could plausibly equal the boundary.

From Hypothesis Testing to Practical Thresholds

Compare1 embodies the Inversion Principle: instead of asking "Can I reject the hypothesis that Center equals zero?", Compare1 answers "Is Center reliably greater than my practical threshold?"

Traditional hypothesis testing against zero may declare a $0.01\%$ difference statistically significant with large enough sample sizes, even when the difference is practically irrelevant. Compare1 forces explicit specification of practical thresholds and returns a ternary verdict ($\mathrm{Less}$, $\mathrm{Greater}$, $\mathrm{Inconclusive}$) that respects both statistical uncertainty and practical relevance.

Tests

The $\operatorname{Compare1}$ test suite contains 18 test cases (5 demo + 3 multi-threshold + 1 order + 3 misrate + 3 natural + 3 error). All tests use seed "compare1-tests" for reproducibility. Each test case output is a JSON object with a projections array; each projection has estimate, lower, upper, and verdict fields.
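An illustrative fixture output with that shape (field names follow the description above; the numbers are made up, not actual fixture values):

```python
import json

# One hypothetical test-case output: a "projections" array whose entries
# carry estimate, lower, upper, and verdict fields.
case_output = {
    "projections": [
        {"estimate": 5.5, "lower": 3.1, "upper": 7.9, "verdict": "Inconclusive"},
        {"estimate": 3.0, "lower": 1.2, "upper": 5.4, "verdict": "Greater"},
    ]
}
serialized = json.dumps(case_output)
```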

Demo examples ($n = 10$, $\mathbf{x} = (1, \ldots, 10)$) — single threshold, clear verdicts:

  • demo-center-less: center threshold above the upper bound → $\mathrm{Less}$
  • demo-center-greater: center threshold below the lower bound → $\mathrm{Greater}$
  • demo-center-inconclusive: center threshold inside the bounds → $\mathrm{Inconclusive}$
  • demo-spread-less: spread threshold above the upper bound → $\mathrm{Less}$
  • demo-spread-greater: spread threshold below the lower bound → $\mathrm{Greater}$

Multi-threshold ($n = 10$) — multiple thresholds per call:

  • multi-center-spread: one center threshold and one spread threshold → $[\text{Less}, \text{Greater}]$
  • multi-two-centers: two center thresholds in one call → $[\text{Less}, \text{Greater}]$
  • multi-mixed: mixed center/spread thresholds → $[\text{Greater}, \text{Less}, \text{Less}]$

Input order preservation — verifies output order matches input order, not canonical order:

  • order-spread-center: spread threshold listed before center threshold → output[0] = spread projection, output[1] = center projection

Misrate variation ($n = 20$, $\mathbf{x} = (1, \ldots, 20)$, $\operatorname{Center}$ threshold at $10$):

Three tests span progressively stricter fixture misrates; they validate that smaller misrates produce wider bounds.
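The monotonicity itself can be illustrated with a normal-approximation interval (a generic sketch using the standard normal quantile; Pragmastat's actual bounds are nonparametric, but the direction of the effect is the same):

```python
from statistics import NormalDist

def normal_halfwidth(misrate, stderr=1.0):
    # Two-sided normal-approximation interval halfwidth: the quantile
    # z_{1 - misrate/2} grows as misrate shrinks, so bounds widen.
    z = NormalDist().inv_cdf(1 - misrate / 2)
    return z * stderr
```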

Natural sequences:

  • natural-10: $n = 10$, $\operatorname{Center}$ threshold at $5.5$
  • natural-15: $n = 15$, $\operatorname{Center}$ threshold at $8$
  • natural-20: $n = 20$, $\operatorname{Center}$ threshold at $10.5$

Error cases — inputs that violate assumptions:

  • error-empty-x: $\mathbf{x} = ()$ → $\text{validity}(x)$
  • error-single-x-center: $|\mathbf{x}| = 1$, $\operatorname{Center}$ threshold → $\text{domain}(x)$ (requires $n \geq 2$)
  • error-constant-spread: $\mathbf{x} = (5, 5, 5, 5, 5, 5)$, $\operatorname{Spread}$ threshold → $\text{sparity}(x)$