Compare1

$$\operatorname{Compare1}(\mathbf{x}, [T_1, \ldots, T_k]) = [P_1, \ldots, P_k]$$

where $T_i = (M_i, t_i, \mathrm{misrate}_i)$ is a threshold with metric $M_i$ ($\operatorname{Center}$ or $\operatorname{Spread}$), and $P_i = (e_i, [L_i, U_i], v_i)$ is the projection with estimate $e_i$, bounds $[L_i, U_i]$, and verdict $v_i = \mathrm{Greater}$ if $L_i > t_i$; $\mathrm{Less}$ if $U_i < t_i$; $\mathrm{Inconclusive}$ otherwise.
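These shapes can be sketched as plain data types (an illustrative sketch mirroring the notation above, not any binding's actual API):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

class Metric(Enum):
    CENTER = "Center"
    SPREAD = "Spread"

class Verdict(Enum):
    GREATER = "Greater"
    LESS = "Less"
    INCONCLUSIVE = "Inconclusive"

@dataclass(frozen=True)
class Threshold:
    metric: Metric   # M_i: Center or Spread
    value: float     # t_i: the practical threshold value
    misrate: float   # per-threshold error rate

@dataclass(frozen=True)
class Projection:
    estimate: float              # e_i
    bounds: Tuple[float, float]  # [L_i, U_i]
    verdict: Verdict             # Greater / Less / Inconclusive
```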

One-sample confirmatory analysis: compares estimates against practical thresholds.

Input

  • $\mathbf{x} = (x_1, \ldots, x_n)$ — sample of measurements
  • $T_i = (M_i, t_i, \mathrm{misrate}_i)$ — list of $k$ thresholds: $M_i$ is $\operatorname{Center}$ or $\operatorname{Spread}$, $t_i$ is the threshold value, $\mathrm{misrate}_i$ is the per-threshold error rate
  • $\text{seed}$ — optional string for reproducible randomization (passed to $\operatorname{SpreadBounds}$)

Output

  • Value — list of $k$ projections in input order, each $P_i = (e_i, [L_i, U_i], v_i)$
  • Unit — per-projection: same unit as the underlying estimator ($\operatorname{Center}$ or $\operatorname{Spread}$)

Notes

  • Independence — each threshold evaluated independently; no family-wise guarantee
  • Unit compatibility — threshold unit must be compatible with sample unit
  • See also — $\operatorname{Compare2}$ for two-sample metrics ($\operatorname{Shift}$, $\operatorname{Ratio}$, $\operatorname{Disparity}$)

Properties

  • Order preservation — $P_i$ corresponds to input $T_i$
  • Metric deduplication — each distinct metric computed once regardless of threshold count

Example

  • Compare1([1..10], [(Center, 20, 1e-3)]) → [Projection(5.5, [...], Less)]
  • Compare1([1..10], [(Spread, 0.1, 1e-3)]) → [Projection(3, [...], Greater)]

$\operatorname{Compare1}$ automates the pattern of computing an estimate, constructing bounds, and comparing the bounds against a practical threshold. Instead of asking whether $\operatorname{Center}$ is significantly different from zero, it answers whether $\operatorname{Center}$ is reliably above or below a practical threshold. Each threshold produces a ternary verdict that respects both statistical uncertainty and practical relevance. When multiple thresholds are needed (different metrics or different misrates), pass them all in one call to avoid redundant computation.
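The manual pattern being automated can be sketched in a few lines of Python (a hedged illustration: `center` is a simplified Hodges–Lehmann-style stand-in for the library's Center estimator, and `bounds_fn` stands in for CenterBounds):

```python
import statistics

def center(x):
    # Median of pairwise averages (Hodges-Lehmann style) -- a simplified
    # stand-in for the library's Center estimator.
    n = len(x)
    return statistics.median((x[i] + x[j]) / 2
                             for i in range(n) for j in range(i, n))

def verdict(lower, upper, t):
    # Ternary verdict against a practical threshold t.
    if lower > t:
        return "Greater"
    if upper < t:
        return "Less"
    return "Inconclusive"

def compare1_center(x, t, bounds_fn):
    # The pattern Compare1 automates for one Center threshold:
    # 1) compute the estimate, 2) build bounds, 3) compare bounds against t.
    estimate = center(x)
    lower, upper = bounds_fn(x)  # stands in for CenterBounds(x, misrate)
    return estimate, (lower, upper), verdict(lower, upper, t)
```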

Algorithm

$\operatorname{Compare1}$ orchestrates estimation and comparison in two phases: pre-pass validation and the statistical phase.

Pre-pass validation

Before any statistical computation:

  • Reject weighted samples (unsupported).
  • Reject null or empty threshold list.
  • Reject threshold items containing null.
  • Reject thresholds with metrics not in $\{\operatorname{Center}, \operatorname{Spread}\}$ (wrong arity).
  • Reject thresholds with non-finite values.

These checks happen before bounds computation, so no statistical work is done on invalid inputs.
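The fail-fast guards can be sketched as follows (exception types and the tuple-based threshold representation are illustrative, not the library's actual API):

```python
import math

ONE_SAMPLE_METRICS = {"Center", "Spread"}

def prevalidate_compare1(weighted, thresholds):
    # Fail-fast guards, run before any statistical work is attempted.
    if weighted:
        raise ValueError("weighted samples are unsupported")
    if thresholds is None or len(thresholds) == 0:
        raise ValueError("thresholds must be a non-empty list")
    for item in thresholds:
        if item is None:
            raise ValueError("thresholds must not contain null items")
        metric, value, misrate = item
        if metric not in ONE_SAMPLE_METRICS:
            raise ValueError(f"metric {metric} is not supported by Compare1 (wrong arity)")
        if not math.isfinite(value):
            raise ValueError("threshold value must be finite")
```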

Validate-and-normalize pass

For each threshold, in input order:

  • Center: check unit compatibility with $\mathbf{x}$; convert threshold value to $\mathbf{x}$'s unit.
  • Spread: same as Center.

Bindings that support plain numeric shorthand (Python and R) interpret it directly on the comparison scale; explicit measurement thresholds are normalized as above.
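A sketch of the normalization step (the conversion table and helper are illustrative; the library's unit system is richer than this):

```python
# Conversion factors to a common base unit (illustrative table only).
FACTORS = {"us": 0.001, "ms": 1.0, "s": 1000.0}

def normalize_threshold(value, unit, sample_unit):
    # Convert the threshold value into the sample's unit, as the
    # validate-and-normalize pass does before any comparison.
    if unit not in FACTORS or sample_unit not in FACTORS:
        raise ValueError("incompatible units")
    factor = FACTORS[unit] / FACTORS[sample_unit]
    return value * factor, sample_unit
```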

Statistical phase (canonical metric order: Center → Spread)

For each present metric (in canonical order), compute the estimate once and bounds for each threshold entry of that metric:

$$\text{estimate} = \begin{cases} \operatorname{Center}(\mathbf{x}) & \text{if metric} = \operatorname{Center}, \\ \operatorname{Spread}(\mathbf{x}) & \text{if metric} = \operatorname{Spread} \end{cases}$$

$$\text{bounds} = \begin{cases} \operatorname{CenterBounds}(\mathbf{x}, \mathrm{misrate}_i) & \text{if metric} = \operatorname{Center}, \\ \operatorname{SpreadBounds}(\mathbf{x}, \mathrm{misrate}_i, \text{seed}) & \text{if metric} = \operatorname{Spread} \text{ and } \text{seed} \neq \text{null}, \\ \operatorname{SpreadBounds}(\mathbf{x}, \mathrm{misrate}_i) & \text{if metric} = \operatorname{Spread} \text{ and } \text{seed} = \text{null} \end{cases}$$

Verdict computation

$$\text{verdict}_i = \begin{cases} \text{Greater} & \text{if } L_i > t_i, \\ \text{Less} & \text{if } U_i < t_i, \\ \text{Inconclusive} & \text{otherwise} \end{cases}$$

where $[L_i, U_i]$ are the bounds for threshold $i$ and $t_i$ is the normalized threshold value.

Result ordering

Results are stored in input order regardless of canonical processing order. Input [Spread, Center] produces output [spread_projection, center_projection].
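The grouping and ordering logic can be sketched language-agnostically: group thresholds by metric, process metrics in canonical order (computing each estimate once), and write each projection back into its original input slot. A hedged Python sketch, with per-metric callables standing in for the library's estimators:

```python
CANONICAL_ORDER = ["Center", "Spread"]

def execute(thresholds, estimate_fns, bounds_fns):
    # thresholds: list of (metric, t, misrate) tuples in input order.
    # estimate_fns / bounds_fns: per-metric callables (stand-ins for the
    # library's estimators). Results land in input order.
    results = [None] * len(thresholds)
    by_metric = {}
    for i, (metric, t, misrate) in enumerate(thresholds):
        by_metric.setdefault(metric, []).append((i, t, misrate))
    for metric in CANONICAL_ORDER:
        if metric not in by_metric:
            continue
        estimate = estimate_fns[metric]()  # computed once per metric
        for i, t, misrate in by_metric[metric]:
            lower, upper = bounds_fns[metric](misrate)
            if lower > t:
                v = "Greater"
            elif upper < t:
                v = "Less"
            else:
                v = "Inconclusive"
            results[i] = (estimate, (lower, upper), v)  # input-order slot
    return results
```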

using Pragmastat.Estimators;
using Pragmastat.Exceptions;
using Pragmastat.Metrology;

namespace Pragmastat.Internal;

internal static class CompareEngine
{
  private readonly struct MetricSpec
  {
    public Metric Metric { get; }
    public Func<Threshold, Sample, Sample?, Measurement> ValidateAndNormalize { get; }
    public Func<Sample, Sample?, Measurement> Estimate { get; }
    public Func<Sample, Sample?, Probability, Bounds> Bounds { get; }
    public Func<Sample, Sample?, Probability, string, Bounds>? SeededBounds { get; }

    public MetricSpec(
      Metric metric,
      Func<Threshold, Sample, Sample?, Measurement> validateAndNormalize,
      Func<Sample, Sample?, Measurement> estimate,
      Func<Sample, Sample?, Probability, Bounds> bounds,
      Func<Sample, Sample?, Probability, string, Bounds>? seededBounds = null)
    {
      Metric = metric;
      ValidateAndNormalize = validateAndNormalize;
      Estimate = estimate;
      Bounds = bounds;
      SeededBounds = seededBounds;
    }
  }

  private static readonly MetricSpec[] Compare1Specs =
  [
    new MetricSpec(
      Metric.Center,
      ValidateCenter,
      (x, _) => CenterEstimator.Instance.Estimate(x),
      (x, _, alpha) => CenterBoundsEstimator.Instance.Estimate(x, alpha)),
    new MetricSpec(
      Metric.Spread,
      ValidateSpread,
      (x, _) => SpreadEstimator.Instance.Estimate(x),
      (x, _, alpha) => SpreadBoundsEstimator.Instance.Estimate(x, alpha),
      (x, _, alpha, seed) => SpreadBoundsEstimator.Instance.Estimate(x, alpha, seed)),
  ];

  private static readonly MetricSpec[] Compare2Specs =
  [
    new MetricSpec(
      Metric.Shift,
      ValidateShift,
      (x, y) => ShiftEstimator.Instance.Estimate(x, y!),
      (x, y, alpha) => ShiftBoundsEstimator.Instance.Estimate(x, y!, alpha)),
    new MetricSpec(
      Metric.Ratio,
      ValidateRatio,
      (x, y) => RatioEstimator.Instance.Estimate(x, y!),
      (x, y, alpha) => RatioBoundsEstimator.Instance.Estimate(x, y!, alpha)),
    new MetricSpec(
      Metric.Disparity,
      ValidateDisparity,
      (x, y) => DisparityEstimator.Instance.Estimate(x, y!),
      (x, y, alpha) => DisparityBoundsEstimator.Instance.Estimate(x, y!, alpha),
      (x, y, alpha, seed) => DisparityBoundsEstimator.Instance.Estimate(x, y!, alpha, seed)),
  ];

  private static Measurement ValidateCenter(Threshold threshold, Sample x, Sample? _)
  {
    if (!threshold.Value.Unit.IsCompatible(x.Unit))
      throw new UnitMismatchException(threshold.Value.Unit, x.Unit);
    if (!threshold.Value.NominalValue.IsFinite())
      throw new ArgumentOutOfRangeException(nameof(threshold), "threshold.Value must be finite");
    double factor = MeasurementUnit.ConversionFactor(threshold.Value.Unit, x.Unit);
    return new Measurement(threshold.Value.NominalValue * factor, x.Unit);
  }

  private static Measurement ValidateSpread(Threshold threshold, Sample x, Sample? _) =>
    ValidateCenter(threshold, x, null);

  private static Measurement ValidateShift(Threshold threshold, Sample x, Sample? y)
  {
    if (!threshold.Value.Unit.IsCompatible(x.Unit))
      throw new UnitMismatchException(threshold.Value.Unit, x.Unit);
    if (!threshold.Value.NominalValue.IsFinite())
      throw new ArgumentOutOfRangeException(nameof(threshold), "threshold.Value must be finite");
    var finerUnit = MeasurementUnit.Finer(x.Unit, y!.Unit);
    double factor = MeasurementUnit.ConversionFactor(threshold.Value.Unit, finerUnit);
    return new Measurement(threshold.Value.NominalValue * factor, finerUnit);
  }

  private static Measurement ValidateRatio(Threshold threshold, Sample _, Sample? __)
  {
    var unit = threshold.Value.Unit;
    if (unit != MeasurementUnit.Ratio && unit != MeasurementUnit.Number)
      throw new UnitMismatchException(unit, MeasurementUnit.Ratio);
    double value = threshold.Value.NominalValue;
    if (value <= 0 || !value.IsFinite())
      throw new ArgumentOutOfRangeException(nameof(threshold), "Ratio threshold.Value must be finite and positive");
    return new Measurement(value, MeasurementUnit.Ratio);
  }

  private static Measurement ValidateDisparity(Threshold threshold, Sample _, Sample? __)
  {
    var unit = threshold.Value.Unit;
    if (unit != MeasurementUnit.Disparity && unit != MeasurementUnit.Number)
      throw new UnitMismatchException(unit, MeasurementUnit.Disparity);
    double value = threshold.Value.NominalValue;
    if (!value.IsFinite())
      throw new ArgumentOutOfRangeException(nameof(threshold), "Disparity threshold.Value must be finite");
    return new Measurement(value, MeasurementUnit.Disparity);
  }

  public static IReadOnlyList<Projection> Compare1(Sample x, IReadOnlyList<Threshold> thresholds, string? seed)
  {
    Assertion.NonWeighted("x", x);
    Assertion.NotNullOrEmpty("thresholds", thresholds);
    Assertion.ItemNotNull("thresholds", thresholds);

    foreach (var threshold in thresholds)
    {
      if (threshold.Metric is Metric.Shift or Metric.Ratio or Metric.Disparity)
        throw new ArgumentException(
          $"Metric {threshold.Metric} is not supported by Compare1. Use Compare2 instead.",
          nameof(thresholds));
    }

    foreach (var threshold in thresholds)
    {
      if (!threshold.Value.NominalValue.IsFinite())
        throw new ArgumentOutOfRangeException(nameof(thresholds), "threshold.Value must be finite");
    }

    var normalizedValues = new Measurement[thresholds.Count];
    for (int i = 0; i < thresholds.Count; i++)
    {
      var spec = GetSpec(Compare1Specs, thresholds[i].Metric);
      normalizedValues[i] = spec.ValidateAndNormalize(thresholds[i], x, null);
    }

    return Execute(Compare1Specs, x, null, thresholds, normalizedValues, seed);
  }

  public static IReadOnlyList<Projection> Compare2(
    Sample x, Sample y, IReadOnlyList<Threshold> thresholds, string? seed)
  {
    Assertion.NonWeighted("x", x);
    Assertion.NonWeighted("y", y);
    Assertion.CompatibleUnits(x, y);
    Assertion.NotNullOrEmpty("thresholds", thresholds);
    Assertion.ItemNotNull("thresholds", thresholds);

    foreach (var threshold in thresholds)
    {
      if (threshold.Metric is Metric.Center or Metric.Spread)
        throw new ArgumentException(
          $"Metric {threshold.Metric} is not supported by Compare2. Use Compare1 instead.",
          nameof(thresholds));
    }

    foreach (var threshold in thresholds)
    {
      if (!threshold.Value.NominalValue.IsFinite())
        throw new ArgumentOutOfRangeException(nameof(thresholds), "threshold.Value must be finite");
    }

    var normalizedValues = new Measurement[thresholds.Count];
    for (int i = 0; i < thresholds.Count; i++)
    {
      var spec = GetSpec(Compare2Specs, thresholds[i].Metric);
      normalizedValues[i] = spec.ValidateAndNormalize(thresholds[i], x, y);
    }

    return Execute(Compare2Specs, x, y, thresholds, normalizedValues, seed);
  }

  private static MetricSpec GetSpec(MetricSpec[] specs, Metric metric)
  {
    foreach (var spec in specs)
      if (spec.Metric == metric) return spec;
    throw new ArgumentException($"No spec found for metric {metric}");
  }

  private static IReadOnlyList<Projection> Execute(
    MetricSpec[] canonicalSpecs,
    Sample x,
    Sample? y,
    IReadOnlyList<Threshold> thresholds,
    Measurement[] normalizedValues,
    string? seed)
  {
    var results = new Projection[thresholds.Count];

    var byMetric = thresholds
      .Select((t, i) => (t, i, normalizedValues[i]))
      .GroupBy(item => item.t.Metric)
      .ToDictionary(g => g.Key, g => g.ToList());

    foreach (var spec in canonicalSpecs)
    {
      if (!byMetric.TryGetValue(spec.Metric, out var entries)) continue;
      var estimate = spec.Estimate(x, y);
      foreach (var (threshold, inputIndex, normalizedValue) in entries)
      {
        var bounds = (seed != null && spec.SeededBounds != null)
          ? spec.SeededBounds(x, y, threshold.Misrate, seed)
          : spec.Bounds(x, y, threshold.Misrate);
        var verdict = ComputeVerdict(bounds, normalizedValue);
        results[inputIndex] = new Projection(threshold, estimate, bounds, verdict);
      }
    }

    return results;
  }

  private static ComparisonVerdict ComputeVerdict(Bounds bounds, Measurement normalizedThreshold)
  {
    double t = normalizedThreshold.NominalValue;
    if (bounds.Lower > t) return ComparisonVerdict.Greater;
    if (bounds.Upper < t) return ComparisonVerdict.Less;
    return ComparisonVerdict.Inconclusive;
  }
}

Notes

Verdict Boundary Condition

When $L = t$ (bounds lower equals threshold), the verdict is $\mathrm{Inconclusive}$, not $\mathrm{Greater}$. When $U = t$ (bounds upper equals threshold), the verdict is $\mathrm{Inconclusive}$, not $\mathrm{Less}$. The verdict is $\mathrm{Greater}$ only when $L > t$ (strictly), and $\mathrm{Less}$ only when $U < t$ (strictly).

This conservative choice reflects the discrete nature of confidence bounds: the true value could plausibly equal the boundary.

From Hypothesis Testing to Practical Thresholds

Compare1 embodies the Inversion Principle: instead of asking "Can I reject the hypothesis that Center equals zero?", Compare1 answers "Is Center reliably greater than my practical threshold?"

Traditional hypothesis testing against zero may declare a $0.01\%$ difference statistically significant with large enough sample sizes, even when the difference is practically irrelevant. Compare1 forces explicit specification of practical thresholds and returns a ternary verdict ($\mathrm{Less}$, $\mathrm{Greater}$, $\mathrm{Inconclusive}$) that respects both statistical uncertainty and practical relevance.

Tests

The $\operatorname{Compare1}$ test suite contains 18 test cases (5 demo + 3 multi-threshold + 1 order + 3 misrate + 3 natural + 3 error). All tests use seed "compare1-tests" for reproducibility. Each test case output is a JSON object with a projections array; each projection has estimate, lower, upper, and verdict fields.
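An illustrative fixture output with that shape (field names follow the description above; the numbers are made up, not actual fixture values):

```python
import json

# One hypothetical test-case output: a "projections" array whose entries
# carry estimate, lower, upper, and verdict fields.
case_output = {
    "projections": [
        {"estimate": 5.5, "lower": 3.1, "upper": 7.9, "verdict": "Inconclusive"},
        {"estimate": 3.0, "lower": 1.2, "upper": 5.4, "verdict": "Greater"},
    ]
}
serialized = json.dumps(case_output)
```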

Demo examples ($n = 10$, $\mathbf{x} = (1, \ldots, 10)$) — single threshold, clear verdicts:

  • demo-center-less: center threshold above the upper bound → $\mathrm{Less}$
  • demo-center-greater: center threshold below the lower bound → $\mathrm{Greater}$
  • demo-center-inconclusive: center threshold inside the bounds → $\mathrm{Inconclusive}$
  • demo-spread-less: spread threshold above the upper bound → $\mathrm{Less}$
  • demo-spread-greater: spread threshold below the lower bound → $\mathrm{Greater}$

Multi-threshold ($n = 10$) — multiple thresholds per call:

  • multi-center-spread: one center threshold and one spread threshold → $[\text{Less}, \text{Greater}]$
  • multi-two-centers: two center thresholds in one call → $[\text{Less}, \text{Greater}]$
  • multi-mixed: mixed center/spread thresholds → $[\text{Greater}, \text{Less}, \text{Less}]$

Input order preservation — verifies output order matches input order, not canonical order:

  • order-spread-center: spread threshold listed before center threshold → output[0] = spread projection, output[1] = center projection

Misrate variation ($n = 20$, $\mathbf{x} = (1, \ldots, 20)$, $\operatorname{Center}$ threshold at $10$):

Three tests span progressively stricter fixture misrates; they validate that smaller misrates produce wider bounds.
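The monotonicity itself can be illustrated with a normal-approximation interval (a generic sketch using the standard normal quantile; Pragmastat's actual bounds are nonparametric, but the direction of the effect is the same):

```python
from statistics import NormalDist

def normal_halfwidth(misrate, stderr=1.0):
    # Two-sided normal-approximation interval halfwidth: the quantile
    # z_{1 - misrate/2} grows as misrate shrinks, so bounds widen.
    z = NormalDist().inv_cdf(1 - misrate / 2)
    return z * stderr
```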

Natural sequences:

  • natural-10: $n = 10$, $\operatorname{Center}$ threshold at $5.5$
  • natural-15: $n = 15$, $\operatorname{Center}$ threshold at $8$
  • natural-20: $n = 20$, $\operatorname{Center}$ threshold at $10.5$

Error cases — inputs that violate assumptions:

  • error-empty-x: $\mathbf{x} = ()$ → $\text{validity}(x)$
  • error-single-x-center: $|\mathbf{x}| = 1$, $\operatorname{Center}$ threshold → $\text{domain}(x)$ (requires $n \geq 2$)
  • error-constant-spread: $\mathbf{x} = (5, 5, 5, 5, 5, 5)$, $\operatorname{Spread}$ threshold → $\text{sparity}(x)$