Shift

$$\operatorname{Shift}(\mathbf{x}, \mathbf{y}) = \underset{1 \leq i \leq n,\; 1 \leq j \leq m}{\operatorname{Median}} (x_i - y_j)$$

Robust measure of location difference between two samples.

  • Also known as — Hodges-Lehmann estimator for two samples
  • Asymptotic — median of the difference between random measurements from $X$ and $Y$
  • Complexity — $O((m+n) \log L)$
  • Domain — any real numbers
  • Unit — same as measurements

Properties

  • Self-difference — $\operatorname{Shift}(\mathbf{x}, \mathbf{x}) = 0$
  • Shift equivariance — $\operatorname{Shift}(\mathbf{x} + k_x, \mathbf{y} + k_y) = \operatorname{Shift}(\mathbf{x}, \mathbf{y}) + k_x - k_y$
  • Scale equivariance — $\operatorname{Shift}(k \cdot \mathbf{x}, k \cdot \mathbf{y}) = k \cdot \operatorname{Shift}(\mathbf{x}, \mathbf{y})$
  • Antisymmetry — $\operatorname{Shift}(\mathbf{x}, \mathbf{y}) = -\operatorname{Shift}(\mathbf{y}, \mathbf{x})$

Example

  • Shift([0, 2, 4, 6, 8], [10, 12, 14, 16, 18]) = -10
  • Shift(y, x) = -Shift(x, y)

$\operatorname{Shift}$ measures how much one group differs from another. When comparing response times between version A and version B, $\operatorname{Shift}$ tells you by how many milliseconds A is faster or slower than B. A negative result means the first group tends to be lower; a positive one means it tends to be higher. Unlike a comparison of means, $\operatorname{Shift}$ handles outliers gracefully and works well with skewed data. The result comes in the same units as the measurements, making it easy to interpret.

Algorithm

The $\operatorname{Shift}$ estimator is the median of all pairwise differences between elements of two samples. Given samples $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_m)$, this estimator is defined as:

$$\operatorname{Shift}(\mathbf{x}, \mathbf{y}) = \operatorname{median}_{1 \leq i \leq n,\, 1 \leq j \leq m} (x_i - y_j)$$

This definition represents a special case of a more general problem: computing arbitrary quantiles of all pairwise differences. For samples of size $n$ and $m$, the total number of pairwise differences is $n \times m$. A naive approach would materialize all differences, sort them, and extract the desired quantile. With $n = m = 10000$, this approach creates 100 million values, requiring quadratic memory and $O(nm \log(nm))$ time.
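For small samples, the naive approach makes a useful reference point. The sketch below (Python for brevity; an illustration, not part of the reference implementation) materializes every difference and takes the median:

```python
import statistics

# Naive reference: materialize all n*m pairwise differences and take their
# median. Correct but quadratic in memory, so viable only for small samples.
def shift_naive(x, y):
    diffs = [xi - yj for xi in x for yj in y]
    return statistics.median(diffs)

# The demo pair from the Example section above:
print(shift_naive([0, 2, 4, 6, 8], [10, 12, 14, 16, 18]))  # -10
```

The algorithm described next produces the same answers without ever building the quadratic list.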

The presented algorithm avoids materializing pairwise differences by exploiting the sorted structure of the samples. After sorting both samples to obtain $x_1 \leq x_2 \leq \ldots \leq x_n$ and $y_1 \leq y_2 \leq \ldots \leq y_m$, the key insight is that it is possible to count how many pairwise differences fall below any threshold without computing them explicitly. This counting operation enables a binary search over the continuous space of possible difference values, iteratively narrowing the search range until it converges to the exact quantile.

The algorithm operates through a value-space search rather than index-space selection. It maintains a search interval $[\text{searchMin}, \text{searchMax}]$, initialized to the range of all possible differences: $[x_1 - y_m, x_n - y_1]$. At each iteration, it selects a candidate value within this interval and counts how many pairwise differences are less than or equal to this threshold. For the median (quantile $p = 0.5$), if fewer than half the differences lie below the threshold, the median must be larger; if more than half lie below, the median must be smaller. Based on this comparison, the search space is reduced by eliminating portions that cannot contain the target quantile.

The counting operation achieves linear complexity via a two-pointer sweep. For a given threshold $t$, the number of pairs $(i, j)$ satisfying $x_i - y_j \leq t$ is counted. This is equivalent to counting pairs where $y_j \geq x_i - t$. For each row $i$ in the implicit matrix of differences, a column pointer advances through the sorted $\mathbf{y}$ array while $x_i - y_j > t$, stopping at the first position where $x_i - y_j \leq t$. All subsequent positions in that row satisfy the condition, contributing $(m - j)$ pairs to the count for row $i$. Because both samples are sorted, the column pointer advances monotonically and never backtracks across rows, making each counting pass $O(n + m)$ regardless of the total number of differences.
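The counting sweep fits in a few lines. This Python sketch (illustrative helper names; both inputs assumed sorted ascending) shows the single monotone pointer:

```python
# Count pairs (i, j) with x[i] - y[j] <= t using one two-pointer sweep.
# Both samples must be sorted ascending; j never moves backwards, so the
# whole pass is O(n + m) even though it summarizes n*m differences.
def count_le(x, y, t):
    n, m = len(x), len(y)
    count = 0
    j = 0
    for i in range(n):
        # Advance past y values that keep x[i] - y[j] above the threshold.
        while j < m and x[i] - y[j] > t:
            j += 1
        count += m - j  # y[j], ..., y[m-1] all satisfy x[i] - y[j'] <= t
    return count

x = [0, 2, 4, 6, 8]
y = [10, 12, 14, 16, 18]
print(count_le(x, y, -10))  # 15 of the 25 differences are <= -10
```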

During each counting pass, the algorithm tracks boundary values: the largest difference at or below the threshold and the smallest difference above it. When the count exactly matches the target rank (or the two middle ranks for even-length samples), these boundary values provide the exact answer without additional searches. For Type-7 quantile computation (Hyndman & Fan 1996), which interpolates between order statistics, the algorithm collects the necessary boundary values in a single pass and performs linear interpolation: $(1 - w) \cdot \text{lower} + w \cdot \text{upper}$.
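The Type-7 rank and weight come directly from $h = 1 + (N - 1)p$ over the $N = nm$ ordered differences, as in the `Estimate` method below. A small Python check (hypothetical helper name):

```python
import math

# Type-7 (Hyndman & Fan, 1996): h = 1 + (N - 1) * p selects the 1-based
# order statistics floor(h) and ceil(h), blended with weight w = h - floor(h).
def type7_ranks(N, p):
    h = 1.0 + (N - 1) * p
    lo = math.floor(h)
    hi = math.ceil(h)
    return lo, hi, h - lo

print(type7_ranks(25, 0.5))  # (13, 13, 0.0): odd count, exact middle element
print(type7_ranks(36, 0.5))  # (18, 19, 0.5): even count, mean of the two middles
```

For the median, the weight is nonzero exactly when $nm$ is even, which is why the algorithm may need two order statistics per quantile.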

Real datasets often contain discrete or repeated values that can cause search stagnation. The algorithm detects when the search interval stops shrinking between iterations, indicating that multiple pairwise differences share the same value. When the closest difference below the threshold equals the closest above, all remaining candidates are identical and the algorithm terminates immediately. Otherwise, it uses the boundary values from the counting pass to snap the search interval to actual difference values, ensuring reliable convergence even with highly discrete data.

The binary search employs numerically stable midpoint calculations and terminates when the search interval collapses to a single value or when boundary tracking confirms convergence. Iteration limits are included as a safety mechanism, though convergence typically occurs much earlier due to the exponential narrowing of the search space.
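Taken together, the search loop, counting sweep, and discrete-value snapping fit in a short sketch. This simplified Python version (an illustration that mirrors, but does not replace, the reference implementation; inputs assumed sorted, rank $k$ assumed valid) selects the $k$-th smallest pairwise difference:

```python
# Value-space binary search for the k-th smallest pairwise difference
# (1-based, 1 <= k <= n*m), assuming both inputs sorted ascending.
# Snapping the interval to actual difference values keeps convergence
# reliable even when many differences are identical.
def select_kth_diff(x, y, k):
    n, m = len(x), len(y)
    lo, hi = x[0] - y[-1], x[-1] - y[0]
    while lo < hi:
        t = lo + (hi - lo) * 0.5
        count, below, above = 0, float("-inf"), float("inf")
        j = 0
        for i in range(n):
            while j < m and x[i] - y[j] > t:
                j += 1
            count += m - j
            if j < m:
                below = max(below, x[i] - y[j])      # largest diff <= t
            if j > 0:
                above = min(above, x[i] - y[j - 1])  # smallest diff > t
        new_lo, new_hi = (lo, below) if count >= k else (above, hi)
        if (new_lo, new_hi) == (lo, hi):  # stuck between discrete values
            return below if count >= k else above
        lo, hi = new_lo, new_hi
    return lo

x = [0, 2, 4, 6, 8]
y = [10, 12, 14, 16, 18]
print(select_kth_diff(x, y, 13))  # -10: median of the 25 differences
```

The stagnation guard plays the same role as the boundary tracking in the main implementation: when neither endpoint can move, the answer must be one of the two tracked boundary values.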

The algorithm naturally generalizes to multiple quantiles by computing each one independently. For $k$ quantiles with samples of size $n$ and $m$, the total complexity is $O(k(n + m) \log L)$, where $L$ represents the convergence precision. This is dramatically more efficient than the naive $O(nm \log(nm))$ approach, especially for large $n$ and $m$ with small $k$. The algorithm requires only $O(1)$ additional space beyond the input arrays, making it practical for large-scale statistical analysis where memory constraints prohibit materializing quadratic data structures.

namespace Pragmastat.Algorithms;

using System;
using System.Collections.Generic;
using System.Linq;

public static class FastShift
{
  /// <summary>
  /// Computes quantiles of all pairwise differences { x_i - y_j }.
  /// Time: O((m + n) * log(precision)) per quantile. Space: O(1).
  /// </summary>
  /// <param name="p">Probabilities in [0, 1].</param>
  /// <param name="assumeSorted">If false, collections will be sorted.</param>
  public static double[] Estimate(IReadOnlyList<double> x, IReadOnlyList<double> y, double[] p, bool assumeSorted = false)
  {
    if (x == null || y == null || p == null)
      throw new ArgumentNullException();
    if (x.Count == 0 || y.Count == 0)
      throw new ArgumentException("x and y must be non-empty.");

    foreach (double pk in p)
      if (double.IsNaN(pk) || pk < 0.0 || pk > 1.0)
        throw new ArgumentOutOfRangeException(nameof(p), "Probabilities must be within [0, 1].");

    double[] xs, ys;
    if (assumeSorted)
    {
      xs = x as double[] ?? x.ToArray();
      ys = y as double[] ?? y.ToArray();
    }
    else
    {
      xs = x.OrderBy(v => v).ToArray();
      ys = y.OrderBy(v => v).ToArray();
    }

    int m = xs.Length;
    int n = ys.Length;
    long total = (long)m * n;

    // Type-7 quantile: h = 1 + (n-1)*p, then interpolate between floor(h) and ceil(h)
    var requiredRanks = new SortedSet<long>();
    var interpolationParams = new (long lowerRank, long upperRank, double weight)[p.Length];

    for (int i = 0; i < p.Length; i++)
    {
      double h = 1.0 + (total - 1) * p[i];
      long lowerRank = (long)Math.Floor(h);
      long upperRank = (long)Math.Ceiling(h);
      double weight = h - lowerRank;
      if (lowerRank < 1) lowerRank = 1;
      if (upperRank > total) upperRank = total;
      interpolationParams[i] = (lowerRank, upperRank, weight);
      requiredRanks.Add(lowerRank);
      requiredRanks.Add(upperRank);
    }

    var rankValues = new Dictionary<long, double>();
    foreach (long rank in requiredRanks)
      rankValues[rank] = SelectKthPairwiseDiff(xs, ys, rank);

    var result = new double[p.Length];
    for (int i = 0; i < p.Length; i++)
    {
      var (lowerRank, upperRank, weight) = interpolationParams[i];
      double lower = rankValues[lowerRank];
      double upper = rankValues[upperRank];
      result[i] = weight == 0.0 ? lower : (1.0 - weight) * lower + weight * upper;
    }

    return result;
  }

  // Binary search in [min_diff, max_diff] that snaps to actual discrete values.
  // Avoids materializing all m*n differences.
  private static double SelectKthPairwiseDiff(double[] x, double[] y, long k)
  {
    int m = x.Length;
    int n = y.Length;
    long total = (long)m * n;

    if (k < 1 || k > total)
      throw new ArgumentOutOfRangeException(nameof(k));

    double searchMin = x[0] - y[n - 1];
    double searchMax = x[m - 1] - y[0];

    if (double.IsNaN(searchMin) || double.IsNaN(searchMax))
      throw new InvalidOperationException("NaN in input values.");

    const int maxIterations = 128; // Sufficient for double precision convergence
    double prevMin = double.NegativeInfinity;
    double prevMax = double.PositiveInfinity;

    for (int iter = 0; iter < maxIterations && searchMin != searchMax; iter++)
    {
      double mid = Midpoint(searchMin, searchMax);
      CountAndNeighbors(x, y, mid, out long countLessOrEqual, out double closestBelow, out double closestAbove);

      if (closestBelow == closestAbove)
        return closestBelow;

      // No progress means we're stuck between two discrete values
      if (searchMin == prevMin && searchMax == prevMax)
        return countLessOrEqual >= k ? closestBelow : closestAbove;

      prevMin = searchMin;
      prevMax = searchMax;

      if (countLessOrEqual >= k)
        searchMax = closestBelow;
      else
        searchMin = closestAbove;
    }

    if (searchMin != searchMax)
      throw new InvalidOperationException("Convergence failure (pathological input).");

    return searchMin;
  }

  // Two-pointer algorithm: counts pairs where x[i] - y[j] <= threshold, and tracks
  // the closest actual differences on either side of threshold.
  private static void CountAndNeighbors(
    double[] x, double[] y, double threshold,
    out long countLessOrEqual, out double closestBelow, out double closestAbove)
  {
    int m = x.Length, n = y.Length;
    long count = 0;
    double maxBelow = double.NegativeInfinity;
    double minAbove = double.PositiveInfinity;

    int j = 0;
    for (int i = 0; i < m; i++)
    {
      while (j < n && x[i] - y[j] > threshold)
        j++;

      count += (n - j);

      if (j < n)
      {
        double diff = x[i] - y[j];
        if (diff > maxBelow) maxBelow = diff;
      }

      if (j > 0)
      {
        double diff = x[i] - y[j - 1];
        if (diff < minAbove) minAbove = diff;
      }
    }

    // Fallback to actual min/max if no boundaries found (shouldn't happen in normal operation)
    if (double.IsNegativeInfinity(maxBelow))
      maxBelow = x[0] - y[n - 1];
    if (double.IsPositiveInfinity(minAbove))
      minAbove = x[m - 1] - y[0];

    countLessOrEqual = count;
    closestBelow = maxBelow;
    closestAbove = minAbove;
  }

  private static double Midpoint(double a, double b) => a + (b - a) * 0.5;
}

Tests

$$\operatorname{Shift}(\mathbf{x}, \mathbf{y}) = \operatorname{median}_{1 \leq i \leq n,\, 1 \leq j \leq m} (x_i - y_j)$$

The $\operatorname{Shift}$ test suite contains 60 correctness test cases stored in the repository (42 original + 18 unsorted), plus 1 performance test that should be implemented manually (see Test Framework).

Demo examples ($n = m = 5$) — from the manual introduction, validating properties:

  • demo-1: $\mathbf{x} = (0, 2, 4, 6, 8)$, $\mathbf{y} = (10, 12, 14, 16, 18)$, expected output: $-10$ (base case)
  • demo-2: $\mathbf{x} = (0, 2, 4, 6, 8)$, $\mathbf{y} = (0, 2, 4, 6, 8)$, expected output: $0$ (identity property)
  • demo-3: $\mathbf{x} = (7, 9, 11, 13, 15)$, $\mathbf{y} = (13, 15, 17, 19, 21)$ (= demo-1 + [7, 3]), expected output: $-6$ (location equivariance)
  • demo-4: $\mathbf{x} = (0, 4, 8, 12, 16)$, $\mathbf{y} = (20, 24, 28, 32, 36)$ (= 2 × demo-1), expected output: $-20$ (scale equivariance)
  • demo-5: $\mathbf{x} = (10, 12, 14, 16, 18)$, $\mathbf{y} = (0, 2, 4, 6, 8)$ (= demo-1 with samples swapped), expected output: $10$ (anti-symmetry)
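The five expected values can be cross-checked against a brute-force median of all pairwise differences; a Python sanity check (illustrative, not part of the stored test suite):

```python
import statistics

# Brute-force Shift for cross-checking the small demo cases.
def shift_naive(x, y):
    return statistics.median([xi - yj for xi in x for yj in y])

assert shift_naive([0, 2, 4, 6, 8], [10, 12, 14, 16, 18]) == -10    # demo-1
assert shift_naive([0, 2, 4, 6, 8], [0, 2, 4, 6, 8]) == 0           # demo-2
assert shift_naive([7, 9, 11, 13, 15], [13, 15, 17, 19, 21]) == -6  # demo-3
assert shift_naive([0, 4, 8, 12, 16], [20, 24, 28, 32, 36]) == -20  # demo-4
assert shift_naive([10, 12, 14, 16, 18], [0, 2, 4, 6, 8]) == 10     # demo-5
print("all demo cases pass")
```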

Natural sequences ($[n, m] \in \{1, 2, 3\} \times \{1, 2, 3\}$) — 9 combinations:

  • natural-1-1: $\mathbf{x} = (1)$, $\mathbf{y} = (1)$, expected output: $0$
  • natural-1-2: $\mathbf{x} = (1)$, $\mathbf{y} = (1, 2)$, expected output: $-0.5$
  • natural-1-3: $\mathbf{x} = (1)$, $\mathbf{y} = (1, 2, 3)$, expected output: $-1$
  • natural-2-1: $\mathbf{x} = (1, 2)$, $\mathbf{y} = (1)$, expected output: $0.5$
  • natural-2-2: $\mathbf{x} = (1, 2)$, $\mathbf{y} = (1, 2)$, expected output: $0$
  • natural-2-3: $\mathbf{x} = (1, 2)$, $\mathbf{y} = (1, 2, 3)$, expected output: $-0.5$
  • natural-3-1: $\mathbf{x} = (1, 2, 3)$, $\mathbf{y} = (1)$, expected output: $1$
  • natural-3-2: $\mathbf{x} = (1, 2, 3)$, $\mathbf{y} = (1, 2)$, expected output: $0.5$
  • natural-3-3: $\mathbf{x} = (1, 2, 3)$, $\mathbf{y} = (1, 2, 3)$, expected output: $0$
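The even pair counts in this grid exercise the interpolated median. A quick Python check of two of the expected values (illustrative):

```python
import statistics

# natural-1-2: x = (1), y = (1, 2) gives differences {0, -1}; with an even
# number of differences the Type-7 median averages the two middle values.
assert statistics.median([1 - 1, 1 - 2]) == -0.5

# natural-2-3: six differences, the middle two are -1 and 0.
diffs = sorted(x - y for x in (1, 2) for y in (1, 2, 3))
assert diffs == [-2, -1, -1, 0, 0, 1]
assert statistics.median(diffs) == -0.5
print("even-count medians interpolate as expected")
```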

Negative values ($[n, m] = [2, 2]$) — sign handling validation:

  • negative-2-2: $\mathbf{x} = (-2, -1)$, $\mathbf{y} = (-2, -1)$, expected output: $0$

Mixed-sign values ($[n, m] = [2, 2]$) — validates anti-symmetry across zero:

  • mixed-2-2: $\mathbf{x} = (-1, 1)$, $\mathbf{y} = (-1, 1)$, expected output: $0$

Zero values ($[n, m] \in \{1, 2\} \times \{1, 2\}$) — 4 combinations:

  • zeros-1-1, zeros-1-2, zeros-2-1, zeros-2-2: all produce output $0$

Additive distribution ($[n, m] \in \{5, 10, 30\} \times \{5, 10, 30\}$) — 9 combinations with $\operatorname{Additive}(10, 1)$:

  • additive-5-5, additive-5-10, additive-5-30
  • additive-10-5, additive-10-10, additive-10-30
  • additive-30-5, additive-30-10, additive-30-30
  • Random generation: $\mathbf{x}$ uses seed 0, $\mathbf{y}$ uses seed 1

Uniform distribution ($[n, m] \in \{5, 100\} \times \{5, 100\}$) — 4 combinations with $\operatorname{Uniform}(0, 1)$:

  • uniform-5-5, uniform-5-100, uniform-100-5, uniform-100-100
  • Random generation: $\mathbf{x}$ uses seed 2, $\mathbf{y}$ uses seed 3

The natural sequences validate anti-symmetry ($\operatorname{Shift}(\mathbf{x}, \mathbf{y}) = -\operatorname{Shift}(\mathbf{y}, \mathbf{x})$) and the identity property ($\operatorname{Shift}(\mathbf{x}, \mathbf{x}) = 0$). The asymmetric size combinations test the two-sample algorithm with unbalanced inputs.

Algorithm stress tests — edge cases for the fast binary search algorithm:

  • duplicates-5-5: $\mathbf{x} = (3, 3, 3, 3, 3)$, $\mathbf{y} = (3, 3, 3, 3, 3)$ (all identical, expected output: $0$)
  • duplicates-10-10: $\mathbf{x} = (1, 1, 2, 2, 3, 3, 4, 4, 5, 5)$, $\mathbf{y} = (1, 1, 2, 2, 3, 3, 4, 4, 5, 5)$ (many duplicates)
  • parity-odd-7-7: $\mathbf{x} = (1, 2, 3, 4, 5, 6, 7)$, $\mathbf{y} = (1, 2, 3, 4, 5, 6, 7)$ (odd sizes, 49 differences, expected output: $0$)
  • parity-even-6-6: $\mathbf{x} = (1, 2, 3, 4, 5, 6)$, $\mathbf{y} = (1, 2, 3, 4, 5, 6)$ (even sizes, 36 differences, expected output: $0$)
  • parity-asymmetric-7-6: $\mathbf{x} = (1, 2, 3, 4, 5, 6, 7)$, $\mathbf{y} = (1, 2, 3, 4, 5, 6)$ (mixed parity, 42 differences)
  • parity-large-49-50: $\mathbf{x} = (1, 2, \ldots, 49)$, $\mathbf{y} = (1, 2, \ldots, 50)$ (large asymmetric, 2450 differences)

Extreme asymmetry — tests with very unbalanced sample sizes:

  • asymmetry-1-100: $\mathbf{x} = (50)$, $\mathbf{y} = (1, 2, \ldots, 100)$ (single vs many, 100 differences)
  • asymmetry-2-50: $\mathbf{x} = (10, 20)$, $\mathbf{y} = (1, 2, \ldots, 50)$ (tiny vs medium, 100 differences)
  • asymmetry-constant-varied: $\mathbf{x} = (5, 5, 5, 5, 5)$, $\mathbf{y} = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)$ (constant vs varied)

Unsorted tests — verify independent sorting of each sample (18 tests):

  • unsorted-x-natural-{n}-{m} for $(n, m) \in \{(3,3), (4,4), (5,5)\}$: X unsorted (reversed), Y sorted (3 tests)
  • unsorted-y-natural-{n}-{m} for $(n, m) \in \{(3,3), (4,4), (5,5)\}$: X sorted, Y unsorted (reversed) (3 tests)
  • unsorted-both-natural-{n}-{m} for $(n, m) \in \{(3,3), (4,4), (5,5)\}$: both unsorted (reversed) (3 tests)
  • unsorted-reverse-3-3: $\mathbf{x} = (3, 2, 1)$, $\mathbf{y} = (3, 2, 1)$ (both reversed)
  • unsorted-x-shuffle-3-3: $\mathbf{x} = (2, 1, 3)$, $\mathbf{y} = (1, 2, 3)$ (X shuffled, Y sorted)
  • unsorted-y-shuffle-3-3: $\mathbf{x} = (1, 2, 3)$, $\mathbf{y} = (3, 1, 2)$ (X sorted, Y shuffled)
  • unsorted-both-shuffle-4-4: $\mathbf{x} = (3, 1, 4, 2)$, $\mathbf{y} = (4, 2, 1, 3)$ (both shuffled)
  • unsorted-duplicates-mixed-5-5: $\mathbf{x} = (3, 3, 3, 3, 3)$, $\mathbf{y} = (3, 3, 3, 3, 3)$ (all identical)
  • unsorted-x-unsorted-duplicates: $\mathbf{x} = (2, 1, 3, 2, 1)$, $\mathbf{y} = (1, 1, 2, 2, 3)$ (X has unsorted duplicates)
  • unsorted-y-unsorted-duplicates: $\mathbf{x} = (1, 1, 2, 2, 3)$, $\mathbf{y} = (3, 2, 1, 3, 2)$ (Y has unsorted duplicates)
  • unsorted-asymmetric-unsorted-2-5: $\mathbf{x} = (2, 1)$, $\mathbf{y} = (5, 2, 4, 1, 3)$ (asymmetric sizes, both unsorted)
  • unsorted-negative-unsorted-3-3: $\mathbf{x} = (-1, -3, -2)$, $\mathbf{y} = (-2, -3, -1)$ (negative unsorted)

These tests are critical for two-sample estimators because they verify that $\mathbf{x}$ and $\mathbf{y}$ are sorted independently. The variety includes cases where only one sample is unsorted, ensuring implementations don't incorrectly assume pre-sorted input or sort the samples together.

Performance test — validates the fast $O((m+n) \log L)$ binary search algorithm:

  • Input: $\mathbf{x} = (1, 2, 3, \ldots, 100000)$, $\mathbf{y} = (1, 2, 3, \ldots, 100000)$
  • Expected output: $0$
  • Time constraint: must complete in under 5 seconds
  • Purpose: ensures that the implementation uses the efficient algorithm rather than materializing all $mn = 10$ billion pairwise differences

This test case is not stored in the repository because it generates a large JSON file (approximately 1.5 MB). Each language implementation should manually implement this test with the hardcoded expected result.

References

Hodges, J. L., Lehmann, E. L. (1963). "Estimates of Location Based on Rank Tests." The Annals of Mathematical Statistics.
Hájek, J., Šidák, Z., Sen, P. K. (1999). Theory of Rank Tests.
Hyndman, R. J., Fan, Y. (1996). "Sample Quantiles in Statistical Packages." The American Statistician.