@NabJa NabJa commented Aug 28, 2025

This PR updates the implementation of the challenge score metric to compute the exact expected TPR under random tie-breaking at the selection cutoff using the hypergeometric distribution.

Motivation

Previously, the metric relied on Monte Carlo permutations (num_permutations = 10^4) to approximate the expected confusion matrix when ties occurred in the model outputs. This was both computationally expensive and stochastic.

The new implementation uses the hypergeometric expectation to compute the result exactly, with no sampling. The metric is therefore deterministic: the same inputs always produce the same score, with no dependence on a random seed. Removing the num_permutations loop also reduces runtime from O(num_permutations * n log n) to O(n log n), where the remaining cost is dominated by sorting the scores.

Explanation

The updated metric computes the exact expected TPR under uniform random tie-breaking at the cutoff capacity.

1. Handle edge cases and transform to numpy arrays

  • Edge cases:
    • Capacity = 0 -> TPR = 0 (nothing selected)
    • Capacity >= n -> TPR = 1 (everything selected)
    • No positives -> TPR = NaN (undefined)

2. Handle a split tie at the boundary

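  • With items sorted by score in descending order, let v_incl be the score of the last item inside the cutoff (sorted index capacity - 1). Find the maximal contiguous block [start, end) where all scores equal v_incl (within tie_tol). This is the boundary tie group.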
  • Define:
    • g = end - start (group size)
    • k = sum(y_sorted[start:end] == 1) (positives in the group)
    • m = capacity - start (how many items must be selected from this group to reach capacity)
    • pos_before = sum(y_sorted[:start] == 1) (positives strictly above the tie group)

3. Exact expectation for the tie group

  • Under uniform random tie-breaking within the group, the number of positives selected from the group follows a hypergeometric distribution with expectation:
    • E[TP_from_tie] = m * (k / g)
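
As a sanity check on the closed form above, the following minimal sketch enumerates the hypergeometric probability mass function directly and compares its mean with m * (k / g). The function name `hypergeom_mean_by_enumeration` and the example values are illustrative assumptions, not part of this PR.

```python
from math import comb


def hypergeom_mean_by_enumeration(g: int, k: int, m: int) -> float:
    """Mean number of positives when m items are drawn without replacement
    from a group of g items that contains k positives.

    Hypothetical helper for illustration only; it is not code from the PR.
    """
    total = comb(g, m)             # number of equally likely selections
    lo = max(0, m - (g - k))       # fewest positives a selection can contain
    hi = min(k, m)                 # most positives a selection can contain
    weighted = sum(
        x * comb(k, x) * comb(g - k, m - x) for x in range(lo, hi + 1)
    )
    return weighted / total


# Example values (assumed): tie group of 5 items, 2 positives, 3 slots left.
g, k, m = 5, 2, 3
enumerated = hypergeom_mean_by_enumeration(g, k, m)
closed_form = m * (k / g)
print(enumerated, closed_form)  # both are 1.2 up to floating-point rounding
```

The enumeration costs O(min(k, m)) per tie group, so the closed form is what makes the metric cheap; the enumeration is only useful as a test oracle.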

4. Combine contributions

  • E[TP] = pos_before + E[TP_from_tie] = pos_before + m * (k / g)
  • TPR = E[TP] / P, where P is the total number of positives in the data
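
The four steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the function name `expected_tpr_at_capacity`, the default value of `tie_tol`, the descending sort, the reading of v_incl as the score at sorted index capacity - 1, and the choice to check "no positives" before the capacity edge cases are all assumptions.

```python
import numpy as np


def expected_tpr_at_capacity(y_true, scores, capacity, tie_tol=1e-12):
    """Exact expected TPR when the top `capacity` items are selected and
    ties at the cutoff are broken uniformly at random.

    Hypothetical sketch; names, defaults and edge-case order are assumptions.
    """
    # Step 1: convert to numpy arrays and handle edge cases.
    y = np.asarray(y_true).astype(int)
    s = np.asarray(scores, dtype=float)
    n = y.size
    num_pos = int((y == 1).sum())
    if num_pos == 0:
        return float("nan")   # TPR undefined (assumed to take precedence)
    if capacity <= 0:
        return 0.0            # nothing selected
    if capacity >= n:
        return 1.0            # everything selected

    # Sort by score, highest first (assumed selection order).
    order = np.argsort(-s, kind="stable")
    y_sorted, s_sorted = y[order], s[order]

    # Step 2: locate the boundary tie group [start, end).
    v_incl = s_sorted[capacity - 1]                 # last included score
    tied = np.flatnonzero(np.abs(s_sorted - v_incl) <= tie_tol)
    start, end = int(tied[0]), int(tied[-1]) + 1    # contiguous after sorting

    g = end - start                                 # group size
    k = int((y_sorted[start:end] == 1).sum())       # positives in the group
    m = capacity - start                            # slots filled from group
    pos_before = int((y_sorted[:start] == 1).sum())

    # Steps 3 and 4: hypergeometric mean plus positives above the group.
    expected_tp = pos_before + m * (k / g)
    return expected_tp / num_pos


# Three items tie at 0.5 and only one of them fits under capacity = 2.
tpr = expected_tpr_at_capacity([1, 0, 1, 0, 1], [0.9, 0.5, 0.5, 0.5, 0.1], 2)
print(tpr)  # (1 + 1 * 1/3) / 3, approximately 0.444
```

When the cutoff does not split a tie group, g equals 1 and m equals 1, so the same formula reduces to the ordinary top-`capacity` TPR without a separate code path.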

NabJa commented Aug 28, 2025

Some tests and exploration comparing the old implementation with the new one:
https://colab.research.google.com/drive/1oi1PE6h5l-RbCR-bmbIxno31f_zcWj4L?usp=sharing
