Implement exact tie-breaking for compute_challenge_score using hypergeometric distribution #4

NabJa · 2025-08-28T14:05:21Z

This PR updates the implementation of the challenge score metric to compute the exact expected TPR under random tie-breaking at the selection cutoff using the hypergeometric distribution.

Motivation

Previously, the metric relied on Monte Carlo permutations (num_permutations = 10^4) to approximate the expected confusion matrix when ties occurred in the model outputs. This was computationally expensive, and because the result depended on random sampling, it was only an approximation and was not deterministic.

The new implementation uses the hypergeometric expectation to compute the result exactly, with no sampling. The metric is therefore deterministic and reproducible, and it is faster: removing the num_permutations loop reduces runtime from O(num_permutations * n log n) to O(n log n) for n scored items.

Explanation

The updated metric computes the exact expected TPR under uniform random tie-breaking at the cutoff capacity.

1. Handle edge cases and convert inputs to NumPy arrays

Edge cases:
- Capacity = 0 -> TPR = 0 (nothing selected)
- Capacity >= n -> TPR = 1 (everything selected)
- No positives -> TPR = NaN (undefined)

2. Handle a split tie at the boundary

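Let y_sorted be the labels ordered by descending score, and let v_incl be the score of the last item that still fits within capacity (presumably the item at sorted index capacity - 1). Find the maximal contiguous block [start, end) in the sorted order where all scores equal v_incl (within tie_tol). This is the boundary tie group.
Define: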
- g = end - start (group size)
- k = sum(y_sorted[start:end] == 1) (positives in the group)
- m = capacity - start (how many items must be selected from this group to reach capacity)
- pos_before = sum(y_sorted[:start] == 1) (positives strictly above the tie group)

3. Exact expectation for the tie group

Under uniform random tie-breaking within the group, m items are drawn without replacement from g items, of which k are positive. The number of positives selected from the group therefore follows a hypergeometric distribution with expectation:
- E[TP_from_tie] = m * (k / g)
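As a sanity check on the closed form, the expectation can be compared against a brute-force average over every equally likely way of picking m items from a small tie group. This is only an illustrative sketch: the helper name `expected_positives_by_enumeration` and the toy labels are assumptions and are not part of this PR.

```python
from fractions import Fraction
from itertools import combinations


def expected_positives_by_enumeration(labels, m):
    """Average number of positives over all size-m subsets of the tie group.

    Hypothetical helper for illustration only; every subset is weighted
    equally, which is what uniform random tie-breaking means.
    """
    subsets = list(combinations(range(len(labels)), m))
    total = sum(sum(labels[i] for i in subset) for subset in subsets)
    return Fraction(total, len(subsets))


# Toy tie group: g = 5 items, k = 2 positives, m = 3 slots left before capacity.
labels = [1, 1, 0, 0, 0]
m = 3
g, k = len(labels), sum(labels)

closed_form = Fraction(m * k, g)  # m * (k / g)
assert expected_positives_by_enumeration(labels, m) == closed_form
```

Exact fractions are used so the comparison does not depend on floating-point rounding.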

4. Combine contributions

E[TP] = pos_before + E[TP_from_tie] = pos_before + m * (k / g)
TPR = E[TP] / P, where P is the total number of positives
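Steps 1 to 4 can be put together in a minimal sketch like the one below. It is not the code in this PR: the function name `expected_tpr_at_capacity`, its signature, the default `tie_tol`, and the choice to check for "no positives" before the capacity edge cases are all assumptions made for illustration.

```python
import numpy as np


def expected_tpr_at_capacity(y_true, scores, capacity, tie_tol=1e-12):
    """Expected TPR when the top `capacity` items are selected and ties at
    the cutoff are broken uniformly at random (hypothetical sketch)."""
    y = np.asarray(y_true).astype(int)
    s = np.asarray(scores, dtype=float)
    n = y.size
    P = int((y == 1).sum())

    # Step 1: edge cases (the ordering of these checks is an assumption).
    if P == 0:
        return float("nan")  # TPR is undefined without positives
    if capacity <= 0:
        return 0.0           # nothing selected
    if capacity >= n:
        return 1.0           # everything selected

    # Sort by descending score; a stable sort keeps the result deterministic.
    order = np.argsort(-s, kind="stable")
    y_sorted, s_sorted = y[order], s[order]

    # Step 2: boundary tie group [start, end) around the last included score.
    v_incl = s_sorted[capacity - 1]
    tied = np.flatnonzero(np.abs(s_sorted - v_incl) <= tie_tol)
    start, end = int(tied[0]), int(tied[-1]) + 1

    g = end - start                                # group size
    k = int((y_sorted[start:end] == 1).sum())      # positives in the group
    m = capacity - start                           # slots filled from the group
    pos_before = int((y_sorted[:start] == 1).sum())

    # Steps 3 and 4: hypergeometric mean plus the positives above the group.
    expected_tp = pos_before + m * (k / g)
    return expected_tp / P


# One positive ranks first; the two items tied at 0.5 compete for one slot.
tpr = expected_tpr_at_capacity([1, 0, 1, 0], [0.9, 0.5, 0.5, 0.1], capacity=2)
print(tpr)
```

In this example pos_before = 1, g = 2, k = 1 and m = 1, so E[TP] = 1 + 1 * (1 / 2) = 1.5 and, with P = 2, the expected TPR is 0.75. When there are no ties at the cutoff the group has size g = 1 and the formula reduces to the ordinary TPR.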

NabJa · 2025-08-28T14:49:54Z

Some tests and exploration comparing the old implementation with the new one:
https://colab.research.google.com/drive/1oi1PE6h5l-RbCR-bmbIxno31f_zcWj4L?usp=sharing

Exact computation of challenge metric

96da345

NabJa mentioned this pull request Sep 1, 2025

Challenge score metric is slow and unstable due to stochastic tie-breaking #5

Open
