Implement class masking using the post-processing framework #999
base: main
Conversation
…en creating a terminal classification with the rolled up taxon
…gress, and algorithm binding
…f.logger and progress updates
…g and progress tracking
…ss-masking branch)
✅ Deploy Preview for antenna-preview canceled.
0b77504 to 88ffba8 (Compare)
Pull Request Overview
Implements class masking as a post-processing task that recalculates classifications by masking out classes not present in a provided taxa list and updates occurrences accordingly.
- Adds ClassMaskingTask to the post-processing framework and registers it.
- Filters and recalculates logits/scores per taxa list, creates new terminal classifications, and updates occurrences.
- Minor logging update in job runner.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| ami/ml/post_processing/class_masking.py | New class masking task and supporting functions to filter classifications by taxa list and recompute softmax. |
| ami/ml/post_processing/__init__.py | Registers the new class_masking task module. |
| ami/jobs/models.py | Improves log line to print only the task config for post-processing. |
top_index = scores.index(max(scores))
top_taxon = category_map_with_taxa[top_index][
    "taxon"
]  # @TODO: This doesn't work if the taxon has never been classified
print("Top taxon: ", category_map_with_taxa[top_index])  # @TODO: REMOVE
print("Top index: ", top_index)  # @TODO: REMOVE
Copilot AI, Oct 15, 2025
Argmax is computed across all categories, so an excluded class can still be selected as the top taxon. If all categories are excluded, the current approach will select an arbitrary class. Restrict the selection to indices whose taxa are in taxa_in_list and handle the 'all-excluded' case gracefully (skip creating a new classification or mark appropriately). For example:
- Build allowed_indices = [i for i, c in enumerate(category_map_with_taxa) if c['taxon'] in taxa_in_list]
- Mask logits for non-allowed indices with -np.inf, recompute softmax over the allowed set, and if allowed_indices is empty, skip this classification (see the sketch below).
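A minimal sketch of that masking approach, assuming the logits, category_map_with_taxa, and taxa_in_list names used in class_masking.py; the helper name recompute_masked_scores is illustrative, not part of the PR:

```python
import numpy as np


def recompute_masked_scores(logits, category_map_with_taxa, taxa_in_list):
    """Recompute softmax scores with classes outside the taxa list masked out."""
    allowed_indices = [
        i
        for i, entry in enumerate(category_map_with_taxa)
        if entry["taxon"] in taxa_in_list
    ]
    if not allowed_indices:
        # Every class is masked out: signal the caller to skip this classification.
        return None, None

    masked_logits = np.full(len(logits), -np.inf)
    masked_logits[allowed_indices] = np.asarray(logits, dtype=float)[allowed_indices]

    # Softmax over the masked logits; excluded classes end up with probability 0.
    exp_logits = np.exp(masked_logits - masked_logits.max())
    scores = exp_logits / exp_logits.sum()

    # Argmax can now only land on an allowed index.
    top_index = int(np.argmax(scores))
    return scores.tolist(), top_index
```

Because the excluded classes get probability zero, the returned top_index can feed straight into the existing top-taxon lookup against category_map_with_taxa.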
logger.info(f"Found {len(classifications)} terminal classifications with scores to update.")

if not classifications:
Copilot AI, Oct 15, 2025
len(classifications) forces the entire QuerySet to be evaluated just to log a number, and "if not classifications" is another potentially expensive check on the QuerySet. Call count() once and reuse it for both the log line and the zero check (or use exists() if you don't need the exact number), e.g., count = classifications.count(); if count == 0: ...
Suggested change:
- logger.info(f"Found {len(classifications)} terminal classifications with scores to update.")
- if not classifications:
+ count = classifications.count()
+ logger.info(f"Found {count} terminal classifications with scores to update.")
+ if count == 0:
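For callers that only need to know whether anything matched, the exists() variant mentioned in the comment issues a lighter query; a sketch reusing the classifications and logger names from this snippet:

```python
# exists() asks the database a yes/no question instead of fetching or counting rows.
if not classifications.exists():
    logger.info("No terminal classifications with scores to update; skipping.")
    return
```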
scores, logits = classification.scores, classification.logits
# Set scores and logits to zero if they are not in the filtered category indices

import numpy as np
Copilot AI, Oct 15, 2025
Importing inside the processing loop adds overhead each iteration. Move these imports to the module top and prefer using np.exp and np.sum directly for consistency, e.g., import numpy as np at the top and use np.exp / np.sum.
from numpy import exp
from numpy import sum as np_sum
Copilot AI, Oct 15, 2025
Importing inside the processing loop adds overhead each iteration. Move these imports to the module top and prefer using np.exp and np.sum directly for consistency, e.g., import numpy as np at the top and use np.exp / np.sum.
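A small sketch of the module-level import both comments ask for; the softmax helper name is illustrative, not part of the PR:

```python
# Top of ami/ml/post_processing/class_masking.py (sketch), instead of importing
# numpy inside the per-classification loop.
import numpy as np


def softmax(logits):
    """Numerically stable softmax over raw logits."""
    arr = np.asarray(logits, dtype=float)
    exp_logits = np.exp(arr - arr.max())  # subtract the max to avoid overflow
    return exp_logits / exp_logits.sum()
```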
| "taxon" | ||
| ] # @TODO: This doesn't work if the taxon has never been classified | ||
| print("Top taxon: ", category_map_with_taxa[top_index]) # @TODO: REMOVE | ||
| print("Top index: ", top_index) # @TODO: REMOVE |
Copilot AI, Oct 15, 2025
Avoid print statements in production code; use logger.debug(...) to keep logs consistent and configurable.
| print("Top index: ", top_index) # @TODO: REMOVE | |
| logger.debug(f"Top index: {top_index}") |
assert new_classification.detection.occurrence is not None
occurrences_to_update.add(new_classification.detection.occurrence)

logging.info(
Copilot AI, Oct 15, 2025
This uses the root logging module instead of the module logger or the provided task_logger, making log output inconsistent. Replace with logger.info(...) or task_logger.info(...).
Suggested change:
- logging.info(
+ task_logger.info(
# Get the classifications for the occurrence in the collection
classifications = Classification.objects.filter(
    detection__occurrence=occurrence,
    terminal=True,
    algorithm=algorithm,
    scores__isnull=False,
).distinct()
Copilot AI, Oct 15, 2025
You validate that logits is a list later and raise if not, but the query doesn't exclude classifications with null logits. Add logits__isnull=False to avoid unnecessary processing failures.
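A sketch of the same filter with the guard this comment suggests; model and field names are taken from the hunk above:

```python
classifications = Classification.objects.filter(
    detection__occurrence=occurrence,
    terminal=True,
    algorithm=algorithm,
    scores__isnull=False,
    logits__isnull=False,  # skip rows that would otherwise fail the later logits check
).distinct()
```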
    terminal=True,
    # algorithm__task_type="classification",
    algorithm=algorithm,
    scores__isnull=False,
Copilot AI, Oct 15, 2025
Mirror the logits presence guard here as well to avoid raising later when logits is missing: add logits__isnull=False to the filter.
Suggested change:
- scores__isnull=False,
+ scores__isnull=False,
+ logits__isnull=False,
    updated_at=timestamp,
)
if new_classification.taxon is None:
    raise (ValueError("Classification isn't registered yet. Aborting"))  # @TODO remove or fail gracefully
Copilot AI, Oct 15, 2025
The error message is unclear for the actual failure mode. Clarify to something actionable, e.g., raise ValueError('Unable to determine top taxon after class masking (no allowed classes). Aborting.').
Suggested change:
- raise (ValueError("Classification isn't registered yet. Aborting"))  # @TODO remove or fail gracefully
+ raise ValueError("Unable to determine top taxon after class masking (no allowed classes). Aborting.")
if classifications_to_update:
    logger.info(f"Bulk updating {len(classifications_to_update)} existing classifications")
    Classification.objects.bulk_update(classifications_to_update, ["terminal", "updated_at"])
    logger.info(f"Updated {len(classifications_to_update)} existing classifications")

if classifications_to_add:
    # Bulk create the new classifications
    logger.info(f"Bulk creating {len(classifications_to_add)} new classifications")
    Classification.objects.bulk_create(classifications_to_add)
    logger.info(f"Added {len(classifications_to_add)} new classifications")
Copilot AI, Oct 15, 2025
Consider wrapping the bulk_update and bulk_create in a single transaction to keep updates atomic and avoid partial state if an error occurs later (e.g., during occurrence updates). For example: with transaction.atomic(): ... bulk_update ... bulk_create ....
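A hedged sketch of that wrapping, assuming Django's transaction.atomic() and the variable names from the hunks above:

```python
from django.db import transaction

with transaction.atomic():
    if classifications_to_update:
        Classification.objects.bulk_update(
            classifications_to_update, ["terminal", "updated_at"]
        )
    if classifications_to_add:
        Classification.objects.bulk_create(classifications_to_add)

    # Keeping the determination updates inside the same block means a failure
    # here rolls back the classification changes as well.
    for occurrence in occurrences_to_update:
        occurrence.save(update_determination=True)
```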
# Update the occurrence determinations
logger.info(f"Updating the determinations for {len(occurrences_to_update)} occurrences")
for occurrence in occurrences_to_update:
    occurrence.save(update_determination=True)
@mohamedelabbas1996 here is how I updated all of the determinations previously
Summary
This PR implements Class Masking as part of the post-processing framework.
List of Changes
TBD
Related Issues
TBD
Detailed Description
TBD
How to Test the Changes
TBD
Screenshots
TBD
Deployment Notes
TBD
Checklist