
Conversation

@josephdviviano (Collaborator)

  • I've read the .github/CONTRIBUTING.md file
  • My code follows the typing guidelines
  • I've added appropriate tests
  • I've run pre-commit hooks locally

Description

  • Added 3 new hypergrid tasks which should be more challenging. Note that the specifics are very much up for debate. I tried to identify environments that are easy to divide and conquer versus those that require compositional knowledge (and therefore some amount of knowledge sharing among agents in a multi-agent setting).
  • Added mode verification logic (to ensure that your particular configuration actually contains modes to find).
  • Added lots of tests around these new rewards.
  • Added visualizations of the reward landscape for these various rewards.

@josephdviviano josephdviviano self-assigned this Oct 3, 2025
@josephdviviano josephdviviano added the enhancement New feature or request label Oct 3, 2025
@younik (Collaborator) left a comment:

I am not able to review 1,000+ math-dense LOC for hypergrid.py :(
If you want a careful review, consider splitting this.


```python
        self._n_modes_via_ids_estimate = float(torch.unique(ids).numel())
        self._mode_stats_kind = "approx"
    except Exception:
        warnings.warn("+ Warning: Failed to compute mode_stats (skipping).")
```
Collaborator:

Better to use logger.exception here, to print the exception as well.

Also, it would be better to avoid catching Exception in general. Why can this fail?
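A minimal sketch of that pattern (the helper name and the narrow TypeError choice are illustrative; the real code computes torch.unique over mode ids):

```python
import logging

logger = logging.getLogger(__name__)

def compute_mode_stats(ids):
    """Hypothetical helper: log the traceback instead of a bare warning."""
    try:
        return float(len(set(ids)))
    except TypeError:  # catch only the narrow error we actually expect
        logger.exception("Failed to compute mode_stats (skipping).")
        return None
```

logger.exception records the full traceback at ERROR level, so the failure stays visible without re-raising.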

Collaborator:

This would catch the ValueError in the "exact" branch as well. Is this what we want? Should we catch at all?

Comment on lines +564 to +570
```python
# Cheap exact threshold (up to ~200k states)
if self.n_states <= 200_000:
    axes = [
        torch.arange(self.height, dtype=torch.long) for _ in range(self.ndim)
    ]
    grid = torch.cartesian_prod(*axes)
    rewards = self.reward_fn(grid)
```
Collaborator:

How did you come up with this number? Doing the cartesian product seems memory-intensive.

Collaborator (Author):

This number might need to be lowered; it was arbitrary.
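For reference, a back-of-envelope helper (hypothetical, not in the PR) for sizing the exact branch: torch.cartesian_prod over the full grid materializes an (n_states, ndim) int64 tensor.

```python
def cartesian_grid_bytes(height: int, ndim: int) -> int:
    """Bytes for the (n_states, ndim) int64 grid tensor, 8 bytes per entry."""
    n_states = height ** ndim
    return n_states * ndim * 8
```

At the 200_000-state threshold with ndim=4, the grid itself is 200_000 * 4 * 8 = 6.4 MB, which is modest; peak memory is dominated by whatever reward_fn allocates on top.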

Comment on lines +572 to +574
```python
except Exception:
    # Fall back to heuristic paths below
    pass
```
Collaborator:

Maybe add a logger. In general, I don't think it is a good idea to mask this much from the user: sometimes we compute exact mode existence, sometimes we use a heuristic.

Collaborator (Author):

Yes, agreed.

Comment on lines +849 to +874
```python
for col in range(m):
    # Find pivot
    piv = None
    for r in range(row, k):
        if A[r, col]:
            piv = r
            break
    if piv is None:
        continue
    # Swap
    if piv != row:
        A[[row, piv]] = A[[piv, row]]
        c[[row, piv]] = c[[piv, row]]
    # Eliminate below
    for r in range(row + 1, k):
        if A[r, col]:
            A[r, :] ^= A[row, :]
            c[r] ^= c[row]
    row += 1
    if row == k:
        break
# Check for inconsistency: 0 = 1 rows
for r in range(k):
    if not A[r, :].any() and c[r]:
        return False
return True
```
Collaborator:

I didn't check the details, to be honest, but it seems quite inefficient and not easily readable. Can we rely on scipy for this?

https://stackoverflow.com/questions/15638650/is-there-a-standard-solution-for-gauss-elimination-in-python

Collaborator (Author):

I'll look into it.
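One caveat worth noting: scipy's dense solvers work over the reals, so they cannot replace mod-2 elimination directly; the third-party `galois` package provides GF(2) linear algebra, or a compact numpy version can do the job. A hedged sketch (function names are mine, not the PR's) using the standard criterion that A x = c is solvable over GF(2) iff rank(A) equals rank of the augmented matrix:

```python
import numpy as np

def gf2_rank(M: np.ndarray) -> int:
    """Rank of a 0/1 matrix over GF(2) via XOR elimination."""
    M = M.astype(np.uint8) % 2
    row = 0
    for col in range(M.shape[1]):
        pivots = np.nonzero(M[row:, col])[0]
        if pivots.size == 0:
            continue
        r = row + pivots[0]
        M[[row, r]] = M[[r, row]]       # swap pivot row into place
        mask = M[:, col].astype(bool)
        mask[row] = False
        M[mask] ^= M[row]               # XOR pivot row into all other 1-rows
        row += 1
        if row == M.shape[0]:
            break
    return row

def gf2_has_solution(A: np.ndarray, c: np.ndarray) -> bool:
    """A x = c is solvable over GF(2) iff rank(A) == rank([A | c])."""
    A = np.asarray(A)
    aug = np.hstack([A, np.asarray(c).reshape(-1, 1)])
    return gf2_rank(A) == gf2_rank(aug)
```

The vectorized `M[mask] ^= M[row]` eliminates a whole column in one numpy operation, which avoids the nested Python loops the reviewer flagged.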

```python
"""
with torch.no_grad():
    device = torch.device("cpu")
    B = min(2048, max(128, 8 * self.ndim))
```
Collaborator:

What are these numbers? Maybe use named constants to improve clarity.
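A hedged sketch of the named-constants suggestion (the constant names and rationale comments are mine, mirroring the `min(2048, max(128, 8 * self.ndim))` expression above):

```python
# Hypothetical names for the values in min(2048, max(128, 8 * ndim)).
MIN_SAMPLE_BATCH = 128   # floor: keep batches large enough to amortize overhead
MAX_SAMPLE_BATCH = 2048  # ceiling: bound peak memory
SAMPLES_PER_DIM = 8      # scale the batch with dimensionality

def sample_batch_size(ndim: int) -> int:
    return min(MAX_SAMPLE_BATCH, max(MIN_SAMPLE_BATCH, SAMPLES_PER_DIM * ndim))
```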

Comment on lines +470 to 488
```python
try:
    all_states = self.all_states
    if all_states is not None:
        mask = self.mode_mask(all_states)
        ids = self.mode_ids(all_states)
        ids = ids[mask]
        ids = ids[ids >= 0]
        return int(torch.unique(ids).numel())
except Exception:
    pass
if self._mode_stats_kind == "exact" and self._n_modes_via_ids_exact is not None:
    return int(self._n_modes_via_ids_exact)
if (
    self._mode_stats_kind == "approx"
    and self._n_modes_via_ids_estimate is not None
):
    return int(self._n_modes_via_ids_estimate)

return 2**self.ndim
```
Collaborator:

Do we need to recompute this every time?

Collaborator (Author):

No, you're right; it should be stored.
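Caching could be as simple as functools.cached_property; a minimal sketch (class and attribute names are illustrative, not the PR's):

```python
from functools import cached_property

class ModeStats:
    """Hypothetical sketch: compute the mode count once and cache it."""

    def __init__(self, ids):
        self._ids = ids

    @cached_property
    def n_modes(self) -> int:
        # Runs on first access only; subsequent reads hit the cached value.
        return len(set(self._ids))
```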

Comment on lines +478 to +479
```python
except Exception:
    pass
```
Collaborator:

Similar to the other comment: silently swallowing the exception is not nice for debuggability.

@josephdviviano (Collaborator, Author)

Hi @younik - I hear you, this is a big PR. The "splits" would have to be along tasks, though, so the resulting PRs would still be large.

I appreciate your comments on the code. I think it would make sense to also look at the tasks (the stuff that's plotted in the notebook) to see if they make sense. I'm not convinced by all of the tasks.

I would be open to removing a task or two. I think the one that works best for its intended purpose is the coprime reward.

@hyeok9855 (Collaborator)

In the above commit, I fixed the comments of Deceptive Reward and also fixed a pyright error.

@hyeok9855 (Collaborator)

> I would be open to removing a task or two. I think the one that works best for its intended purpose is the coprime reward.

I do think the Template Minkowski and Bitwise/XOR rewards are not very interesting to benchmark, especially if you care about mode coverage. Multiplicative/Coprime seems challenging, but you may want to increase the reward for modes farther from the origin.

@saleml (Collaborator) left a comment:

This is high-quality research code with excellent mathematical foundations and thorough testing. The main concerns are:

  • Complexity barrier for new users
  • Performance documentation gaps
  • Some missing edge-case handling

A few questions and suggestions

  • The new reward functions are mathematically sophisticated (GF(2) algebra, prime factorization, etc.). While excellent for research, the barrier to entry is high
    Suggestion: Add a "Quick Start" section to the documentation showing simple use cases before diving into the mathematical details.
  • The _solve_gf2_has_solution method uses Gaussian elimination which could be slow for large constraint systems
    Suggestion: Add performance warnings in docstrings
  • Why GF(2)? The choice is elegant but not obvious. Could you add a paragraph in the documentation explaining why linear algebra over GF(2) is natural for compositional structure?
  • What happens if a user picks "impossible" preset with ndim=2, height=16? Should the factory functions validate compatibility?
  • Can you consider adding type hints for kwargs? Something like:

```python
class BitwiseXORRewardKwargs(TypedDict, total=False):
    R0: float
    tier_weights: list[float]
    dims_constrained: list[int]
    bits_per_tier: list[tuple[int, int]]
    parity_checks: list[dict] | None
```
  • What do you think of adding visualization helpers? Something like:

```python
def visualize_mode_structure(env: HyperGrid, sample_size: int = 10000):
    """Generate 2D/3D plots of mode distribution."""
    # Auto-generate plots similar to notebook but as API
```

I'd also like to suggest a structural consideration:
The original HyperGrid has become a pedagogical entrypoint of the GFlowNets library:

  • It's the first environment new users encounter
  • Its simplicity (grid + distance-based reward) makes it ideal for teaching core concepts
  • Tutorial code often uses it as the "Hello World" of GFlowNets
  • The cognitive load is intentionally minimal: "navigate a grid, reach high-reward corners"

Can we consider creating a separate file src/gfn/gym/compositional_hypergrid.py that:

  • Inherits from HyperGrid to reuse the core grid mechanics
  • Houses the new reward families (BitwiseXOR, MultiplicativeCoprime, TemplateMinkowski)
  • Includes the sophisticated mode validation and statistics machinery
  • Keeps the original simple and focused on accessibility

```python
        mode_stats_samples: Number of random samples used when
            `mode_stats="approx"`.
    """
    if height <= 4:
```
Collaborator:

This was removed, but the condition is still relevant. Should this warning be reinstated, or is it now handled by validate_modes?
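If the guard is reinstated, it might look like this (the function name and min_height cutoff are illustrative; the original condition was `height <= 4`):

```python
import warnings

def check_min_height(height: int, min_height: int = 5) -> None:
    """Hypothetical guard: warn when the grid may be too small for distinct modes."""
    if height < min_height:
        warnings.warn(
            f"height={height} is below {min_height}: modes may coincide; "
            "run the mode-verification logic to confirm this configuration."
        )
```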

```python
ax = (idx / Hm1 - 0.5).abs()
pdf = (1.0 / sqrt(2 * pi)) * torch.exp(-0.5 * (5 * ax) ** 2)
per_dim_discrete = float(((torch.cos(50 * ax) + 1.0) * pdf).max())
per_dim_base = per_dim_discrete if self.height > 4 else per_dim_peak
```
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magic number: height > 4 threshold. Should this be documented or parameterized?

```python
    return bool((rr >= thr - EPS_REWARD_CMP).any().item())

@staticmethod
def _solve_gf2_has_solution(A: torch.Tensor, c: torch.Tensor) -> bool:
```
Collaborator:

Could you explain the GF(2) algorithm in the docstring? Also, this could be slow for large constraint systems.
Suggestion: add a complexity note in the docstring (O(k·m²) for a k×m matrix).
