Skip to content

Conversation

@ausrasul
Copy link

The edge buffer size is too small when dealing with large images.
So I added the option to make it user defined.

Also when converting large images where the x,y coordinates are larger than the max value of int16, the coordinates become invalid/capped to the max value of int16.
It is changed to int64.

Allows large images to be marked with higher values.
That fixes the bugs when "trace" can't find the end node due to integer limit of the x,y address.
@jmon12
Copy link

jmon12 commented Jan 18, 2023

@ausrasul Thank you for your quick fix, I was about to do something similar.

About the int16 issue, I didn't dive into the code yet, but why are you using int types instead of uint types? I suppose the index values must be positive. In that case, a uint32 would cover the same positive value range of int64 for two times less memory.

Is the concern about memory consumption a valid one? In the PR #19 tackling the same issue @yxdragon seems to acknowledge that.

I'll be testing your fork and give a feedback.

@jmon12
Copy link

jmon12 commented Jan 19, 2023

I've been testing your implementation @ausrasul with my data and encountered an other overflow, of uint16 this time. I opened a PR on your fork.

jmon12 and others added 2 commits April 9, 2025 10:22
An overflow was happening for reasonably big images (a dimension greater
than 2^16). It is now `uint32`.

Note that `mark_node` hasn't been touched because it's not involved in
the graph generation.
@ausrasul
Copy link
Author

ausrasul commented Apr 9, 2025

Hi, I've added your PR and also moved buf_size to the end of the args list to prevent breaking old code that uses positional args.

Regarding why int64 instead of uint64, at line sknw.py line 46 it tries to add an int from neighbors with a uint64 which result in float64 that cannot be used as an index.
One could cast back into uint but it requires deeper understanding of the code and performance implications.

Your concern about memory is valid, but the problem not int or uint, but rather it being 64 instead of 16.
an image with coordinates reaches the range of uint32 would require 16exabytes (rough estimate).
So that is more appropriate alternative to uint16 which is too small.
uint64 as I see it now is unnecessary memory hog.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants