Refactor `auto_hash.zig` #25722

SeanTUT · 2025-10-28T00:16:11Z

Applied Zig-style naming conventions to functions and values
- Remove redundant namespacing
  - `std.hash.autoHash` -> `std.hash.auto`
  - `std.hash.autoHashStrat` -> `std.hash.autoStrat`
  - `std.hash.HashStrategy` -> `std.hash.Strategy`
- Correct capitalization
  - `std.hash.HashStrategy.Shallow` -> `std.hash.Strategy.shallow`
  - `std.hash.HashStrategy.Deep` -> `std.hash.Strategy.deep`
  - `std.hash.HashStrategy.DeepRecursive` -> `std.hash.Strategy.deep_recursive`
- All of the old identifiers are still available as deprecated aliases.
Bug fix: Slices are now detected when nested within error unions, optionals, and arrays. As a consequence, `std.hash.auto` may produce a compile error in places where it previously did not. The previous behavior was a bug, but this change is still technically breaking.
Optimization: In general, `auto_hash.zig` avoids copying large values, preferring to hash them in place. It is also smarter about calling `hasher.update` directly on values that have a unique representation. For instance, slices and arrays of such values are `@ptrCast` directly into a slice of bytes so that all elements are hashed at once, rather than element by element.
Cleaned up the implementation and tests, applying more current style and language features where appropriate.
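
To illustrate the optimization described above, here is a minimal sketch of hashing a whole slice with one `update` call. It is not the code from this PR: `hashElementsAtOnce` is a hypothetical helper, and it uses `std.mem.sliceAsBytes` rather than a raw `@ptrCast` for brevity. It assumes `std.meta.hasUniqueRepresentation` and `std.hash.Wyhash` are available under those names in the Zig version in use.

```zig
const std = @import("std");

/// Hypothetical helper (not part of `std`): hashes every element of `items`
/// with a single `update` call by viewing the slice as raw bytes.
fn hashElementsAtOnce(hasher: anytype, comptime T: type, items: []const T) void {
    // Only sound when equal values always have identical bytes (no padding etc.).
    comptime std.debug.assert(std.meta.hasUniqueRepresentation(T));
    hasher.update(std.mem.sliceAsBytes(items));
}

test "bulk hashing matches element-by-element hashing" {
    const items = [_]u32{ 1, 2, 3, 4 };

    // One call covering all 16 bytes.
    var bulk = std.hash.Wyhash.init(0);
    hashElementsAtOnce(&bulk, u32, &items);

    // One call per element, feeding the same bytes in the same order.
    var each = std.hash.Wyhash.init(0);
    for (&items) |*item| each.update(std.mem.asBytes(item));

    try std.testing.expectEqual(each.final(), bulk.final());
}
```

The test relies on the streaming hasher producing the same result regardless of how the input is chunked, which is what makes the bulk path a pure optimization rather than a change in hash values.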

This file made use of an auto hash map of `SemanticVersion`s. Due to the recent fixes in `auto_hash.zig`, this is now a compile error, as `SemanticVersion` contains slices. This oversight previously went undetected, and the slices were simply hashed by value if they were present. With this fix, the slices are explicitly hashed deeply.

SeanTUT · 2025-10-28T01:11:15Z

As a consequence of this, std.hash.auto may result in a compile error in places where it previously did not. The previous behavior was a bug, but this change is still technically breaking.

For example, in `src/Builtin.zig`, the hash function performs an auto hash on an OS version range. This includes `SemanticVersion`s, which were silently hashed by value rather than alerting the user to the ambiguity. Additionally, `src/lib/glibc.zig` used an `AutoArrayHashMap` of `SemanticVersion`s, which had the same issue.

Apologies for not testing these changes locally. I have been unable to build Zig with LLVM, and thus have to rely on the CI to test building the compiler with LLVM enabled.

SeanTUT · 2025-10-28T01:44:21Z

Upon closer inspection, it looks like the CI is failing due to the bug which will be addressed in #25713. A `SemanticVersion` with a stale reference to a stack buffer for its `pre` and `build` fields is hashed as part of the key for the `builtin_modules` map.
EDIT: Now that #25713 is merged, this should just work. All bugs/CI failures past this point are my fault!

SeanTUT · 2025-10-29T15:26:44Z

I kept the logic for how values are hashed mostly the same, but there are a few details that we may want to address in follow-up issues:

When an optional is null, it doesn't update the hasher at all. This is usually fine, but with nested optionals, `@as(??T, null)` and `@as(??T, @as(?T, null))` hash identically.
- Furthermore, types like `void` or `[0]T` do not update the hasher, meaning that `@as(?void, {})` and `@as(?void, null)` result in the same hash.
- The same issue applies to error unions: `@as(anyerror!anyerror!void, anyerror.Foo)` hashes the same as `@as(anyerror!anyerror!void, @as(anyerror!void, error.Foo))`, although this is admittedly a more contrived example.
- For these use cases, it might be a good idea to add `key == null` or `isError(key)` to the hash, which would distinguish all of these values.
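
A rough sketch of the `key == null` idea, for illustration only. `hashOptionalTagged` is a hypothetical helper, not something in `std` or in this PR, and it tags only the outermost optional; a real change would apply the tag at every level inside `auto_hash.zig`. It uses the existing `std.hash.autoHash` name for the payload.

```zig
const std = @import("std");

/// Hypothetical helper (not part of `std`): feeds a one-byte presence tag
/// into the hasher before the payload, so `null` always contributes input.
/// Only the outermost optional is tagged; the payload goes through
/// `std.hash.autoHash` unchanged.
fn hashOptionalTagged(hasher: anytype, key: anytype) void {
    const is_null = key == null;
    hasher.update(&[_]u8{@intFromBool(is_null)});
    if (key) |payload| std.hash.autoHash(hasher, payload);
}

test "outer null and inner null feed different bytes" {
    // Outer optional is null: the hasher sees the tag byte 1.
    var outer_null = std.hash.Wyhash.init(0);
    hashOptionalTagged(&outer_null, @as(??u8, null));

    // Outer optional is set, inner is null: the hasher sees the tag byte 0.
    var inner_null = std.hash.Wyhash.init(0);
    hashOptionalTagged(&inner_null, @as(??u8, @as(?u8, null)));

    // Different inputs, so the hashes are expected to differ
    // (barring an astronomically unlikely collision).
    try std.testing.expect(outer_null.final() != inner_null.final());
}
```

An `isError(key)` tag for error unions would follow the same shape, with the error value hashed in place of the payload on the error branch.
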
Comptime struct fields are not included in hashes. This is consistent with the behavior of hashing structs by their bytes when possible, and it also doesn't make much sense to include data which never changes in the hash. (Before the refactor, this behavior was inconsistent: structs that had a unique representation omitted comptime fields, and structs that did not included them.) This does present one problem: anonymous tuples. In some places in the compiler, I noticed that when hashing multiple values, rather than making multiple calls to `std.hash.auto` / `std.hash.autoStrat`, a single call would be made on a tuple literal containing all of the data to be hashed. When these are all runtime values, that works fine (in fact, when the tuple ends up having a unique representation, it can even hash everything at once with a single call to `hasher.update`). However, comptime-known fields of anonymous struct literals, including tuple literals, are represented with comptime fields, so any comptime-known elements in an anonymous tuple literal are excluded from the hash.
- I'm not sure what the best way to handle this is. On one hand, we could include comptime fields in hashes, but this is almost always going to be redundant. If we also want anonymous struct literals to hash the same regardless of whether their fields are comptime, then we can pretty much never optimize the hashing of a struct into a single `hasher.update` call: since a struct with comptime fields will be hashed field-by-field, we would have to always hash structs that way to maintain parity, even if the struct we're hashing has no comptime fields. On the other hand, we could leave the behavior as it is now, change all of the call sites to not hash tuple literals, and discourage this usage, but that would be inconvenient and would leave a footgun.
- My current proposal is this: do not hash comptime fields, but introduce a `comptime T: type` parameter to `std.hash.auto` / `std.hash.autoStrat`, so that when you're hashing a type with comptime fields, you have to explicitly specify that type, with no room to accidentally omit data by making it a comptime field. Since this change would be breaking, I have not included it.
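
To make the tuple issue and the proposal concrete, here is a small sketch. `hashAs` is a hypothetical wrapper standing in for the proposed `comptime T: type` parameter; it is not the signature from this PR. The example assumes a Zig version where `@typeInfo` tags are lowercase (`.@"struct"`), and it calls the existing `std.hash.autoHash`.

```zig
const std = @import("std");

/// Hypothetical wrapper (not part of `std`) sketching the proposed explicit
/// type parameter: the literal is coerced to `T`, so the caller decides
/// which fields are runtime fields instead of the literal deciding.
fn hashAs(hasher: anytype, comptime T: type, key: T) void {
    std.hash.autoHash(hasher, key);
}

test "comptime-known tuple elements become comptime fields" {
    var runtime_value: u32 = 7;
    _ = &runtime_value; // keep it runtime-known

    const tuple = .{ runtime_value, @as(u32, 42) };
    const fields = @typeInfo(@TypeOf(tuple)).@"struct".fields;
    try std.testing.expect(!fields[0].is_comptime);
    try std.testing.expect(fields[1].is_comptime);
}

test "explicit tuple type keeps both elements in the hash" {
    const Pair = struct { u32, u32 }; // both fields are runtime fields
    var runtime_value: u32 = 7;
    _ = &runtime_value;

    var a = std.hash.Wyhash.init(0);
    hashAs(&a, Pair, .{ runtime_value, 42 });
    var b = std.hash.Wyhash.init(0);
    hashAs(&b, Pair, .{ runtime_value, 43 });

    // The second element differs and is a runtime field of `Pair`,
    // so the hashes are expected to differ.
    try std.testing.expect(a.final() != b.final());
}
```

With an explicit type at the call site, whether an element happens to be comptime-known no longer changes what is hashed, which is the property the proposal is after.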

SeanTUT and others added 4 commits October 27, 2025 19:53

Fix typo in glibc AutoHashMap fix

56cc5f6

Merge branch 'master' into auto-hash-refactor

281cd29

SeanTUT added 2 commits October 27, 2025 21:16

Add index parameter to glibc.zig VersionSet

ccccf12

Merge branch 'auto-hash-refactor' of https://github.com/seantut/zig i…

dcf4d8b

…nto auto-hash-refactor

Merge branch 'master' into auto-hash-refactor

630a6c2
