Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use standard hash function for Z.hash and add Z.seeded_hash #150

Merged
merged 2 commits into from
Jan 3, 2024

Conversation

xavierleroy
Copy link
Contributor

For consistency with other integer types in the standard library (modules Int, Int32, Int64, Nativeint), let's use the standard hash function (Hashtbl.hash) for Z.hash instead of our variant. This is a bit slower but has several benefits:

  • 32/64 bit compatibility;
  • better mixing of the bits of the result.

While we're at it, add a Z.seeded_hash function, defined as Hashtbl.seeded_hash, so that the Z module can be used as the argument to the Hashtbl.MakeSeeded functor.

Also: improve documentation of Z.hash and add some more @since tags.

Closes #145 as this addresses the needs expressed in #145, while being more consistent with the OCaml stdlib modules.

For consistency with other integer types in the standard library
(modules Int, Int32, Int64, Nativeint), let's use the standard
hash function (`Hashtbl.hash`) for `Z.hash` instead of our variant.

This is a bit slower but has several benefits (see ocaml#145):
- 32/64 bit compatibility
- better mixing of the bits of the result.

While we're at it, add a `Z.seeded_hash` function, defined as
`Hashtbl.seeded_hash`, so that the `Z` module can be used as the argument
to the `Hashtbl.MakeSeeded` functor.
@antoinemine
Copy link
Collaborator

Yes, I agree it is better to use the same function for Z.hash and Hashtbl.hash. Thanks!
The original Z.hash called the custom hash function probably in a (misguided) attempt to do that, but ignored the 32/64-bit issue, the mixing function and the seed.

If that's OK with @vouillon as well, we could merge.

@xavierleroy
Copy link
Contributor Author

Based on the current documentation comment and my vague recollections, I probably thought that the current implementation of Z.hash would be faster than generic hashing but just as good for use with the Hashtbl.Make functor. (Since OCaml 4.00, the high bits of the hash value are ignored, so I thought our hash function didn't have to mask them off itself.) Plus, there was no hash function in the stdlib integer modules, so no doctrine to guide us :-)

I'll merge this PR in a week or so.

@xavierleroy xavierleroy merged commit 1898327 into ocaml:master Jan 3, 2024
8 checks passed
@xavierleroy xavierleroy deleted the revised-hash branch January 3, 2024 09:09
@vouillon
Copy link
Member

vouillon commented Mar 4, 2024

@xavierleroy Note that this does not give use 32/64 bit compatibility since small integers are represented as regular OCaml integers when possible. So, an integer that can be represented using 64 bits but not using 32 bits while have a different hash value depending on the architecture.

@xavierleroy
Copy link
Contributor Author

32 bit/64 bit compatibility is is hard to achieve, see #148 for an example, and less and less of a priority as 32-bit platforms are abandoned one after the other. (Yes, I know JS-of-OCaml is here to stay, but it will soon be the only 32-bit platform that matters.)

This said, in this particular case, it might be possible to tweak the hash function attached to type Z.t so that big integers that are heap-allocated on a 32-bit platform and that would be immediate integers on a 64-bit platform get the same hash value in both cases. I might look into this when I have more free time, or you could contribute some code if you feel strongly about this compatibility issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants