Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash function improvements #145

Closed
wants to merge 1 commit into from
Closed

Hash function improvements #145

wants to merge 1 commit into from

Conversation

vouillon
Copy link
Member

@vouillon vouillon commented Aug 2, 2023

  • make sure that the hash of zero is not 0
  • finalize the hash and fold the result to a positive 31-bit integer

- make sure that the hash of zero is not 0
- finalize the hash and fold the result to a positive 31-bit integer
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Aug 3, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <[email protected]>
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Sep 15, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <[email protected]>
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Sep 30, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <[email protected]>
@xavierleroy
Copy link
Contributor

I could use more context:

  • What's wrong with zero hashing to 0 ?
  • Is it intended that Z.hash x should be equal to Hashtbl.hash x? I'm not sure this is the case even with the proposed patch, as an extra mixing with 0 seems to be missing.

Clarifications welcome!

@vouillon
Copy link
Member Author

vouillon commented Oct 25, 2023

My initial motivation was to harmonize hash results over 32-bits architectures (between zarith_stubs_js and zarith C stubs).

Beside that:

  • I don't really care about zero hashing to 0. I can special-case zero in zarith_stubs_js instead.
  • It probably makes sense that Z.hash returns a positive integer, since this is less error-prone, and that it returns the same result on all architectures. For this, anding with 0x3FFFFFFF should be sufficient.
  • It might be nice that Hashtbl.hash and Z.hash coincide. But I got this wrong...

@xavierleroy
Copy link
Contributor

My initial motivation was to harmonize hash results over 32-bits architectures

Agreed. Let's do that.

I don't really care about zero hashing to 0. I can special-case zero in zarith_stubs_js instead.

I still don't understand why a special case is needed for zero...

It might be nice that Hashtbl.hash and Z.hash coincide. But I got this wrong...

If we really want this identity, we can just define Z.hash as Hashtbl.hash + a type constraint, like we do in the OCaml stdlib for String.hash, Int32.hash, etc. While we're at it, we could also define Z.seeded_hash. I'm tempted!

vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Nov 23, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <[email protected]>
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Nov 27, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <[email protected]>
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Nov 28, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <[email protected]>
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Nov 30, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <[email protected]>
xavierleroy added a commit to xavierleroy/Zarith that referenced this pull request Dec 28, 2023
For consistency with other integer types in the standard library
(modules Int, Int32, Int64, Nativeint), let's use the standard
hash function (`Hashtbl.hash`) for `Z.hash` instead of our variant.

This is a bit slower but has several benefits (see ocaml#145):
- 32/64 bit compatibility
- better mixing of the bits of the result.

While we're at it, add a `Z.seeded_hash` function, defined as
`Hashtbl.seeded_hash`, so that the `Z` module can be used as the argument
to the `Hashtbl.MakeSeeded` functor.
@xavierleroy
Copy link
Contributor

After thinking some more about it, I believe Z.hash should be an alias for the standard hash function Hashtbl.hash, like we do for other standard library modules that work with integers (Int, Int32, Int64, Nativeint). See #150 for more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants