Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add murmur3 32-bit #101

Merged
merged 1 commit into from
Dec 21, 2018
Merged

Add murmur3 32-bit #101

merged 1 commit into from
Dec 21, 2018

Conversation

vmx
Copy link
Member

@vmx vmx commented Dec 18, 2018

Also rename the existing murmur to murmur-128.

Also rename the existing `murmur` to `murmur-128`.
@vmx vmx requested review from pgte, Stebalien and daviddias December 18, 2018 13:18
@ghost ghost assigned vmx Dec 18, 2018
@ghost ghost added the in progress label Dec 18, 2018
@vmx vmx mentioned this pull request Dec 18, 2018
@@ -62,7 +62,8 @@ keccak-224,keccak has variable output length. The number specifies the core leng
keccak-256, , 0x1B
keccak-384, , 0x1C
keccak-512, , 0x1D
murmur3, , 0x22
murmur3-128, , 0x22
murmur3-32, , 0x23
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's be careful about allocating another single-byte code. Really, we shouldn't have allocated single byte codes to most of these but, IIRC, we did that before we combined the various codec tables.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's even in the multihash RFC draft: https://tools.ietf.org/html/draft-multiformats-multihash-00

@Stebalien
Copy link
Member

Addressing @Kubuxu's comment (#45 (comment)) on the previous PR: Murmur3-64 is actually just a truncated murmur3-128. Given how we use these hashes in sharded directories, this doesn't make a difference (we might as well use murmur3-128).

So, the only issues here are:

  1. Wasting an additional single byte code. However, given the fact that we've had murmur3-32 defined for a while, I can't really object to this.
  2. The fact that murmur3-128 technically has different implementations on 64bit architectures and 32bit architectures: Murmur3-128 output differs between 32-bit and 64-bit implementations by design multihash#85. I believe we have to define these as murmur3-32x32 and murmur3-128x64.

@pgte make sense? Which murmur hash are you using?

@pgte
Copy link

pgte commented Dec 21, 2018

@Stebalien the only time I remember needing murmur was when implementing dir sharding here:
https://github.com/ipfs/js-ipfs-unixfs-engine/blob/907a6381b0a3054523b55945ba1d9debb7b60b6d/src/importer/dir-sharded.js#L17
(also please read the comments on line 21).

@Stebalien
Copy link
Member

(also please read the comments on line 21).

As far as I can tell, we use as many bytes as we need. Using 8 bytes (64 bits) would mean an 8-deep hamt (which we shouldn't normally have).

However, I've actually just changed the go implementation to support using all 16 bytes of the 128 bit hash (to reduce confusion): ipfs/go-unixfs#53.

@Stebalien Stebalien merged commit 58f16be into master Dec 21, 2018
@ghost ghost removed the in progress label Dec 21, 2018
@Stebalien
Copy link
Member

Merged as this:

  1. Matches the multihash table (before it was removed).
  2. Matches the RFC.

If we want to free up some of these IDs, we can consider that separately (but probably shouldn't at this point).

@daviddias daviddias deleted the murmur3 branch January 3, 2019 08:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants