Incorrect Hash Outputted on Certain Files #23

Open
tworeimage opened this issue Mar 23, 2020 · 4 comments

@tworeimage

Thanks for the great work on this library! When I run this implementation on a malicious file, I get a slightly different result from the one VirusTotal shows. I've also compiled the official SSDEEP implementation, and it produces the same result as VT.

This Implementation: 96:o8kUse54dWD+Kmu2+GOWemu2+GOWemu2+GOWemuDJvNSt+pV2NLiOw4GdlopXh1:o45AgJUEpV2NLW4GdlakpZ8Oda
VirusTotal: 96:o8kUse54dWD+Kmu2+GOWemu2+GOWemu2+GOWemuDJvNSt+pV2NLiOw4GdlopXh1r:o45AgJUEpV2NLW4GdlakpZ8Oda

The subtle difference is that the first part of the hash is missing an 'r' at the end of it. I have been debugging this for about two hours, but I can't see any obvious bug occurring, so I won't be able to submit a PR at this time.
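
For anyone who wants to check the two signatures mechanically rather than by eye, here is a minimal sketch in Python. It only does string comparison on the `blocksize:part1:part2` layout and does not call this library or the reference ssdeep; the helper names `split_signature` and `describe_difference` are made up for illustration.

```python
def split_signature(sig):
    """Split an ssdeep-style signature into (block_size, part1, part2)."""
    block_size, part1, part2 = sig.split(":", 2)
    return int(block_size), part1, part2


def describe_difference(ours, expected):
    """Return human-readable notes on where two signatures differ."""
    notes = []
    names = ("block size", "first part", "second part")
    for name, a, b in zip(names, split_signature(ours), split_signature(expected)):
        if a == b:
            continue
        if isinstance(a, str) and b.startswith(a):
            # Our value is a strict prefix of the expected one.
            notes.append(f"{name}: ours is missing trailing {b[len(a):]!r}")
        else:
            notes.append(f"{name}: {a!r} != {b!r}")
    return notes


# The two signatures quoted in this issue.
ours = "96:o8kUse54dWD+Kmu2+GOWemu2+GOWemu2+GOWemuDJvNSt+pV2NLiOw4GdlopXh1:o45AgJUEpV2NLW4GdlakpZ8Oda"
expected = "96:o8kUse54dWD+Kmu2+GOWemu2+GOWemu2+GOWemuDJvNSt+pV2NLiOw4GdlopXh1r:o45AgJUEpV2NLW4GdlakpZ8Oda"

print(describe_difference(ours, expected))
# → ["first part: ours is missing trailing 'r'"]
```

The block size and the second part match exactly, so only the final character of the first part is in question.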

I suspect that it might be the way that the blockSize variable is calculated, but that's just a hunch. I tried a bunch of stuff to see if I could fix it but none of it worked.

Attached Zip with password of "infected"
5403252175699968.zip This is a malicious file so please do not execute it. (Malicious VBA script)

@glaslos
Owner

glaslos commented May 23, 2020

Thanks for bringing this up. A mismatch in the signature is definitely a bug. IIRC I have seen issues like this before. It could be related to how remaining data that doesn't fit into a block is handled 🤔
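
To illustrate that kind of failure mode, here is a toy sketch in Python. It is not the ssdeep algorithm and not this library's code: the function `toy_digest`, its `flush_tail` parameter, and the per-byte trigger rule are all assumptions made up for illustration (real ssdeep triggers on a rolling hash over a 7-byte window). It only shows how skipping the leftover bytes after the last boundary would drop exactly one trailing character.

```python
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def toy_digest(data: bytes, block_size: int, flush_tail: bool = True) -> str:
    """Toy piecewise digest: emit one character per piece of input.

    A piece ends whenever the (hypothetical) trigger fires. If the input
    does not end on a trigger, the leftover bytes form a final piece that
    still needs a character when flush_tail is True.
    """
    out = []
    piece_hash = 0    # hash of the bytes seen since the last boundary
    pending = False   # True if there are bytes not yet represented in out
    for byte in data:
        piece_hash = (piece_hash * 31 + byte) & 0xFFFFFFFF
        pending = True
        # Toy trigger on the raw byte value; stands in for the rolling hash.
        if byte % block_size == block_size - 1:
            out.append(B64[piece_hash % 64])
            piece_hash = 0
            pending = False
    if flush_tail and pending:
        # Emit a character for the remaining data after the last boundary.
        out.append(B64[piece_hash % 64])
    return "".join(out)


data = b"abcdefghij"  # ends two bytes past the last trigger point
with_tail = toy_digest(data, 4)
without_tail = toy_digest(data, 4, flush_tail=False)
print(with_tail, without_tail)
```

In this toy, `without_tail` is `with_tail` minus its last character, which has the same shape as the mismatch reported above. Whether the real bug has this cause is a guess.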

@glaslos glaslos added the bug label May 23, 2020
@glaslos
Owner

glaslos commented May 23, 2020

@tworeimage if you want to get started on solving this issue, create a test for this case (which should obviously fail right now). Then look into how we decide when to finish the hash, and compare this to the original SSDEEP implementation.

@glaslos
Owner

glaslos commented Apr 28, 2022

I think in the reference implementation this is called the last hash. For some reason I was off by one 🤔 Give this branch a try: #29. It seems very fragile, and I'm not sure why this works.

@glaslos
Owner

glaslos commented Oct 24, 2022

@tworeimage did you have a chance to give this a try?
