Saturday, July 4, 2015

Hashing The Good, The Bad and The Similar

The Good!!

In most cases we use hashing to verify integrity. However, there are situations in which we are not so much concerned with whether two files are identical as with how similar they are.

Let's dig deeper. For the purpose of this post I will make a copy of my "/var/log/user.log" file and name it "hashing_lab.txt". You may ask why I chose this file. Basically, I need a file of at least 4K to complete this scenario. This file is larger than that, so it's good enough.


Let's verify the file size:

root@securitynik:~# ls -al hashing_lab.txt



From the above we see the file "hashing_lab.txt" has a size of 9612 bytes, which is more than the 4K we need. This is good enough for us.
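If you would rather script this check, here is a minimal sketch in Python. The helper name "is_large_enough" is hypothetical, and reading "4K" as 4096 bytes is my assumption about the threshold.

```python
import os

# Assumption: "4K" is interpreted as 4096 bytes.
MIN_SIZE = 4096


def is_large_enough(path, min_size=MIN_SIZE):
    """Return True if the file at `path` is at least `min_size` bytes long."""
    return os.path.getsize(path) >= min_size
```

Calling is_large_enough("hashing_lab.txt") should return True for a file of 9612 bytes.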

Let's move on.


As mentioned previously, hashing is typically used for verifying integrity. So let's take an MD5 hash of our file "hashing_lab.txt":

root@securitynik:~# md5sum hashing_lab.txt

This returned the following

93c3baaa84c734f343136851b75db1a7  hashing_lab.txt
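For comparison, the same digest can be computed without md5sum. This is a rough sketch using Python's standard hashlib module; the function name "md5_of_file" and the chunk size are my own choices, not anything md5sum itself uses.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so large files do not need to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running md5_of_file("hashing_lab.txt") should give the same 32 character hex string that md5sum prints for that file.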

Now let's make a copy of this file:

root@securitynik:~# cp hashing_lab.txt hashing_lab.txt.copy

From this copied file, let's grab the hash

root@securitynik:~# md5sum hashing_lab.txt.copy

From the above we get the result

93c3baaa84c734f343136851b75db1a7  hashing_lab.txt.copy

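Let's put both results together for clarity.

93c3baaa84c734f343136851b75db1a7  hashing_lab.txt
93c3baaa84c734f343136851b75db1a7  hashing_lab.txt.copy
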
From this perspective we can see the good of hashing. Because the two MD5 hashes match, we were able to verify that the integrity of the file is intact. Basically, "hashing_lab.txt.copy" is an exact copy of "hashing_lab.txt".
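The whole copy-then-verify exercise can also be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names "md5_of_file" and "copy_and_verify" are hypothetical, and shutil.copyfile stands in for the cp command used above.

```python
import hashlib
import shutil


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy `src` to `dst`, then report whether both files have the same MD5."""
    shutil.copyfile(src, dst)  # stands in for: cp src dst
    return md5_of_file(src) == md5_of_file(dst)
```

copy_and_verify("hashing_lab.txt", "hashing_lab.txt.copy") should return True, mirroring the matching hashes we saw above.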

See you in the next post for the bad.

References:

http://jessekornblum.com/presentations/htcia06.pdf
http://ssdeep.sourceforge.net/usage.html
http://www.fastcolabs.com/3025246/what-is-polymorphic-code
