Forensic researchers create a digital fingerprint of a file with an algorithm often called a “hash function” or a “message digest.” This mathematical process takes a file as input and returns a short, fixed-length binary value, typically 128, 160 or 256 bits long. The result, or “hash value,” is always the same for identical input, which makes it possible to compare files simply by checking whether their hash values match. Many lists of dangerous or suspicious files store only the hash values, an approach that saves space and time and avoids keeping complete copies of dangerous or sensitive information. In practice, it is rare for two different files to produce the same hash value, but such “collisions” can occur, and researchers have deliberately constructed them for MD5 and SHA-1. Some of the most commonly used hash functions are MD5, SHA-1 and SHA-2.
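
As a rough illustration of the comparison described above, the following Python sketch hashes a file with the standard library's hashlib module and checks the result against a set of known hash values. The function names file_digest and is_known_bad, the chunk size, and the choice of SHA-256 (a member of the SHA-2 family) as the default are assumptions made for this example, not a description of any particular forensic tool.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex-encoded hash value of the file at `path`.

    The file is read in chunks so large files need not fit in memory.
    `algorithm` is any name accepted by hashlib.new, e.g. "sha1" or "sha256".
    """
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def is_known_bad(path, known_hashes, algorithm="sha256"):
    """Return True if the file's hash value appears in `known_hashes`.

    `known_hashes` is a hypothetical set of lowercase hex digests, standing in
    for a published list of suspicious-file hashes.
    """
    return file_digest(path, algorithm) in known_hashes
```

A caller might build known_hashes once from a published list and then call is_known_bad for each file under examination. Only the short hash values are stored and compared, never the files themselves.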
