With the rapid growth in the use of electronic files, manually verifying the content of every file in a file system is not only time-consuming but also prone to human error, and is therefore infeasible.
Verifying file integrity has played an important role since the early days of computer forensics. Because the data stored on a suspect's disk is vulnerable to alteration and must be retained as evidence, forensic specialists are often required to acquire an exact mirror image of the disk drive for comprehensive examination. A strong cryptographic hash function gives the examiner a convenient way to verify the integrity of that data.
There are several well-known hashing algorithms used in cryptography. These include the
The mathematical theories of hash functions guarantee the following properties:
MD5 is a message-digest algorithm developed by Professor Ronald L. Rivest of MIT. It is used to verify data integrity: it takes as input a message of arbitrary length and produces as output a 128-bit "fingerprint" or "message digest" of the input. When MD5 was designed, it was conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given prespecified target message digest. Practical collision attacks on MD5 have since been published, so it should not be relied on where an adversary could deliberately craft two files with the same digest.
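As a minimal sketch of the input/output behaviour described above, the following Python snippet computes an MD5 digest with the standard-library hashlib module. The helper name md5_hex is hypothetical and chosen only for illustration; some restricted (for example FIPS-mode) Python builds may refuse to provide MD5.

```python
import hashlib


def md5_hex(message: bytes) -> str:
    """Return the MD5 message digest of `message` as 32 hex characters."""
    return hashlib.md5(message).hexdigest()


# Inputs of any length map to a fixed-size 128-bit digest.
short_digest = md5_hex(b"abc")
long_digest = md5_hex(b"abc" * 100_000)

print(short_digest)            # 900150983cd24fb0d6963f7d28e17f72 (RFC 1321 test value)
print(len(short_digest) * 4)   # 128 (each hex character encodes 4 bits)
print(len(long_digest) * 4)    # 128
```

Changing even one byte of the input yields a different digest, which is what makes the value usable as a fingerprint for integrity checks.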
The MD5 algorithm is designed to be quite fast on 32-bit machines. In addition, the MD5 algorithm does not require any large substitution tables; the algorithm can be coded quite compactly. The MD5 algorithm is an extension of the MD4 message-digest algorithm. MD5 is slightly slower than MD4, but is more "conservative" in design. MD5 was designed because it was felt that MD4 was perhaps being adopted for use more quickly than justified by the existing critical review; because MD4 was designed to be exceptionally fast, it is "at the edge" in terms of risking successful cryptanalytic attack.
For a more detailed description of MD5, please refer to RFC 1321: http://www.ietf.org/rfc/rfc1321.txt?number=1321.
The Secure Hash Algorithm (SHA) was developed by NIST and is specified in the Secure Hash Standard (SHS, FIPS 180). It takes a message of less than 2^64 bits in length and produces a 160-bit message digest, designed so that it should be computationally expensive to find a text which matches a given hash. Although SHA is slower than MD5, its larger digest size (160 bits versus 128 bits) makes it stronger against brute-force attacks.
SHA-1 is called secure because it is designed to make it computationally infeasible to find a message which corresponds to a given message digest, or to find two different messages which produce the same message digest.
For a more detailed description of SHA-1, please refer to RFC 3174: http://www.ietf.org/rfc/rfc3174.txt?number=3174
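For large inputs such as a disk image, the digest is usually computed incrementally rather than by loading everything into memory. The sketch below assumes Python's standard hashlib module; the function name sha1_of_stream and the 64 KiB chunk size are illustrative choices, and an in-memory io.BytesIO object stands in for a real image file.

```python
import hashlib
import io


def sha1_of_stream(stream, chunk_size=65536):
    """Return the SHA-1 hex digest of a binary stream, read in chunks."""
    hasher = hashlib.sha1()
    # iter(callable, sentinel) keeps calling read() until it returns b"".
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


# Stand-in for a file opened with open(path, "rb").
digest = sha1_of_stream(io.BytesIO(b"abc"))
print(digest)           # a9993e364706816aba3e25717850c26c9cd0d89d
print(len(digest) * 4)  # 160 (bits in a SHA-1 digest)
```

Hashing the acquired image and the original device with the same function and comparing the two digests is one way an examiner could confirm that the copy is exact.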
File Integrity Checking Application
The following scenario and figure illustrate the most common way to use the uniqueness of hash values to verify whether files have been modified.
1. An officer builds a set of hash values for Folder A and saves this set of values in a database (Hash Set A).
2. After a few months, he would like to verify whether any unauthorized personnel have tampered with his files.
3. He uses the same hash function to build a new hash set (Hash Set B) for Folder A and uses a file integrity tool to compare the two hash sets (Hash Set A and Hash Set B).
4. Any mismatches found during the comparison are reported.
Figure 1: Hash Values Comparison Example
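The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular file integrity product works: the function names build_hash_set and compare_hash_sets are hypothetical, SHA-1 is chosen arbitrarily as the hash function, and a temporary directory stands in for Folder A.

```python
import hashlib
import os
import tempfile


def build_hash_set(folder):
    """Map each file path (relative to `folder`) to its SHA-1 hex digest."""
    hash_set = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            hasher = hashlib.sha1()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    hasher.update(chunk)
            hash_set[os.path.relpath(path, folder)] = hasher.hexdigest()
    return hash_set


def compare_hash_sets(hash_set_a, hash_set_b):
    """Report files whose digest changed, disappeared, or newly appeared."""
    return {
        "modified": sorted(p for p in hash_set_a
                           if p in hash_set_b and hash_set_a[p] != hash_set_b[p]),
        "missing": sorted(p for p in hash_set_a if p not in hash_set_b),
        "added": sorted(p for p in hash_set_b if p not in hash_set_a),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder_a:
        report = os.path.join(folder_a, "report.txt")
        with open(report, "wb") as f:
            f.write(b"original content")
        hash_set_a = build_hash_set(folder_a)    # step 1: baseline (Hash Set A)
        with open(report, "wb") as f:
            f.write(b"tampered content")         # step 2: file is altered
        hash_set_b = build_hash_set(folder_a)    # step 3: rebuild (Hash Set B)
        print(compare_hash_sets(hash_set_a, hash_set_b))  # step 4: report mismatches
```

In practice Hash Set A would be stored somewhere the person modifying the files cannot reach, since an attacker who can rewrite the baseline can hide the tampering.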
There are numerous file integrity products and technologies available in the security industry. They use various cryptographic hashing algorithms to produce hash values that are, for practical purposes, unique to each file's content, and use them to detect changes to files and file systems. The cyber crime forensic tool DESK (Digital Evidence Search Kit) is one example; it provides features that let examiners verify file integrity using the technique addressed in this web page. DESK is a software system developed by the Center for Information Security and Cryptography, The University of Hong Kong, in collaboration with the Hong Kong Police. It is designed to assist law enforcement agencies in examining a seized computer system. For more information on DESK, please email email@example.com.