Checksums
The checksum of a file is a short value that acts like a fingerprint of the file. The probability that two different files produce identical checksums is very small. This property makes checksums useful both for comparing files and for verifying their integrity. The following situation illustrates how checksums work.
Alice and Bob each have a huge file, and the two files are similar. How can they find out whether the files differ without sending them to each other? Each of them simply calculates the checksum of their own file, and then they compare the two short values.
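The comparison above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names file_checksum and files_differ are hypothetical, and SHA-256 from the standard hashlib module is an assumed choice of algorithm, since the text does not name one.

```python
import hashlib

def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so a huge file never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def files_differ(checksum_a, checksum_b):
    """Different checksums prove that the files differ."""
    return checksum_a != checksum_b
```

Alice and Bob would each run file_checksum on their own copy and exchange only the resulting short strings. Equal checksums do not strictly prove that the files are identical, but with a strong algorithm a mismatch between identical-looking checksums is extremely unlikely.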
Checksum properties
A checksum consists of a small amount of binary data, typically a few hundred bits at most. All checksum values share the following properties:
Checksum length
The length of the checksum value is determined by the algorithm used and does not vary with the size of the message. Common checksum lengths include 128 bits (for example MD5) and 160 bits (for example SHA-1); newer algorithms such as SHA-256 produce longer values.
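A short sketch can make the fixed length concrete. It assumes Python's standard hashlib module and uses MD5, SHA-1 and SHA-256 purely as examples; the helper name digest_bits is hypothetical. On some restricted builds MD5 may be unavailable, in which case it can be dropped from the list.

```python
import hashlib

def digest_bits(algorithm, message):
    """Return the checksum length in bits for `message` (hypothetical helper)."""
    return len(hashlib.new(algorithm, message).digest()) * 8

for name in ("md5", "sha1", "sha256"):
    short = digest_bits(name, b"a")               # 1-byte message
    long = digest_bits(name, b"a" * 1_000_000)    # 1 MB message
    print(name, short, long)

# Prints:
# md5 128 128
# sha1 160 160
# sha256 256 256
```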
Non-discoverability
Two non-identical messages almost always translate into completely different checksum values, even if the messages differ by only a single bit. With a strong algorithm and today's technology, it is not feasible to discover a pair of messages that translate into the same checksum value.
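The effect of a single-bit change can be observed directly. The following sketch assumes SHA-256 from Python's hashlib; flip_bit and differing_bits are hypothetical helpers written for this illustration, and the sample message is arbitrary.

```python
import hashlib

def flip_bit(data, bit_index):
    """Return a copy of `data` with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)

def differing_bits(a, b):
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = b"The quick brown fox jumps over the lazy dog"
altered = flip_bit(message, 0)

d1 = hashlib.sha256(message).digest()
d2 = hashlib.sha256(altered).digest()

print(differing_bits(message, altered))  # 1
print(differing_bits(d1, d2))            # typically around half of the 256 bits
```

The two messages differ in one bit, yet the two checksums bear no visible resemblance to each other.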
Repeatability
Each time a particular message is checksummed using the same algorithm, exactly the same checksum value is produced.
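Repeatability is easy to check in code. In this sketch, which again assumes SHA-256 from hashlib and uses hypothetical function names, the same message is checksummed once in a single call and once piece by piece; both routes give the same value.

```python
import hashlib

def checksum(message):
    """Checksum the whole message in one call."""
    return hashlib.sha256(message).hexdigest()

def checksum_in_chunks(message, chunk_size=4):
    """Checksum the same message by feeding it to the algorithm in pieces."""
    digest = hashlib.sha256()
    for i in range(0, len(message), chunk_size):
        digest.update(message[i:i + chunk_size])
    return digest.hexdigest()

message = b"the same message every time"
print(checksum(message) == checksum(message))            # True
print(checksum(message) == checksum_in_chunks(message))  # True
```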
Irreversibility
All checksumming algorithms are one-way. Given a checksum value, it is infeasible to discover the original message. In practice, none of the properties of the original message can be determined from the checksum value alone.