## Wednesday, February 8, 2012

### TEncryption Hash Algorithm Part 2

In 2009, I made a post about turning a 4-bit hash algorithm into an 8-bit hash algorithm.

My idea was:
"My Password" --> 1100 --> {"11", "00"} --> {0100, 1011} --> 01001011

However, after reading it over, I realized that there was a gaping flaw. The probability of a collision wouldn't change!

Let's say "Password a" and "Password b" both hash to a 4-bit value of 0101. They would both hash to the same 8-bit value in my implementation. This is a major issue, because in theory, an 8-bit hash algorithm should have significantly less collisions (2^4 / 2^8 = 1/16 less chance of a collision).

So how to avoid this issue? Simple. Don't use the 4-bit hash as the basis for the 8-bit hash. Rather, use permutations of the password as the basis for the 8-bit hash. This can include reversing the password, truncating the password, or rearranging the password. This will, in theory, decrease the number of collisions.

Previous implementation with collision:

"Password a" --> 1110 --> {"11", "10"} --> {1100, 1011} --> 11001011
"Password b" --> 1110 --> {"11", "10"} --> {1100, 1011} --> 11001011

New implementation with the same possible collision:

The point is just because one permutation of a password results in a collision with another password, with a good hash function, there is no indication that another permutation will also result in a collision.

Still, though, concatenation means that precomputing hashes of unsalted, known, commonly used passwords will help you figure out half the resulting hash is problematic. In the above example, the first four bits are 1110 in both passwords because of the collision. So to overcome that issue, you can salt the passwords with the username, or file name, or whatever, which should really be done anyway.

New implementation with salting

New implementation with salting