SHA-256/384/512 Released 22
The Right Brute writes "It appears that the successors to the SHA-1 cryptographic digest algorithm have been released. FIPS 180-2 can be found here which I believe is the final version of the SHA-256/384/512 algorithm (it does not appear to have changed since the last draft). I have an implementation that I did as a CWEB literate programming example that might serve as a good companion to the specification."
May not be patent-free (Score:3, Interesting)
Re:May not be patent-free (Score:3, Interesting)
"Implementations of the secure hash algorithms in this standard may be covered by U.S. or foreign patents." (emphasis mine)
All evidence indicates that the SHA algorithms themselves are not patented. Specific implementations of them may be; there's no restriction on that.
And as far as that goes, I have no problem at all licensing algorithms for things like this. In many cases-- not all, but many-- your choices are (1) license-free or (2) secure, and the two are mutually exclusive.
Re:May not be patent-free (Score:2, Interesting)
Re:May not be patent-free (Score:2)
How about RSA? Of course, the patent has expired now, but as of a few years ago, it was only available with a license.
Re:May not be patent-free (Score:1)
Re:May not be patent-free (Score:3, Insightful)
Re:May not be patent-free (Score:2)
No, no, no, no, no. You're putting words in my mouth. I never suggested that licensing "makes the algorithm secure." I'm saying that having a secure algorithm is more important to me than having one that's unrestricted by patent or other rights. As I said, in many cases-- not all, but many-- the best algorithms are patented. The original poster said, "too bad," or something like that, implying that patents qua patents are a bad thing. I'm saying that the fact that an algorithm is patented, and that I have to pay a fee to use it, isn't the only deciding factor for me. I'm not trying to say that only patented algorithms are good, or vice versa. I'm just trying to say that I will happily pay a licensing fee to use a trusted algorithm when the alternative is an algorithm that's not so trusted.
Re:May not be patent-free (Score:2)
A perfect example to back this up is the issue with RSA before the patent expired. There were other perfectly acceptable, patent-free algorithms such as Blowfish and 3DES. Yet by adopting RSA for SSL it held back the adoption of SSL enabled web servers by those that couldn't afford the license fees.
If something happens that makes these hash algorithms widely used and accepted to the degree that it's in someone's best interest to use it, the patents could become a sticking point for many people that lack the funds to purchase a licence but still need to interoperate with others that are using the hash. And, as you originally pointed out, it's only the implementations that may be patented and not the algorithm itself. But I ask, what is the difference? If I decided to implement the algorithm, how can I be sure that my implementation and the way that I think and solve problems such as this aren't infringing on someone else's patented implementation of this algorithm? It would be hard for me to do so without spending a lot of money on lawyers.
I think it is wiser to encourage people to avoid patent-encumbered or even possibly patent-encumbered algorithms in lieu of other solutions that involve less overall risk. This is just my opinion, though.
Algorithm Flaw (Score:2, Interesting)
"In the following sections, SHA-512 is described before SHA-384. That is because the SHA-384 algorithm is identical to SHA-512, with the exception of using a different initial hash value and truncating the final hash value to 384 bits."
Is it just me, or is there an inherant insecurity in this?
By truncating the final hash value, you are losing 128 bits of message digest. Now in theory I can therefore change the message content, so long as I ensure that the first 384 bits of the digest remain the same. I've just defeated the entire purpose of a secure message digest.
From my own research, using a beowolf cluster in the university where I work, anytime that you have a data set with a larger range of possible values than the size of the message digest, it is possible (but very difficult) to create two messages with the same digest value.
The entire reason I have a strong interest in this, is not just for security, but for file checksums on large downloads. The entire thing that got me started, was a downloaded Slackware ISO from an unofficial mirror, that had the correct checksum, but was hopelessly corrupt due to transmission errors close to my side. There was enough change in the ISO that by fluke chance, the MD5 checksum was identical. That is already a 512 bit checksum that was defeated, albeit in-advertantly.
Re:Algorithm Flaw (Score:4, Informative)
The MD5 checksum is 128 bit, not 512 bit. Weaknesses have been shows in it, but so far noone has been able to produce two files with the same MD5 checksum. If you have the corrupted ISO with the same checksum, you have the chance to become famous. Until I see proof I will remain rather sceptical.
Oh and about the 384 bit checksum made from a 512 bit checksum, yes of course the 384 bit checksum is weaker. Otherwise people would use it all the time. There is no reason to think that it is any weaker than a checksum giving 384 bit directly, though. It is believed that if you chop off half of the output bits of SHA-160 (the old SHA) you will have an 80-bit checksum with no weaknesses except for brute force.
Re:Algorithm Flaw (Score:3, Insightful)
To do this would require trying an impossible amount of random message texts, to find one that hashed the same. Each message (of whatever length) has approximatly a 2^(-384) chance of being the same specific hash output. That is about 3940200619639447921227904010014361380507973927046
If you COULD do what you suggest (more easily than by trying 2^n calculations, for n>112, typically), than just about all cryptographic protocols in use today would probably crumble.
But you are correct, a 384 bit hash that was truncated from 512 is almost certainly less secure, but still impossible to spoof unless a shortcut to brute force searching is ever found.
SHA-1 is 160 bits, and considered more secure by design than MD5 (which is faster), but no one has even found any practical way to spoof MD5 messages (as far as I know).
Your "corrupt" iso did not have same MD5 sum as the uncorrupted image, by any fluke. That is simply impossible. More likely there was something else going on.
And yes, I do mean impossible... I'd bet ~2^120 dollars to your $10 on it (if I had it).
Re:Algorithm Flaw (Score:2)
I can't figure your numbers, though. Assuming 700 Megabytes max, for all possible byte lengths, I get 2^((sum of 1 to 1024*1024*700) * 8) possible CDs ~ 2^(2 billion billion) possible CD contents (american billion, btw). I assume you dropped out the *8. However, I don't know where you get the 2^(5 billion) matches.
Since there are 2^128 total MD5 sums, the total number of CDs divided into that (comparatively) small amount, means each hash value has nearly 2^(2 billion billion) CDs that will hash to it (ie. subtracting 128 from 2 billion billion doesn't really change it.)
So yes, for any specific CD there are an enormous amount of other possible CDs that could produce the same MD5 hash, and yet it is nigh-impossible to actually find such a match.
LIAR is kind of strong (Score:2)
Re:LIAR is kind of strong (Score:1)
The person putting the ISO together screwed up and didn't test it, and so computed a valid md5 hash on a corrupt ISO.
It isn't clear that the md5 hash he had matched an evetual working ISO, or if it just matched the md5 value listed with the bad ISO.
Re:Algorithm Flaw (Score:3, Informative)
It's just you. After the Dobbertin attack on MD5, pretty much everyone with a brain moved to SHA-1. This created some difficulties for apps which were hardcoded to only accept 128-bit hashes, but the accepted method of replacing MD5 in that case was, and remains, to use the first 128 bits of an SHA-1 hash. There is no inherent insecurity here for any well-designed hash function.
A well-designed hash function is one whose output is indistinguishable from random noise. To see what I mean, let's say I have a one-bit hash function. You put in your message and you get out either a 1 or a 0. You can't predict whether a certain message will result in a 1 or a 0, though, unless you actually go through the entire process of hashing.
Now let's say I take a SHA-1 hash of the same message and discard 159 bits. I'm left with
Truncating a trusted hash to a shorter hash length is every bit as trusted as using a hash which was designed for that shorter length in the first place.
The converse is unequivocally not true. One way people like to create larger hashes is to first hash the data, then append the hash to the data and re-hash, then concatenate the two hashes together. It works, yes, but it's not an effective method. If you're using MD5, for instance, there are no more than 128 bits of entropy in your hash--even if you chain MD5 operations together to get a longer hash, you're still limited to 128 bits of entropy. If you want a longer hash size than the one you currently have, your choices are (a) chain your current hash, which will get you the size you need but at no additional security, or (b) use a different hash which is designed for the additional size.
The entire thing that got me started, was a downloaded Slackware ISO from an unofficial mirror, that had the correct checksum, but was hopelessly corrupt due to transmission errors close to my side.
If so, you're the luckiest person ever likely to walk the earth. The odds of any two messages hashing out to identical values are 1 in 3.4 * 10**38. The odds of this happening are 100,000,000,000,000,000,000,000,000,000 Schiffers to 1 against. That's steep.
(What's a Schiffer? Well, it's simple... out of the roughly 10**9 men on Earth, Claudia Schiffer is only sleeping with one of them--her husband. The likelihood that any given guy will be sleeping with Claudia Schiffer is roughly one in a billion. If you tell someone "the odds of this are one in 10**38", they'll just look at you blankly. If you tell them that it's 1<28-zeros>0 times more unlikely than them boffing Claudia Schiffer tonight... that, guys understand.)