

Math Privacy Encryption

Improperly Anonymized Logs Reveal Details of NYC Cab Trips 192

mpicpp (3454017) writes with news that a dump of fare logs from NYC cabs leaked trip details because the identifying fields were obscured with an MD5 hash over input data that has a very small key space and a regular format. From the article: City officials released the data in response to a public records request and specifically obscured the drivers' hack license numbers and medallion numbers. ... Presumably, officials used the hashes to preserve the privacy of individual drivers since the records provide a detailed view of their locations and work performance over an extended period of time.

It turns out there's a significant flaw in the approach. Because both the medallion and hack numbers are structured in predictable patterns, it was trivial to run all possible iterations through the same MD5 algorithm and then compare the output to the data contained in the 20GB file. Software developer Vijay Pandurangan did just that, and in less than two hours he had completely de-anonymized all 173 million entries.
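
The attack described above can be sketched roughly as follows. This is a minimal illustration, not Pandurangan's actual code: the medallion pattern used here (digit, uppercase letter, digit, digit) is an assumed example format, and `7B42` is a made-up medallion standing in for a value read from the released file.

```python
import hashlib
import string
from itertools import product


def medallion_candidates():
    """Yield every string of the ASSUMED form digit-letter-digit-digit."""
    for d1, letter, d2, d3 in product(string.digits, string.ascii_uppercase,
                                      string.digits, string.digits):
        yield f"{d1}{letter}{d2}{d3}"


def build_lookup(candidates):
    """Map MD5 hex digest -> plaintext for every candidate."""
    return {hashlib.md5(c.encode("ascii")).hexdigest(): c for c in candidates}


# The whole key space for this pattern is 10 * 26 * 10 * 10 strings,
# so precomputing every hash takes well under a second.
lookup = build_lookup(medallion_candidates())

# Hypothetical "anonymized" field, standing in for a value from the data dump.
hashed_field = hashlib.md5(b"7B42").hexdigest()

# De-anonymizing a record is then a single dictionary lookup.
recovered = lookup.get(hashed_field)
print(recovered)
```

Because the table is built once and reused, the cost of reversing 173 million records is dominated by reading the file, not by hashing.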

  • by Opportunist ( 166417 ) on Monday June 23, 2014 @09:33PM (#47302541)

    You can contract it out to the lowest bidder without a problem. There only have to be 2 clauses in the contract:

    1) You have a GOOD ITSEC company audit the shit out of it before it goes live.
    2) If the audit reveals that the company taking the contract doesn't know jack about security, THEY will pay for the audit and THEY will improve the software until they think it's finally good enough.

    1 and 2 are repeated until 1 turns out good.

    I worked for a very long time in government. And I learned one thing: You are not supposed to know shit. You are supposed to buy knowledge.

  • by Opportunist ( 166417 ) on Monday June 23, 2014 @09:49PM (#47302639)

    True that.

    I am in the fortunate situation of having near unlimited funds. I was joking that I need a rubber stamp labeled "for security reasons", because whenever I want something, these three magic words will brush aside nearly all objections (ok, within reason, but anything 5 digits or less is nearly certainly mine if I "rubber stamp" it that way).

    I liberally peppered the most recent draft of the security procedures I wrote with "insanity", as I call it. It's a political thing. You demand stuff that you don't really want but that is so terribly obstructive to everyone else that they'll agree to what you actually want just to get the insane levels of "security" (read: obstruction and red tape) out of the way. To my unending horror (and slight amusement) they signed it off without changing a comma. Now find out how to argue why you want your own requirements out of the crap...

    The reason isn't that our board suddenly found out how much they love security or how important the confidentiality of the (considerably sensitive, I should add) private data we hold here is. What changed is simply that our government upped the fines and punishment for data breaches considerably, up to and including jail time for board members if negligence can somehow be pinned on them. In a nutshell, unless you can show that you tried to stay on top of security when holding highly sensitive data, you should prepare to take a longer vacation, all expenses paid, in a holiday resort of your government's choice.

    I guess when your ass is on the line, you get very willing to spend money.

  • by penix1 ( 722987 ) on Monday June 23, 2014 @10:21PM (#47302855) Homepage

    From TFS...

    City officials released the data in response to a public records request and specifically obscured the drivers' hack license numbers and medallion numbers...

    How many of you here have had to deal with a Freedom Of Information Act (FOIA) request, which is what a "public records request" is? I have had the pleasure over a dozen times. You have 10 days to respond to that request in my state. In some states it is even less. Failure to do so can result in stiff penalties. 10 days is hardly enough time to contract out to someone and have the job "done right".

    It means you hire knowledge and experience, you hire expert skills, and those cost money.

    And you are happy to have your taxes raised to pay those fees? Riiiight!

  • by Vellmont ( 569020 ) on Monday June 23, 2014 @11:16PM (#47303101) Homepage

    Taking MD5, it's published, and tweaking a few points (though whoever did this needs to be very competent) would have been sufficient.

    No, that would have been stupid. It's unlikely someone would have reverse engineered your hacked md5 algorithm, but it's also possible you could screw it up.

    The solution is VERY simple. Generate a random 256-bit string. Hash random-string+data, and use the output as the identifier. Throw away the random 256-bit string.
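
    A minimal sketch of that recipe, under some assumptions: SHA-256 is picked here as the hash (the post doesn't name one), and `pseudonymize` is a hypothetical helper name. The random string lives only inside the function call and is never stored or returned.

```python
import hashlib
import secrets


def pseudonymize(values):
    """Replace each value with hash(random_secret + value).

    The 256-bit secret is generated per call and discarded on return,
    so the mapping cannot be rebuilt by enumerating the input space.
    """
    secret = secrets.token_bytes(32)  # 32 bytes = 256 random bits
    return [hashlib.sha256(secret + v.encode("utf-8")).hexdigest()
            for v in values]


# Hypothetical medallion values; equal inputs still get equal identifiers
# within one release, so per-driver analysis of the data keeps working.
ids = pseudonymize(["5X55", "7B42", "5X55"])
print(ids[0] == ids[2], ids[0] == ids[1])
```

    One consequence of throwing the secret away: every field that should share identifiers has to be processed in the same call, because a second run produces unrelated identifiers.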

    Some manager probably said any work for additional security wasn't worth the cost. Ooops!

    No, some developer didn't know what the hell they were doing. You'd be surprised (but shouldn't be) how little most developers know about security, especially encryption.

  • by Opportunist ( 166417 ) on Tuesday June 24, 2014 @12:04AM (#47303327)

    Fines in a corporate world are a matter of risk management: How likely is it that it happens, what's the fine if it happens and how much do we save by not giving a damn? If this unholy trinity comes up with the "don't give a damn" on top, you don't give a damn and the fine becomes part of the operation cost. The more I get to play with C-Levels, the more I get the nagging feeling that I'm the only one weighed down by a conscience.

    Actually, I think it's more insidious. It's a blame shifting game where everyone can claim he's doing it for the "greater good", because "being bad" is actually "being good". Take the scenario where some people have to be laid off. The floor manager knows them personally. He knows every single one of them, he knows their personal life, their family situation and it really breaks his heart to let one of them go, but he knows he has to. Either he fires one of them or he might have to fire them all because they won't be profitable anymore with the new requirements, and that could lead to the shutdown of the entire branch. His superior may not know the people anymore, but he has to do it because he himself doesn't make that decision, that's been decided further up. He can't simply ignore an order from C-Level. The C's don't need to be psychopaths (though it sure helps, it seems...), they can even be compassionate, but they know that the investors will only keep their money in the company if they perform well and if the cash flow is to their liking. They can easily brush any troubles with their conscience aside when they fire a few people now, since if they didn't, their quarter figures wouldn't look nice, stock would plummet and investors would jump ship, and then they'd have to lay off even more people. But you can't even blame the investment bankers. Because they have to pick the best performing stocks, it's not their money, it's money from investors, money they put aside for their retirement, the investors have a responsibility towards the people that entrust them with their money (ok, recent history shows that most don't give a shit, but let's assume we find an investment banker with a conscience... it's just a thought experiment, remember). The people investing money don't even know WHAT they invest in, they just toss money at their investor with the order to "make more of it". And they're not "evil" either, they just want to prepare for their retirement. Those people could well be the same ones that get fired now for the sake of more profit. Essentially, they're firing themselves without knowing it.

    But I ramble.

    What this is supposed to show is that in the corporate world it's easy to play the blame shifting game and use the "but I have to!" excuse. It's sad but it seems the only escape from that game is to actually grab them by the nuts and tell them that they won't be shifting the blame anywhere. And behold, it works.

    Of course that also means that I have to watch my back or it's going to be my ass that's going to jail. But fortunately all I have to do is heed the laws. And that's easy enough, surprisingly.

  • by Buzer ( 809214 ) on Tuesday June 24, 2014 @03:57AM (#47304195) Homepage

    Salts do provide protection against that. Salts are secret if you want them to be (you can protect the plaintext salt the same way you protect your plaintext encryption keys); you only need to share them when another party has to be able to hash its own original data.

    Here are some SHA-1 hashes:

    • 4c2199828f355281e0f6eccb76d9df609f99ed0e salt+"123"
    • 458183225b77f6baff7c4c439b0ed3a5e7278e8a salt+"456"
    • ed974fc96c530639cccc9b18315396789d93a697 salt+"789"
    • f87a2fa039a20d01032f19b5852868343f3d06b9 salt+"???"

    So, how about you tell me what that last number combination is? I can give you a hint that it matches regex /^[1-9]{3}$/ (so there are only 729 possibilities). The salt is a 60-character string. If you cannot do it, then the OP's post was correct.
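
    To make the challenge concrete, here is a rough sketch of what an attacker can and cannot do. It assumes the salt is prepended as text and the result is hashed with SHA-1 over UTF-8 bytes; `demo_salt` and the target value "357" are made up for the demo and are not the salt or answer from the hashes above.

```python
import hashlib
from itertools import product


def candidates():
    """All strings matching /^[1-9]{3}$/: 9 * 9 * 9 = 729 of them."""
    for digits in product("123456789", repeat=3):
        yield "".join(digits)


def crack(target_hex, salt):
    """Try every candidate with the given salt; return the match or None."""
    for c in candidates():
        digest = hashlib.sha1((salt + c).encode("utf-8")).hexdigest()
        if digest == target_hex:
            return c
    return None


# Hypothetical salt and plaintext, invented for this demo only.
demo_salt = "hypothetical-demo-salt"
target = hashlib.sha1((demo_salt + "357").encode("utf-8")).hexdigest()

# With the salt in hand, 729 hash computations recover the value.
print(crack(target, demo_salt))
# With a wrong guess at the salt, the same search finds nothing.
print(crack(target, "wrong-guess"))
```

    The 729 candidates are trivial either way; what makes the search infeasible is having to guess a long secret salt first.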
