Exploiting Network Captures For Truer Randomness
First time accepted submitter ronaldm writes "As a composer who uses computers for anything and everything from engraving to live performance projects, it's periodically of some concern that computers do exactly what they're supposed to do — what they're told. Introducing imperfections into music to make it sound more 'natural' is nothing new: yet it still troubles me that picking up random data from /dev/random to do this is well, cheating. It's not random. It bugs me. So, short of bringing in and using an atomic source, here's a way to embrace natural randomness — and bring your packet captures to life!"
What universe does this guy live in? (Score:2)
computers do exactly what they're supposed to do — what they're told.
90% of my day job is a bunch of engineers standing around scratching our heads trying to brainstorm ways to figure out what the hell is going on with our system. We don't even know what it is doing, let alone being able to tell it what to do.
Re: (Score:2)
As a function of their programming computers will always do what they are told to do - to suggest otherwise also suggests that computers have some form of intelligence.
The ocean does not do what it is told, but it is not intelligent.
Re: (Score:2)
Are you sure about those two statements?
However, you do have a point... while computer theoretical models do exactly what they're told to do, as soon as you introduce a physical implementation, the computer will do whatever its environment tells it to do -- this is not always the same thing as what the computer operator tells it to do.
Similarly, the ocean does exactly as it is told to do... of course, this interaction is so complex, that a mere human being would be unable to untangle all of the instructions
Re: (Score:2)
The ocean does not do what it is told
I am 100% certain that each particle within the ocean follows one very simple rule. They all follow the path of least resistance. Each and every one of them.
Re: (Score:2)
No, I am sorry to tell you, that you are wrong.
Computers will not always do what you tell them. Sometimes they instead do something else.
This is what you call a bug.
If this happens you will stand there, scratching your head, and try to figure out what is going on.
Your argument, that this is only because the programmer told them to do that, is flawed. It could be a hardware bug. In that case it is because the hardware engineer "told" the computer to do it. Or it could be a broken part, in which case the manu
Re: (Score:2)
Which still has the effect that it does not do what the user told it to do.
To the user, the computer is a package of physics, hardware and software. To him it does not matter at which of these points it fails, the effect is the same.
Re: (Score:2)
More likely a Media Access Control address...
Re: (Score:2)
90% of my day job is a bunch of engineers standing around scratching our heads trying to brainstorm ways to figure out what the hell is going on with our system. We don't even know what it is doing, let alone being able to tell it what to do.
Oh, you work for Microsoft?
No I work in a place where our internal library has a copy of I sing the body electronic [amazon.com] and we all laugh about it knowingly.
Re: (Score:2)
While complex, distributed systems may be deterministic, it's hard to prove that they are. The systems I have seen are on the bleeding edge of maintainability. They sit on that edge because customers and marketing demand certain functions, while the producer of the software demands features which increase market share and product definition. Engineers have very little say in the matter.
Re: (Score:2)
If you have more than one thread running and they interact, you no longer necessarily have a deterministic system. If you distribute it, it gets even worse. Even in small embedded realtime systems you can get non-deterministic behaviour.
Re: (Score:2)
By interact I mean access a shared resource. (memory or other)
By distribute I mean distribute over multiple processors.
Isolated data is not a shared resource.
Synchronisation is normally used to provide consistency of the shared resource but not often used to provide overall ordering with regards to accessing it. Hence non determinism. If you can't tell which thread gets to access the shared resource and in which order, the program is no longer deterministic.
non-determinism of this type isn't inherently bad
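A minimal Python sketch of the shared-resource point, assuming two hypothetical worker threads named A and B (nothing here comes from the article). The lock keeps the shared list consistent, but nothing pins down the order in which the threads reach it, so the interleaving may differ from run to run.

```python
import threading


def run_once(n_items: int = 1000) -> list:
    """Two threads append to one shared list; the lock protects it, not the order."""
    shared = []
    lock = threading.Lock()

    def worker(name: str) -> None:
        for i in range(n_items):
            with lock:  # guarantees consistency of the list, not who goes first
                shared.append(f"{name}{i}")

    threads = [threading.Thread(target=worker, args=(name,)) for name in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


if __name__ == "__main__":
    log = run_once()
    # The contents are always the same; the interleaving of A and B entries is not guaranteed.
    print(len(log), log[:10])
```

Every run returns the same set of items, and each thread's own items stay in order. Only the interleaving between A and B is left to the scheduler, which is the kind of non-determinism described above.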
Random (Score:5, Insightful)
The imperfections in music aren't perfectly random either, so what's the big deal?
Re:Random (Score:5, Insightful)
Most insightful comment on this story. Period.
Even if we could get perfect randomness in our art, it wouldn't really matter because the humans who see it or hear it will just try to impose some order on that randomness. It's what we do.
Instead of randomness, what I seek to add to my sounds in the music I make is complexity. That's what makes for a rich sound.
For example, if you look at the harmonics in a struck piano string or plucked guitar or bowed violin, they appear at predictable places. Now look at the harmonics in a free reed instrument, such as a chromatic harmonica. All sorts of weird places, strange ratios. It's what gives the chromatic such a distinctive, heart-rending timbre. Listen to the album Affinity by Bill Evans and Toots Thielemans and you can see why Evans decided to record his masterwork with a "trivial" instrument like the chromatic harp. It's basically a shaped noise generator with pitch.
Similarly, listen to the digital sound used in "Sky Saw" in Brian Eno's Another Green World album. A simple waveform made extremely complex using god knows what filthy circuitry and it feels like someone is sticking the motor from a pair of hair clippers up your butt (not that I would know what that feels like since I would never, ever do such a thing since I turned 40).
It can be easy, or hard, using pseudo-random algorithms in MaxDSP but it's the complexity that makes the sound do its business. Except when it's simple, like a flute which is basically a sine wave. Oh never mind. I hate thinking about this stuff. It's a waste of time and I left my days as a theorist behind me. I'll let the young guys like the lad in the article worry about how pure the randomness is in the sounds he uses. It'll keep him occupied until inspiration comes along.
Re: (Score:2)
Whether it's Sly and Robbie or George Shearing (sorry new music fans - I really don't like most modern crap), it's that slight movement ahead or behind the beat, and the control of it that adds emotion and a certain thrill to a tune.
On the harmonica - give me Larry Adler any day :-P
Re: (Score:2)
Of course, Adler is terrific. The album he made with Sir George Martin producing of Gershwin tunes is just spectacular (especially "But Not for Me" with vocal by Elvis Costello - yes, you read that right).
But you have got to hear Hendrik Meurkens. He's a German (or maybe Belgian?) chromatic harmonica player who is wonderful. I have to be careful how much I listen to his recordings or I'm liable to toss my rather expensive Gregoire Maret chrom right out
Re: (Score:2)
Yes it is.
Re: (Score:2)
For crypto, you need *Perfect* random indeed, but for music, a pseudorandom generator should surely be enough?
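For the musical use case a seeded pseudo-random generator is arguably a feature, since the same seed reproduces the same "performance". A rough sketch, assuming note onsets in milliseconds and a hypothetical helper called humanize (neither comes from the article):

```python
import random


def humanize(onsets_ms, jitter_ms=8.0, seed=None):
    """Nudge each note onset slightly ahead of or behind the grid.

    jitter_ms is the standard deviation of the Gaussian offset; the value
    8.0 is an arbitrary illustration, not a recommendation.
    """
    rng = random.Random(seed)  # Mersenne Twister: fine for music, not for crypto
    return [max(0.0, t + rng.gauss(0.0, jitter_ms)) for t in onsets_ms]


if __name__ == "__main__":
    grid = [0, 500, 1000, 1500]  # straight quarter notes at 120 bpm
    print(humanize(grid, seed=42))
    print(humanize(grid, seed=42))  # identical output: the take is reproducible
```

Passing seed=None lets Python seed the generator from the operating system instead, which gives a different take on every run.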
Re: (Score:2)
The imperfections in music aren't perfectly random either, so what's the big deal?
http://dilbert.com/strips/comic/2001-10-25/ [dilbert.com]
How is that more random? (Score:3)
The vast majority of traffic is either html or email. Very structured data. It's sufficiently random to use for a video game or the like, but it's definitely not random from a cryptography point of view. So you're doing things the hard way with no discernible benefit. Total waste of time.
Comment removed (Score:5, Insightful)
Re: (Score:2)
Well, to be fair, he's taking the packet checksums, but in theory those could be predictable as well. They probably wouldn't be "ordered", however.
Re: (Score:2)
The packet checksums are about as non-random as you can get. There is no timing/jitter/... at all in these! This is really stupid. Even /dev/urandom is of far, far superior quality.
I strongly advise the OP to actually try to understand the issue before posting such utterly clueless nonsense.
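To make the point concrete: a checksum is a deterministic function of the packet bytes, so it carries no more unpredictability than the packet itself. Below is a sketch of the 16-bit ones'-complement Internet checksum described in RFC 1071. Whether this is exactly the checksum field the article reads from its captures is an assumption.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # sum big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


if __name__ == "__main__":
    payload = bytes.fromhex("0001f203f4f5f6f7")
    # Same bytes in, same checksum out, every time.
    print(hex(internet_checksum(payload)), hex(internet_checksum(payload)))
```

Anyone who can see or guess the packet contents can recompute the value, which is why it makes a poor entropy source on its own.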
/dev/random (Score:3, Informative)
This seems like a fairly lame variant of the environmental entropy gathering which *is* what /dev/random does...
Mod parent up (Score:4, Interesting)
/dev/random is already gathering environmental entropy from hardware sources and (except if you're running it on a virtual machine), it should produce data with good entropy that's truly random and is not coming from a pseudo RNG algorithm.
Now, of course, if you XOR it with the network data you might increase entropy, but if it happens that /dev/random already uses it, you're not gaining anything, or may in fact make things worse.
But, please, if you think that /dev/random isn't providing data that's random enough, suggestions and patches would be welcome. Even if they don't get accepted in the mainline kernel, you can still distribute them.
Another issue: I'd encrypt the data from the network source or XOR it with a pseudo RNG, because otherwise you might be leaking sensitive data through your "random" numbers.
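One way to act on that last suggestion is to run each captured packet through a keyed hash before using it, so the raw payload never shows up in the output. This is a sketch that swaps in HMAC-SHA256 for the "encrypt or XOR" idea; the whiten helper and the throwaway key are assumptions for illustration, not anything from the article.

```python
import hashlib
import hmac
import os


def whiten(packet: bytes, key: bytes) -> bytes:
    """Return a 32-byte keyed digest of the packet instead of the raw bytes."""
    return hmac.new(key, packet, hashlib.sha256).digest()


if __name__ == "__main__":
    key = os.urandom(32)  # throwaway secret key from the OS generator
    packet = b"GET /private/page HTTP/1.1\r\nCookie: session=abc123\r\n\r\n"
    print(whiten(packet, key).hex())  # no readable trace of the cookie
```

This hides the payload but adds no entropy: the output is only as unpredictable as the key and the packets that went in.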
Re: (Score:2)
Another issue: I'd encrypt the data from the network source or XOR it with a pseudo RNG, because otherwise you might be leaking sensitive data through your "random" numbers.
I bet everyone was wondering why all of his music sounded like bad Internet porn videos lately...
Re: (Score:2)
Indeed. And of much, much lower quality. The OP is a clueless hack.
Re: (Score:2)
No. Just really pissed off at this BS getting a /. story. The editors should know better.
I do detect some OC behavior on your part though...
I thought /dev/random already looked for entropy.. (Score:2)
Re: (Score:3)
It does. And a simple "man 4 random" will give you that information. It seems the OP could not even be bothered to do that before posting his clueless BS.
Re: (Score:2)
Indeed.
Why not atomic? (Score:3)
The lavarnd.org folks have all the source you need and a reference implementation that literally is a webcam stuffed in a dark can. When you can get such high quality entropy for less than US $30, it seems like anything else must just be for fun. Some opaque tape over the camera on many laptops should work fine too.
not nearly as "random" as /dev/random (Score:5, Informative)
/dev/random on most OSes these days uses an entropy pool generated from a bunch of different sources - timing of keystrokes, mouse movements, disk seeking - and yes, network information. Then it uses cryptographic hashes on those.
Your implementation basically uses one of those entropy sources, and then doesn't even hash it...
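As a toy illustration of "several sources, then a hash" (emphatically not how any real kernel implements its pool), the sketch below feeds a few samples into SHA-256. The mix_sources helper and the sample values are made up for the example.

```python
import hashlib


def mix_sources(*samples: bytes) -> bytes:
    """Hash several entropy samples together into one 32-byte value."""
    h = hashlib.sha256()
    for s in samples:
        # Length prefix keeps sample boundaries unambiguous:
        # (b"ab", b"c") must not hash the same as (b"a", b"bc").
        h.update(len(s).to_bytes(4, "big"))
        h.update(s)
    return h.digest()


if __name__ == "__main__":
    keystroke_timing = (1234567).to_bytes(8, "big")  # hypothetical sample
    disk_timing = (7654321).to_bytes(8, "big")       # hypothetical sample
    packet_bytes = b"\x45\x00\x00\x54"               # hypothetical sample
    print(mix_sources(keystroke_timing, disk_timing, packet_bytes).hex())
```

The hash means a guessable input (such as packet contents) does not show through in the output, as long as at least one of the other inputs is hard to predict.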
Re:not nearly as "random" as /dev/random (Score:5, Insightful)
In brief:
"The generation of random numbers is too important to be left to chance."
Anyone trying to create a new random number generator with the intent of producing more random numbers, without an extensive and specialized education, is guaranteed to fail.
Re: (Score:2)
In addition, the problem is solved and there is absolutely no need for a "new random number generator". None at all. What people get consistently wrong is the use of the RNGs that are there, are implemented well and do work well. The RNGs are completely fine.
What there also is a need for is for people to READ THE F****** DOCUMENTATION before putting complete and utter nonsense as "story" on /. !
Re: (Score:2)
/dev/random on most OSes these days uses an entropy pool generated from a bunch of different sources - timing of keystrokes, mouse movements, disk seeking - and yes, network information. Then it uses cryptographic hashes on those.
Your implementation basically uses one of those entropy sources, and then doesn't even hash it...
As I remember, OpenBSD used network details to produce entropy, but later stopped, because it allowed a remote attacker the ability to potentially poison the entropy source by carefully sending just the right packets at the right time. Cryptographically secure randomness for Theo de Raadt was only satisfactory when it required physical access to the machine to manipulate.
Re: (Score:2)
As I remember, OpenBSD used network details to produce entropy, but later stopped, because it allowed a remote attacker the ability to potentially poison the entropy source by carefully sending just the right packets at the right time. Cryptographically secure randomness for Theo de Raadt was only satisfactory when it required physical access to the machine to manipulate.
Something's wrong or lost in communication here. The entropy pool in a /dev/random implementation is designed so that even if an attacker
Re: (Score:2)
Something's wrong or lost in communication here. The entropy pool in a /dev/random implementation is designed so that even if an attacker can add a known source of numbers to it, it still doesn't decrease the real entropy in the pool. As long as my entropy estimates are correct, I could let you pick half the bits (or 99% of the bits) going into /dev/random's entropy pool and that still wouldn't help you guess the output.
Yes, but in a server most often there is no keyboard or mouse involved. So, the machines get the vast majority of their entropy from the network.
And we're talking about Theo de Raadt here... it doesn't have to be a RATIONAL threat, it just has to be a theoretical one.
Going to the source [openbsd.org]:
The OpenBSD kernel uses the mouse interrupt timing, network data interrupt latency, inter-keypress timing and disk IO information to fill an entropy pool.
So, they do use "network data interrupt latency", but not the time between sequential packets, or packet data, or anything that a remote attacker could control.
Re: (Score:2)
And we're talking about Theo de Raadt here... it doesn't have to be a RATIONAL threat, it just has to be a theoretical one.
But that's the point, as long as you set its entropy count to zero it's not even a theoretical threat. It could potentially improve randomness and can't possibly hurt. That's how entropy pools are designed.
The OpenBSD kernel uses the mouse interrupt timing, network data interrupt latency, inter-keypress timing and disk IO information to fill an entropy pool.
That makes more sense than i
Re: (Score:2)
There is nothing missing. You can only estimate entropy, and for that you need to make assumptions. What gets broken if an attacker controls the network traffic is the relevant assumptions. With the border condition that an attacker can control all network traffic, the only valid assumption is an entropy content of exactly zero, so you can drop it from entropy gathering.
Re: (Score:2)
the only valid assumption is an entropy content of exactly zero, so you can drop it from entropy gathering.
This is the part that's nonsensical. The usual course of action with something that's relatively high volume and probably contributes entropy but possibly is under attacker control is to lower the estimated entropy count to zero but continue mixing the source into the pool. The worst-case scenario is no gain (but no loss), but it's likely you get some gain and it hedges against accidental overestima
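Both sides of this argument can be seen in a toy model. The sketch below is a hypothetical ToyPool class, not any real kernel's design: untrusted input is mixed into the state with a credit of zero bits, and reads are refused until enough credited entropy has arrived.

```python
import hashlib


class ToyPool:
    """Toy entropy pool: hash-based mixing plus a separate entropy estimate."""

    def __init__(self) -> None:
        self._state = bytes(32)
        self.credited_bits = 0

    def mix(self, data: bytes, credit_bits: int = 0) -> None:
        # Mixing always changes the state; the estimate only grows by the credit given.
        self._state = hashlib.sha256(self._state + data).digest()
        self.credited_bits = min(256, self.credited_bits + credit_bits)

    def read(self, nbytes: int) -> bytes:
        # Supports up to 32 bytes per call (one SHA-256 digest).
        need = nbytes * 8
        if need > self.credited_bits:
            raise BlockingIOError("not enough credited entropy")
        self.credited_bits -= need
        out = hashlib.sha256(b"out" + self._state).digest()[:nbytes]
        self._state = hashlib.sha256(b"next" + self._state).digest()
        return out


if __name__ == "__main__":
    pool = ToyPool()
    pool.mix(b"packet contents an attacker may control", credit_bits=0)
    pool.mix(b"interrupt timing sample", credit_bits=64)  # hypothetical estimate
    print(pool.read(8).hex())
```

In this model, zero-credit input cannot lower the estimate and still perturbs the state, but it also never makes a blocking read return any sooner.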
Re: (Score:2)
But you cannot get entropy that is there but estimated as zero out of the pool! When reading speed from /dev/random is concerned, this does exactly nothing. Also it does exactly nothing for the amount of other entropy you have to get. So, even though it is hard to understand, you can drop it with no adverse effects and a reduction in code complexity on the plus side.
Entropy gathering is not a guessing game, if the quality needs to be high. There is no "hedging" involved when this is done right. The estimate
Re: (Score:2)
/dev/random on most OSes these days uses an entropy pool generated from a bunch of different sources - timing of keystrokes, mouse movements, disk seeking - and yes, network information. Then it uses cryptographic hashes on those.
Your implementation basically uses one of those entropy sources, and then doesn't even hash it...
As I remember, OpenBSD used network details to produce entropy, but later stopped, because it allowed a remote attacker the ability to potentially poison the entropy source by carefully sending just the right packets at the right time. Cryptographically secure randomness for Theo de Raadt was only satisfactory when it required physical access to the machine to manipulate.
And he has it exactly right. Even network timing is suspicious. Network packet content is almost completely non-random and out as a source. And other sources are suspect as well. For example keyboard input timing only has a 70ms resolution (I measured this on two different keyboards as scan-delay), so gives you probably less than 1 bit of entropy per key pressed. Mouse movements are better, as you can use the absolute positions, but they still need to be used very conservatively.
Confusion... (Score:4, Insightful)
/dev/random is about as random as you'll get. I presume your issue is that the pool is exhausted for the given desire. /dev/urandom is your endless supply of 'good-enough' random for something like this. If your criticism is that it isn't really 'random', it's no less random than your pcap stream. Besides, given the application 'true' randomness will not be distinguishable from good pseudo-random.
If you wanted to be random and artistic, then maybe point a webcam at a fireplace or something as an entropy source.
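For anyone who just wants the non-blocking source from a script, Python's standard library wraps the operating system generator directly. A minimal sketch; the helper names and the MIDI-style velocity range are assumptions made up for the example.

```python
import os
import secrets


def random_bytes(n: int) -> bytes:
    """n bytes from the operating system's CSPRNG (non-blocking in normal use)."""
    return os.urandom(n)


def random_velocity(lo: int = 64, hi: int = 96) -> int:
    """Pick an integer in [lo, hi) using the same OS-backed source."""
    return lo + secrets.randbelow(hi - lo)


if __name__ == "__main__":
    print(random_bytes(16).hex())
    print([random_velocity() for _ in range(8)])
```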
Re: (Score:3)
Or just grab a computer with a VIA Nano CPU - they have a built-in true random-number generator, based on thermal and/or electric variances inside the processor. They claim "up to 1600 kilobits per second", so it should provide more than enough for music, provided you aren't adding bit-for-bit random noise in real time.
Re: (Score:2)
If anything, that's not a bad thing though. It means they're both "random enough".
Re: (Score:3, Informative)
First and foremost, Slashdot (as you know) unfortunately chooses the URL for your particular story. "Truer[sic] Randomness" is not in fact what I'm going out to somehow magically solve (with my absolute non-background in cryptography etc.). As to why they chose to enter the title of the story as such - I don't know. A bit of sensationalism, perhaps? In any case, I'd originally titled this "Musical
Re: (Score:3)
Sorry that you feel like your corn flakes have been pissed in, but you can't go blaming this on the editor's bad choice of headlines. Your own submission says "[...] yet it still troubles me that picking up random data from /dev/random to do this is well, cheating. It's not random. It bugs me." Then you go on to describe a mechanism that's far, far less random than /dev/random or any halfway decent pseudo-random number generator.
Your blog post doesn't actually try to say that the network captures are ra
Re: (Score:3)
Then, I'm XORing the five sources together, which produces a stream with entropy good enough to satisfy my randomness needs.
I tell you all that you do not want to read about parent's "needs". Disturbing!
Randomness is not an objective thing (Score:2)
Something is random if you don't have the information to predict it. Distinguishing between "false" and "true" randomness is pointless.
Re: (Score:2)
True: So random that the information to predict it does not exist anywhere, even if you had a hypercomputer and knew the positions of every particle in the universe down to the limits of uncertainty.
Re: (Score:3)
Distinguishing between "false" and "true" randomness is pointless.
Not really, it's done all the time for many different purposes.
Take, for example, how computer scientists define it: roughly, a sequence is random if it can't be compressed, that is, any (program+data) that generates it must be at least as large as the sequence itself. It distinguishes between "random" and "not having enough information to predict it": it doesn't matter if it looks random to YOU; if it could in principle be compressed, it's not random.
That's not pointless hair splitting, it has real conse
Re: (Score:2)
The problem with your theory is that a truly random source can and will generate compressible data sometimes, or else it isn't a fucking truly random source.
This is why great minds have coined phrases like "Random numbers are too important to be left to chance" (Robert Coveyou)
In practice people want certain constraints, such as a guarantee not to generate a sequence of 10000 zeros.
"For your convenience we have generated a random PIN for you. Your randomly generated PIN is 0000. Please do
Re: (Score:3)
It's not my theory; maybe you heard about a guy named Kolmogorov that lived in the last century? I bet the great mind of Robert Coveyou studied a lot of his theory :).
But, more seriously, of course a random source will output compressible data sometimes. What happens is this: as you collect more output from a truly random source, the probability of it being compressible goes to zero very fast.
But the point is that it *is* useful to distinguish between "false" and "true" randomness, otherwise it wouldn't be
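The "goes to zero very fast" claim follows from a standard counting argument, sketched here under the usual assumptions: K(x) is the Kolmogorov complexity of x, descriptions are binary strings, and x is drawn uniformly from the n-bit strings. There are fewer than 2^(n-k) descriptions shorter than n-k bits, so at most that many strings can be compressed by more than k bits:

```latex
\#\{\, p \in \{0,1\}^{*} : |p| < n-k \,\}
  \;=\; \sum_{i=0}^{n-k-1} 2^{i}
  \;=\; 2^{\,n-k}-1 \;<\; 2^{\,n-k},
\qquad\text{so}\qquad
\Pr_{x \in \{0,1\}^{n}}\bigl[\, K(x) < n-k \,\bigr]
  \;<\; \frac{2^{\,n-k}}{2^{\,n}} \;=\; 2^{-k}.
```

Taking k to be a fixed fraction of the length, say k = n/100, the chance that a uniformly random n-bit string compresses by even 1% is below 2^(-n/100), which shrinks exponentially as more output is collected.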
Re: (Score:2)
It's not my theory; maybe you heard about a guy named Kolmogorov that lived in the last century?
You think I'm a noob, doncha?
What happens is this: as you collect more output from a truly random source, the probability of it being compressible goes to zero very fast.
Kolmogorov invented new terminology because he knew that the terms 'random' and 'entropy' didn't fit with his work on describing sequences. That's why his theory is called 'Kolmogorov Complexity' ..... Complexity being short for 'algorithmic entropy' .. not simply entropy.. not simply randomness or stochastic.. specifically algorithmic entropy, aka complexity.
This may seem like splitting hairs to you, but it isn't because this is a technical subject. There is a reason that new
Re: (Score:2)
"Random" and "entropy" are already used in computer science with the meanings you seem to not want them to have (like [wikipedia.org] this [wikipedia.org]). Maybe someone should complain to the president of computer science. (I'm sorry, I couldn't resist. This discussion is too silly.)
Re: (Score:2)
How compressible is:
4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4?
http://xkcd.com/221/ [xkcd.com]
Interestingly, my original version of this comment (with lots more 4's) threw this error:
Your comment violated the "postercomment" compression filter. Try less whitespace and/or less repetition.
Yet another use of the methods that you mention!
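The question is easy to poke at with a general-purpose compressor. A quick sketch using zlib; note that zlib only gives a crude upper bound on how short a description can be, so this is a stand-in for Kolmogorov complexity, not a measurement of it.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (data must be non-empty)."""
    return len(zlib.compress(data, 9)) / len(data)


if __name__ == "__main__":
    fours = b"4," * 2000       # the sequence above, with lots more 4's
    noise = os.urandom(4000)   # bytes from the operating system generator
    print(f"repeated 4s: {compression_ratio(fours):.3f}")
    print(f"os.urandom : {compression_ratio(noise):.3f}")
```

The repeated 4's shrink to a tiny fraction of their size, while the OS bytes typically come out slightly larger than they went in because of the container overhead.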
Re: (Score:2)
Well, by this definition everything is random as the Universe is governed by quantum mechanics.
WTF? Random??? (Score:2)
"As a composer who uses computers for anything and everything from engraving..."
What kind of "composer" does engraving, and why does he need a random number generator? And yeah, I read TFA, and it had nothing about applications.
Re: (Score:2)
http://en.wikipedia.org/wiki/Music_engraving [wikipedia.org]
Random is as random does (Score:2)
http://www.random.org/faq/ [random.org]
Q2.1: How can you be sure the numbers are really random?
Oddly enough, it is theoretically impossible to prove that a random number generator is really random. Rather, you analyse an increasing amount of numbers produced by a given generator, and depending on the results, your confidence in the generator increases (or decreases, as the case may be). This is explained in more detail on my Statistical Analysis page, which also contains two studies of the numbers generated by RANDOM.OR
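As an example of the kind of statistical check being described, here is a sketch of the simplest one, the frequency (monobit) test in the style of NIST SP 800-22. It is one test among many and only an illustration; it is not claimed to be what RANDOM.ORG itself runs.

```python
import math
import os


def monobit_p_value(data: bytes) -> float:
    """Frequency test: are ones and zeros roughly balanced?

    Returns a p-value; very small values suggest the bits are biased.
    """
    n = len(data) * 8
    ones = sum(bin(b).count("1") for b in data)
    s_obs = abs(2 * ones - n) / math.sqrt(n)  # |#ones - #zeros| / sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


if __name__ == "__main__":
    print(monobit_p_value(os.urandom(4096)))  # varies from run to run
    print(monobit_p_value(bytes(4096)))       # all zero bits: p-value of 0.0
```

A passing result proves nothing on its own: the perfectly regular pattern 0101... also passes this test, which is why confidence only builds up across many different tests and lots of data.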
Re: (Score:3)
http://www.random.org/faq/ [random.org]
Q2.1: How can you be sure the numbers are really random?
Oddly enough, it is theoretically impossible to prove that a random number generator is really random.
http://dilbert.com/strips/comic/2001-10-25/ [dilbert.com]
Define cheating (Score:2)
"bring your packet captures to life!" (Score:2)
That explains why my packets disappear when they have too many neighbors.
Seriously? (Score:2)
Oh... Ronald. I'm sorry dude.
Minor problem (Score:2)
Most of my network traffic involves downloading porn.
Your music is going to come out sounding like a strip club.
Impossible to prove true randomness exists. (Score:2, Insightful)
Let U be the universe that you believe in, and let R be the source of true randomness for that universe. Then the universe that you believe in is U(R).
Let R' be one of the pseudo random algorithms that is too computationally complex for you to detect. However computationally advanced you are there will be an infinite number of these.
It will be impossible for you to prove that the real universe is not one of the U(R'). Occam's razor is on
Re: (Score:2)
There is no randomness, it is you who must be random.
Re: (Score:2)
It's impossible to prove anything at all, aside from abstract mathematics and "I think therefore I am" and such. But we shouldn't let philosophy and arguments about human conventions get in the way of the fact that Occam's Razor and the acceptance of unprovable theories is actually incredibly useful.
Better ways to do random (Score:3)
What do I do? If I don't really care if it's random, I use the RNG from the programming language I'm using, or /dev/random. If I really, really care that it's random, I download a chunk of data off random.org, and either use that for the numbers, or use it to seed my RNG. For the most part, anything more than that is overkill.
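The "seed my RNG" step might look like the sketch below in Python. It assumes the seed bytes come from os.urandom as a stand-in; bytes downloaded from random.org would be passed in the same way, and the download itself is not shown. The seeded_rng helper is hypothetical.

```python
import os
import random
from typing import Optional


def seeded_rng(seed_bytes: Optional[bytes] = None) -> random.Random:
    """Build a PRNG seeded from externally supplied bytes (or from the OS)."""
    if seed_bytes is None:
        seed_bytes = os.urandom(32)  # stand-in for a downloaded chunk of random data
    return random.Random(int.from_bytes(seed_bytes, "big"))


if __name__ == "__main__":
    rng = seeded_rng()
    print([rng.randint(1, 6) for _ in range(10)])
```

Keeping the seed bytes around makes a run repeatable later. This is fine for music, games and simulations, but a Mersenne Twister seeded this way is still not suitable for cryptographic keys.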
Complete BS (Score:2)
The OP is clueless. /dev/random has full entropy and is random. /dev/urandom is the watered-down version, which still has some entropy in it.
There is no need for "better randomness". There is need for people to find out what actually exists and is implemented and use it properly.
Re:If I would (Score:5, Informative)
Actually, many people would sell you the answer. And they don't have nobel-prices[sic].
See http://en.wikipedia.org/wiki/Hardware_random_number_generator [wikipedia.org] for an overview of the devices you're looking for.
Re: (Score:2)
Those using quantum effects cannot be predicted even if you had a device to monitor the complete surroundings.
Re: (Score:2)
Still not random. If you can (and I am glad to admit this is impossibly hard as far as I know) capture the 'surroundings' one on one, this is still not random enough. But still a good read and link.
'Capturing the surroundings' still won't help you do any predictions for sources with quantum randomness. At best you can say that a source would exhibit a certain behaviour x% of the time. Quantum systems are not deterministic so even with perfect state information, you can only give probabilities that certain things happen. If you know otherwise, feel free to let others know and collect your nobel prize(s).
Re: (Score:2)
Still not random.
A true Nobel prize awaits your proof that quantum randomness is not truly random. We are in awe.
Re: (Score:2)
That's rather a random task to take on as an odd job.
Re: (Score:2)
You will be awarded a nobel-price.
A nobel-price? What is that, the cost of a stick of dynamite?
Re: (Score:2)
What utter nonsense. There is nothing to solve here and plenty of sources for true randomness available.
Re: (Score:2)
You can re-seed them anytime if you're paranoid.
Certainly not. That's one of the worst things that you can do. Seed it once and then use it. Period.
Re: (Score:2)
But you qualified the re-seeding as 'random'