First 500 Terabytes Transmitted via LHC Global Grid
neutron_p writes "When the LHC Computer Grid starts operating in 2007, it will be the most data-intensive physics instrument on the planet. Today eight major computing centers successfully completed a challenge to sustain a continuous data flow of 600 megabytes per second on average for 10 days from CERN in Geneva, Switzerland to seven sites in Europe and the US. The total amount of data transmitted during this challenge -- 500 terabytes -- would take about 250 years to download using a typical 512 kilobit per second household broadband connection."
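The summary's "250 years" figure checks out as a back-of-the-envelope calculation, using only the two numbers given in the story (500 terabytes total, a 512 kilobit per second link):

```python
# Check of the summary's figures: 500 TB downloaded over a
# 512 kbit/s household broadband link.
total_bytes = 500e12            # 500 terabytes
link_bps = 512e3                # 512 kilobits per second
seconds = total_bytes * 8 / link_bps
years = seconds / (365.25 * 24 * 3600)
print(f"{years:.0f} years")     # about 248 years, i.e. roughly 250
```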
Couch potato (Score:1, Interesting)
So that makes this amount of data equal to about 46,603 hours of maximum-bitrate HDTV. Hmm, as soon as pr0n adopts it, then it will be a success, just like how the regular internet evolved.
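The HDTV figure above implies a stream rate of roughly 3 MB/s (about 24 Mbit/s); that rate is an assumption inferred from the poster's number, not something stated in the story:

```python
# Sketch of the parent's arithmetic. The ~3 MB/s HDTV rate is an
# assumed value, inferred from the 46,603-hour figure quoted above.
total_mb = 500e6                # 500 TB expressed in megabytes
hdtv_mb_per_s = 3.0             # assumed max-rate HDTV stream
hours = total_mb / hdtv_mb_per_s / 3600
print(f"{hours:.0f} hours")     # 46296 hours, close to the quoted 46603
```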
Dark Fibre (Score:5, Interesting)
Well, going outside the US is a different story. I really don't know how we connect to Europe, etc.
rr (Score:3, Interesting)
But seriously: what do you transfer then? I mean, how many Libraries of Congress do you need sitting around on disk?
Re:rr (Score:2, Interesting)
Eh, nevermind...it's a pissing contest.
BROADER band? (Score:3, Interesting)
Will this allow you to fileshare so fast that no one can even track it?? Now that would be interesting!
Seriously though, after reading the article and the miscellaneous links, the numbers were astounding! Compared to my own broadband, I can download 5 or 6 gigs in a VERY good day at most, whereas this network carried traffic of up to 50 terabytes a DAY! Woot woot! When can I hook up for it?
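The "50 terabytes a DAY" figure follows directly from the 600 MB/s sustained rate quoted in the summary:

```python
# Sanity check on the ~50 TB/day claim, using the 600 MB/s
# sustained rate from the story summary.
rate_mb_s = 600
daily_tb = rate_mb_s * 86400 / 1e6   # MB per day, converted to TB
print(f"{daily_tb:.1f} TB/day")      # 51.8 TB/day
```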
512 kb? (Score:3, Interesting)
Re:rr (Score:3, Interesting)
As for what to send, physics data works great, as that's what the story is about. For residential use, it's all about p2p.
When 10/10 is standard (VERY capped fiber), expect a nice p2p-mounted filesystem, where instead of the traditional p2p process (search, trim results, download file, wait, watch/listen to file) it will be: navigate the filesystem, find what you want (probably sorted by metadata), watch it live, and store a cache of the file so as not to waste bandwidth when watching it again, and so you can share it too.
This could be done today on large LANs (colleges, LAN parties, etc.), but no one has developed the tech. Using LUFS would make it pretty easy on Linux, though for Windows it will still be a pain, and no Windows support cuts off a lot of potential files.
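The read-through cache at the heart of that idea could be sketched like this. This is a toy illustration only: `fetch_from_peer` and the cache path are hypothetical stand-ins for whatever p2p transport and storage the real filesystem would use.

```python
import os

CACHE_DIR = "/tmp/p2pfs-cache"   # hypothetical local cache location

def fetch_from_peer(path):
    """Stand-in for a real p2p fetch; returns the file's bytes."""
    raise NotImplementedError("wire up a real p2p transport here")

def read_file(path, fetch=fetch_from_peer):
    """Serve from the local cache if present; otherwise fetch once,
    cache the result (so it can be re-watched without re-downloading,
    and re-shared to other peers), and return it."""
    cached = os.path.join(CACHE_DIR, path.lstrip("/"))
    if os.path.exists(cached):
        with open(cached, "rb") as f:
            return f.read()
    data = fetch(path)
    os.makedirs(os.path.dirname(cached), exist_ok=True)
    with open(cached, "wb") as f:
        f.write(data)
    return data
```

A real implementation would fetch and cache in chunks rather than whole files, so playback can start immediately, but the cache-or-fetch decision is the same.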
Then there's always the ability to up the density. We're already moving around HDTV DivX for TV rips, but why not do full HDTV, losslessly compressed, if we had the bandwidth? DivX is good enough to watch, but if you had the choice of getting rid of the pixelation on motion blur, why wouldn't you?
Re:Cost (Score:3, Interesting)
Re:This pales in comparison to... (Score:3, Interesting)
From TFA:
"When the LHC starts operating in 2007, it will be the most data-intensive physics instrument on the planet, producing more than 1500 megabytes of data every second for over a decade."
"Scientists working at over two hundred other computing facilities...will access the data via the Grid."
Re:Dark Fibre (Score:3, Interesting)
Re:woooow.... (Score:3, Interesting)
Or am I plain wrong? Links anyone?
Re:Dark Fibre (Score:5, Interesting)
Re:Not really. (Score:4, Interesting)
600 Megs a second. I'd be interested in seeing what sort of disk technology can handle that level of throughput. They must have some amount of buffering going on, hand in hand with the bonus that they're probably able to just stream the data to arrays of disks without really being too concerned about placement (I'm assuming the data transfer is essentially a sequential stream of data, not sodding great numbers of small files, of course).
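One way to get there is striping across an array; the per-disk rate below is an assumed 2005-era figure for sequential writes, not something from the article:

```python
import math

# How wide a stripe does 600 MB/s need? Assumes ~60 MB/s sustained
# sequential writes per disk (a plausible 2005-era figure).
target_mb_s = 600
disk_seq_mb_s = 60
disks = math.ceil(target_mb_s / disk_seq_mb_s)
print(disks)  # 10 disks, before any redundancy or headroom
```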
Lofar project (Score:4, Interesting)
The data rate [lofar.org] might even be bigger than at CERN: 20 terabit/sec straight after the A/D converters, and still a mighty 0.4 terabit/sec after the initial data reduction (DSPs + FPGAs). All the remaining data will be transferred over a dedicated fiber network to a central computer. To reduce all this data they need a big fat supercomputer: an IBM Blue Gene [ibm.com] with serial number 2, to be handed over tomorrow [zdnet.com]. For the moment it will be the fastest computer in Europe, ranking somewhere in the top 10 of the world.
Re:Not sure why this is completely notable (Score:5, Interesting)
The accomplishment is not the data rate, it's the participating organisations' ability to get a stable network going -- one close enough to what the real scientists will be using in a couple of years.
Consider that there's a large number of institutes, universities, etc. that all have their own IT departments, plus all the physicists who have to be involved because it's their grant money funding all this. That's thousands of people coordinating. And I would be surprised if they hadn't set up different service classes, prioritisation schemes, and whatnot.
Setting up a network between a couple of sites that are all under your control is close to trivial: you just talk to your telco, buy the lines, and hook up the routers. But establishing a working network between this many institutions takes a lot more.
Re:This pales in comparison to... (Score:5, Interesting)
But to address your point, yes, tape can be slow. However, the best tape drive money can buy right now (a title claimed by HP's Ultrium 960 [hp.com]) is faster than most hard drives -- 160MB/sec according to the specs. It's not going to be that bad. Expensive, yes, but not slow.
Just a thought experiment: writing a terabyte of data via this tape solution would require (1,000,000 megs / 160 megs per sec) 6,250 seconds, or 104 minutes. This assumes 2:1 compression of course, but the actual compressibility is unknown.
Sending 500 terabytes in this fashion would require 868 hours (36 days) to write and that same amount to read back onto disk. 72 days sounds like a lot, but this could be shrunk down to as little as 104 minutes if you're willing to employ 500 simultaneously-operating Ultrium 960 tape drives. Expensive, yes, but this is a fun thought experiment where dollars don't matter. Let's assume you use ten drives in an array on both ends (ship the drives with the media to save buying double drives), shrinking your backup/restore times to 86.8 hours (3.6 days) each.
7.2 days of tape time plus FedEx Priority Overnight transit time of about 16 hours yields a total transfer time of roughly 7.9 days, or about 682,600 seconds, to transfer 500,000,000 megabytes. This gives us a sustained transfer rate of about 732MB/sec -- 22% better performance than the link in the article. The time could be shrunk to as little as one day (the vast majority of it FedEx transit time) if you have 500 tape drives operating all at once.
Total expenditure for such an enterprise would be 10 Ultrium 960 drives (10x$6,190 each = $61,900) and 625 tape cartridges (625x$129 each = $80,625), for a total hardware cost of $142,525. FedEx International Priority shipping costs for a box of tapes like this would be $603, bringing the grand total to $143,128.
Just for giggles, a 500-drive array would cost you $3,095,000 in drive hardware but still take only $80,625 in tapes. With shipping it's a mere $3,176,228.
I'm willing to bet the LHC network costs considerably more than that to operate. What's more, the "tape" network hardware costs need be borne only once. The only operating costs are FedEx shipping and replacement tapes if and when needed. It's actually a very efficient way to move huge amounts of data from place to place when you think about it.
Note: I've done all this math off the cuff while doing about ten other things, so if my figures are off, don't try to have me drawn and quartered. It was a joke, and it's supposed to be mildly entertaining.
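The whole sneakernet estimate reduces to a few lines; every input here (160 MB/s per drive with 2:1 compression, ten drives per end, about 16 hours of FedEx transit) comes from the posts above:

```python
# The tape thought experiment, recomputed from its stated inputs.
total_mb = 500_000_000                       # 500 TB in megabytes
drive_mb_s = 160                             # Ultrium 960, 2:1 compression
drives = 10                                  # ten-drive array at each end
pass_s = total_mb / (drive_mb_s * drives)    # one write or read pass
total_s = 2 * pass_s + 16 * 3600             # write + read + FedEx transit
rate = total_mb / total_s
print(f"{rate:.0f} MB/s")                    # 732 MB/s, ~22% over 600 MB/s
```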
Re:Dark Fibre (Score:2, Interesting)
The net outcome here is that DWDM has made trans-Atlantic bandwidth surprisingly cheap. Assuming you're well connected to the right carrier hotels, an OC3 or OC12 (carried on a lambda on someone's DWDM equipment) between NYC and Amsterdam can often be had for substantially less than the same circuit between two arbitrary points within the US. With the kind of facilities available to major labs (with major government backing, in many cases), access to huge amounts of relatively cheap bandwidth is not the barrier it used to be.
Re:Standard terms (Score:2, Interesting)