IT At the LHC — Managing a Petabyte of Data Per Second
schliz writes "iTnews in Australia has published an interview with CERN's deputy head of IT, David Foster, who explains what last month's discovery of a 'particle consistent with the Higgs Boson' means for the organization's IT department, why it needs a second 'Tier Zero' data center, and how it is using grid computing and the cloud. Quoting: 'If you were to digitize all the information from a collision in a detector, it’s about a petabyte a second or a million gigabytes per second. There is a lot of filtering of the data that occurs within the 25 nanoseconds between each bunch crossing (of protons). Each experiment operates their own trigger farm – each consisting of several thousand machines – that conduct real-time electronics within the LHC. These trigger farms decide, for example, was this set of collisions interesting? Do I keep this data or not? The non-interesting event data is discarded, the interesting events go through a second filter or trigger farm of a few thousand more computers, also on-site at the experiment. [These computers] have a bit more time to do some initial reconstruction – looking at the data to decide if it’s interesting. Out of all of this comes a data stream of some few hundred megabytes to 1Gb per second that actually gets recorded in the CERN data center, the facility we call "Tier Zero."'"
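A minimal sketch of the two-stage trigger idea described in the quote, in Python. The event fields, thresholds, and rates here are invented for illustration; the real trigger farms are custom electronics and dedicated software, not anything like this:

    import random

    def level1_trigger(event):
        # Hypothetical fast first-stage cut: discard the vast majority
        # of events within the time budget between bunch crossings.
        return event["energy"] > 0.999

    def level2_trigger(event):
        # Hypothetical slower second-stage cut with "a bit more time"
        # for initial reconstruction before deciding to keep the event.
        return (event["energy"] + event["momentum"]) / 2 > 0.9995

    def make_event():
        return {"energy": random.random(), "momentum": random.random()}

    events = (make_event() for _ in range(1_000_000))
    survivors = (e for e in events if level1_trigger(e))
    recorded = [e for e in survivors if level2_trigger(e)]
    print(f"{len(recorded)} of 1,000,000 events kept for 'Tier Zero'")

The point of the structure: the cheap test runs on everything, the expensive test only on what survives, and only the survivors of both stages are ever written to storage.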
Call the Interns! (Score:4, Funny)
Comment removed (Score:4, Funny)
Large Organization Has 2 Data Centers (Score:2, Funny)
Re: (Score:1)
Good point. Non-story. I can't see anything of interest to nerds here.
Re: (Score:2)
This is hilarious, from TFA:
'The Tier Zero facility is the central hub of the Worldwide LHC Computing Grid, which also connects to some dozen ‘Tier One’ data centres for near-real time storage and analysis of data and over 150 ‘Tier Two’ data centres for batch analysis of experiment data.'
Keeping us humble... (Score:3, Interesting)
My wife, a staff physicist at FermiLab in their computing division, manages to keep me humble when I talk about the "big data" work I'm doing in my commercial engineering position. I think having to deal with a billion or so data points per day is big... Not so much in her universe!
Re:Keeping us humble... (Score:5, Funny)
And we jokingly call our data center the "Large Software Collider". Not as funny when the real thing is even bigger!
Re: (Score:3)
Of course hadrons are bigger than softwares, not to mention a lot more fun in collisions.
Re: (Score:3)
My wife, a staff physicist at FermiLab in their computing division
Much like the HB itself, up until recently I assumed these were only theoretical...
You mean... (Score:1)
Scientists have discovered a way to get adequate performance out of Windows?
Re: (Score:1)
Not yet.
Large Hadron Collider - powered by Linux [internetnews.com]
Re: (Score:1)
I am from Citrix
Re: (Score:2)
VMware is pretty widely recognized as the king of virtualization-- at least so long as you aren't concerned with money. Its overhead is far, far smaller than the others', especially when dealing with huge numbers of connections, and it simply has more features than its competitors.
Of course, that assumes you're willing to pony up for vRAM entitlements and Enterprise Plus.
Re:You mean... (Score:5, Interesting)
Which doesn't mean those features are implemented well.
Not so long ago, I built an automated QA platform on top of Qumranet's KVM. Partway through the project, my employer was bought by Dell, a VMware licensee. As such, we ended up putting software through automated testing on VMware, manual testing on Xen (legacy environment, pre-acquisition), and deployment to a mix of real hardware and VMware.
In terms of accurate hardware implementation, KVM kicked the crap out of what VMware (ESX) shipped with at the time. We had software break because VMware didn't implement some very common SCSI mode pages (which the real hardware and QEMU both did), we had software break because of funkiness in their PXE implementation, and we otherwise just plain had software *break*. I sometimes hit a bug in the QEMU layer KVM uses for hardware emulation, but when those happened, I could fix it myself half the time, and get good support from the dev team and mailing list otherwise. With VMware, I just had to wait and hope that they'd eventually get around to it in some future release.
"King of virtualization"? Bah.
Re: (Score:2)
King of virtualization when it comes to things like "supports live migration of a VM's execution state and/or permanent storage", or "stability and speed of the networking layer".
I can't speak to KVM as my experience is limited to VMware, and some Hyper-V and XenServer testing. But just doing a check from RHEV's own fact sheet [redhat.com], there are a number of quite useful things missing:
*Storage live migration
*Hot add RAM, CPU
*Hot add NICs, disk (note that RHEV has it wrong-- this does not require an
Re: (Score:2)
RHEV is getting there, still lacking some features and still rough around the edges. For instance:
Re: (Score:2)
ESX's cost is a bit of a PITA-- there's Essentials Plus, but of course that lacks DRS; and there's the free version, which truly is nice for a single-server solution... but there are a lot of good contenders out there for less.
I'm not gonna say that the others are garbage; I took a peek at Xen and really like that they don't gouge you to death for basic things like "can manage several servers at once". I'm just saying that from my experience, as well as from listening to others in the recent ArsTechnica discussi
Re: (Score:2)
I can't speak to RHEV -- I ran on bare KVM. RHEV eliminated any fea
Re: (Score:3)
The King Joffrey of virtualization, perhaps.
GRID ack (Score:4, Interesting)
Unfortunately (Score:2)
Unfortunately that isn't saying much.
pretty well described on the LHC-CMS websites (Score:3)
Re: (Score:3)
What are you trying to imply?
That, somehow, he who does not know how to debug the kernel should not play with bit operations?
Something like that?
Or, that we should stop researching the structure of the universe, and instead focus on what we usually do, which is making war, screwing other people and posting photos of our dicks on teh internet?
Re: (Score:2)
Re: (Score:3)
News that matters? The human race is not even able to handle itself, yet it wants to play with atoms.
I assume you're using a computer to post that? Maybe own a cellphone....?
That makes you a hypocrite of the worst kind. Sorry, but there it is in black and white.
And Still. (Score:5, Funny)
The head researcher will STILL come to IT and ask them to please help him sync his outlook contacts to his phone.
grep (Score:2)
So they just used grep?
Which amounts to... (Score:3)
Re: (Score:2)
Is that as much as billions and billions?
Re: (Score:2)
Quite a bit larger, actually.
1 billion = 1e9
Travelsonic's number is ~3e51, which would be 3e6*(1e9)^5
or millions of billions of billions of billions of billions of billions
Not quite sure how well Sagan could pull that line off though
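For anyone who wants the decomposition spelled out, a quick illustrative check in Python:

    # millions of billions^5: 3e6 * (1e9)^5 = 3e51
    print(3e6 * (1e9)**5)   # 3e+51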
You're a little off there (Score:2)
Actually you're off by 28 orders of magnitude
1PB/s = 8e15 bits/s
8e15 bits/s *(3600s/h) *(24h/day)*(~365.25 days/year) ~= 2.5e23 bits/year
or 252,460,800,000,000,000,000,000 if you prefer counting zeros
even in stereo it'd only be 5e23.
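The same calculation as a runnable snippet (assuming decimal units, 1 PB = 1e15 bytes):

    bits_per_second = 1e15 * 8              # 8e15 bits/s
    seconds_per_year = 3600 * 24 * 365.25   # ~3.156e7 s/year
    print(f"{bits_per_second * seconds_per_year:.4e} bits/year")  # ~2.5246e+23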
Re: (Score:2)
Re: (Score:2)
Actually the other way around, you've got over twice as many zeros as you should have. You're right though, a mind-boggling number regardless. Nowhere near a googolplex (10^googol) though, nor even a googol (10^100) bits. Using my number (~2.5e23) you'd need ~4e76 years to transfer just one googol bits, or 40 thousand trillion trillion trillion trillion trillion trillion years, and the entire universe is only estimated to be ~14 billion years old. The universe will probably be so close to abs
Power limitations (Score:4, Informative)
Did a bunch of work with some stock exchanges a few years back. It was an interesting environment, and I see that CERN had the same problems the stock exchanges had. They even had the situation where the number one budgetary item wasn't monetary cost but electric load.
You only had so much power physically available in the data centers next to the exchanges and the server rooms inside them. Monetary cost was never an issue, but electric load was everything. It seems funny considering their load is strictly a science-based load and not a money-making one, but their requirements and distribution remind me greatly of the exchanges.
Re: (Score:2)
They even had the situation where the number one budgetary item wasn't monetary cost but electric load.
Probably true wherever you go, but the NYSE is in the middle of a dense urban area stretching for a hundred miles in every direction. Electricity, along with everything else, is painfully expensive there. I believe that's why so many data centers are built in relatively remote areas. Obviously, the NYSE has a physical location requirement... :\
Re: (Score:2)
On the other hand, at CERN the power used by their computing farm is probably a small trickle compared to what is being pumped into the components of the ring and its detectors.
Re: (Score:3)
That takes time. Time vs. space tradeoff.
Idea. (Score:1)
CERN network architecture (Score:2)
http://www.geant2.net/upload/pdf/LHC_networking_v1-9_NC.pdf [geant2.net]
Apparently, CERN uses BGP between T0 and T1, and uses only ACLs, no firewalls, for security.
And it's on Brocade's 100G Ethernet =) (Score:2)
http://www.enterpriseinnovation.net/content/brocade-delivers-100-gigabit-ethernet-solution-cern [enterpriseinnovation.net]
just say'n
(oh come on, can you blame me? It's freaking COOL!)
FTFS: few hundred megabytes to 1Gb per second (Score:1)
Er, 1 Gb is only 125 megabytes. b is bit, B is byte. So which is meant, one gigabit or one gigabyte? I'm guessing the latter, from simple consistency. If we're going to use abbreviations, we should at least get them right.
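In plain numbers (decimal units assumed):

    print(1e9 / 8 / 1e6)   # 125.0 -> 1 gigabit is 125 megabytes
    print(1e9 * 8 / 1e9)   # 8.0   -> 1 gigabyte is 8 gigabits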
Not really that huge (Score:1)
The summary says it's 100 MByte to 1 Gbit, which is confusing in itself. I think "a few hundred megabytes" is correct. It's impressive to run at that rate continuously with high reliability, but it's nothing compared to YouTube and probably Facebook. If you say a "tweet" takes up 200 bytes including overhead, that's 500,000 tweets per second at 100 MB/s, so maybe even Twitter has to deal with that rate. The requirement for redundancy is probably stricter for the LHC; they have at least triply redundant st
Re: (Score:1)
maybe even Twitter has to deal with that rate.
Never mind, guys, still a couple of orders of magnitude lower (340M messages/day according to WP)
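Back-of-the-envelope version, using the assumed figures above (340M tweets/day from WP, ~200 bytes per tweet as the parent guessed):

    tweets_per_sec = 340e6 / 86400       # ~3.9e3 tweets/s
    twitter_rate = tweets_per_sec * 200  # ~0.8 MB/s
    lhc_rate = 100e6                     # "a few hundred MB/s", low end
    print(lhc_rate / twitter_rate)       # ~127x, i.e. ~2 orders of magnitude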
oracle infrastructure details? (Score:1)
I'd like to know what their infrastructure looks like for storing that 1GB/s.
I was at OpenWorld in 2003 and they had some guy there from CERN giving a talk about how they were using Oracle9i (I read later that they upgraded to 10g, but no doubt they upgrade to later versions relatively quickly), and he did mention that petabyte/s buzzword. It would be very interesting to know how it was all implemented, and how they manage to write 1GB/s to disk. Must be some serious RAC clustering going on, and some seri