"Evolution of the Internet" Powers Massive LHC Grid

jbrodkin brings us a story about the development of the computer network supporting CERN's Large Hadron Collider, which will begin smashing particles into one another later this year. We've discussed some of the impressive capabilities of this network in the past. "Data will be gathered from the European Organization for Nuclear Research (CERN), which hosts the collider in France and Switzerland, and distributed to thousands of scientists throughout the world. One writer described the grid as a 'parallel Internet.' Ruth Pordes, executive director of the Open Science Grid, which oversees the US infrastructure for the LHC network, describes it as an 'evolution of the Internet.' New fiber-optic cables with special protocols will be used to move data from CERN to 11 Tier-1 sites around the globe, which in turn use standard Internet technologies to transfer the data to more than 150 Tier-2 centers. Worldwide, the LHC computing grid will be comprised of about 20,000 servers, primarily running the Linux operating system. Scientists at Tier-2 sites can access these servers remotely when running complex experiments based on LHC data, Pordes says. If scientists need a million CPU hours to run an experiment overnight, the distributed nature of the grid allows them to access that computing power from any part of the worldwide network."
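
As a rough sanity check on the "million CPU hours overnight" figure in the summary, the sketch below works out how many cores would have to run in parallel. The 12-hour definition of "overnight" and the even spread across all 20,000 servers are assumptions made for illustration, not figures from the article.

```python
def cores_needed(cpu_hours: float, wall_hours: float) -> float:
    """Average number of CPU cores that must be busy in parallel."""
    if wall_hours <= 0:
        raise ValueError("wall_hours must be positive")
    return cpu_hours / wall_hours


CPU_HOURS = 1_000_000   # from the summary
GRID_SERVERS = 20_000   # from the summary
NIGHT_HOURS = 12        # assumption: "overnight" means roughly 12 hours

cores = cores_needed(CPU_HOURS, NIGHT_HOURS)
cores_per_server = cores / GRID_SERVERS

print(f"{cores:,.0f} cores busy for {NIGHT_HOURS} hours")
print(f"{cores_per_server:.1f} cores per server across {GRID_SERVERS:,} servers")
```

Under these assumptions the job would occupy a little over four cores on every server in the grid for the whole night, so a request of that size would plausibly need most of the worldwide capacity rather than a single site.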
  • I mean, if even the supporting computer network is smashing particles into each other it's got to be 133+!
    • Re: (Score:1, Funny)

      by Anonymous Coward
      Yah dude, you know these "scientists" are gonna frag it up with their super-low fiber optic distributed ping. I bet they hack too.
  • by Anonymous Coward
    Hmmm... Just wait till this gets turned into a botnet... Oh, wait, it runs Linux. I guess we're safe.
    • Re: (Score:1, Informative)

      by Anonymous Coward
      There is a lot of quite fancy security stuff used. All users need an X.509 certificate to submit jobs.
  • Bitch... (Score:2, Insightful)

    by HetMes ( 1074585 )
    I suppose this will all be raw sensor data from the LHC itself, right? Must be a bitch to get anything meaningful out of it.
    • In a word, yes. The data is filtered and processed at multiple stages, which is part of why the system is set up in a tiered architecture. The actual data that turns into a visual result (graph) is fairly small after all the stages are complete. It is like mining for diamonds, where you discard almost everything you pull from the ground and keep only the smallest portion. After all, if the problem were easy there would be no need for all of this computing to be thrown at it.
    • Nah.
      This is already behind the "realtime" stage, which is done directly in hardware and only picks up the 0.001% or so of events that are deemed worthwhile to analyse.
      Otherwise, they would need exabit connections...
      • Some Realtime (Score:5, Interesting)

        by Roger W Moore ( 538166 ) on Wednesday April 23, 2008 @01:14PM (#23174254) Journal
        Actually not all of it is offline. One of the things I have a research grant for is to develop a realtime remote farm for monitoring the detector. This is to catch subtle detector problems quickly before we end up collecting 2 weeks of useless data.

        For the Tier 1 a significant fraction of the data is raw 'sensor' (we call it detector) data. This allows the reconstruction program (which converts the data into physics objects like electrons, muons, jets, etc.) to be rerun on the data once bugs in the initial reconstruction program have been fixed.
  • Is the birth of Skynet, and will be the death of us all. (and scratch the ladies in the subject; forgot for a second what site this was...)
  • by Anonymous Coward on Wednesday April 23, 2008 @12:42PM (#23173898)
    ...did it have a "Vista capable" sticker?
    • by nawcom ( 941663 )
      Nope, unfortunately the servers don't come with dedicated video cards. So no Aero.
      Or wait.. that just means it's not "Vista Premium" capable...
      *dreams of the profits of selling 20K vista licenses would bring in*
      ..muahahaha...hahahah......HAHAHAHAHAHAHA*snort*HAHAHAHAHAAHAAA!
  • ...from SCO Germany, trying to get them to buy 20,000 SCOSource licenses.
    This is exactly the sort of asshattery I would expect from an organization headed by Ralph Yarro and Darl McBride.
  • 15 Petabytes (Score:2, Informative)

    by Anonymous Coward
    "The LHC collisions will produce 10 to 15 petabytes of data a year"

    The collisions will produce much more data, but "only" 15 PB of that will be permanently stored. That's a stack of CDs 20km high. Every. Year.
    • The hell? I thought we were using Libraries of Congresses now, not the height of a stack of CDs. Damn buzzword measurements.
      • by iNaya ( 1049686 )

        OK, let's put it into Libraries of Congresses.

        The James Madison building alone has about 424k m^3 of assignable space (assuming a height of 10 feet of assignable space). The stack of CDs takes up 288 m^3 assuming 12x12cm packaging. So assuming that the other two library buildings burnt down, then that would be about 1/1500 Libraries of Congress.

        Bah.
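
The CD-stack arithmetic in this sub-thread can be checked with a few lines. This is only a sketch: the 700 MB capacity and 1.2 mm thickness per disc are assumptions (the comments do not say which values were used), while the 20 km stack, the 12x12 cm footprint and the 424k m^3 building volume are taken from the comments above.

```python
PB = 10**15                # decimal petabyte, in bytes
CD_BYTES = 700 * 10**6     # assumption: 700 MB per CD
CD_THICKNESS_M = 1.2e-3    # assumption: 1.2 mm per bare disc
CD_SIDE_M = 0.12           # 12 x 12 cm footprint, from the comment
BUILDING_M3 = 424_000      # Madison building estimate, from the comment


def cd_stack_height_m(data_bytes: float) -> float:
    """Height in metres of a stack of CDs holding data_bytes."""
    discs = data_bytes / CD_BYTES
    return discs * CD_THICKNESS_M


def stack_volume_m3(height_m: float) -> float:
    """Volume of a square column with the CD footprint."""
    return height_m * CD_SIDE_M * CD_SIDE_M


height_15pb = cd_stack_height_m(15 * PB)   # stack for one year of stored data
volume_20km = stack_volume_m3(20_000)      # the 20 km stack quoted above
ratio = BUILDING_M3 / volume_20km          # stacks that fit in the building

print(f"15 PB on CDs: {height_15pb / 1000:.1f} km")
print(f"20 km stack:  {volume_20km:.0f} m^3")
print(f"one stack is about 1/{ratio:.0f} of the building volume")
```

With these assumed disc dimensions the stack comes out nearer 26 km than 20 km, and the 20 km stack fills roughly 1/1470 of the building, so the figures in the thread hold up to within the precision of a back-of-the-envelope estimate.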

  • New fiber-optic cables with special protocols will be used to move data from CERN to 11 Tier-1 sites around the globe
    Are they talking about Internet2 or Tier-1 ISPs?
    • Re: (Score:3, Interesting)

      by Anonymous Coward
      I2 is a US organization. The owner of the transatlantic cables is called the "LHC OPN" (Optical Private Network), I think. The full build-out will be about 80Gbps.

      I suspect the "special protocols" they are referring to are about the data transfer protocols (GridFTP for data movement), not some wonky Layer-1 protocol. However, these folks, like I2, have been investing in dynamic-circuit equipment, meaning that sites could eventually get dedicated bandwidth between any two installations.
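
To put the 80 Gbps figure in context, the sketch below converts the 15 PB per year of stored data quoted elsewhere in this thread into an average sustained rate. It assumes decimal petabytes, a 365-day year and perfectly smooth traffic, none of which is stated in the thread; real transfers are bursty and the same data may be copied to several Tier-1 sites.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: non-leap year
LINK_GBPS = 80                       # full build-out, from the comment above
STORED_BYTES_PER_YEAR = 15e15        # 15 PB/year, decimal petabytes assumed


def average_gbps(bytes_per_year: float) -> float:
    """Average rate in gigabits per second for a yearly data volume."""
    bits_per_second = bytes_per_year * 8 / SECONDS_PER_YEAR
    return bits_per_second / 1e9


avg = average_gbps(STORED_BYTES_PER_YEAR)
share = avg / LINK_GBPS

print(f"average rate: {avg:.2f} Gbps")
print(f"share of an {LINK_GBPS} Gbps build-out: {share:.1%}")
```

Shipping a single copy at a steady pace would use only a few percent of the quoted capacity, which suggests the headroom is meant for peaks, replication and reprocessing rather than the yearly average.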
    • Tier 1 is generally referring to the "Top 10 peering ISPs", well 11 if you ask CERN and 9 if you ask Wikipedia. http://en.wikipedia.org/wiki/Tier_1_carrier#List_of_Tier_1_IPv4_ISPs
    • Re: (Score:3, Informative)

      by vondo ( 303621 )
      It has nothing to do with ISPs. The Tier 1 sites are the largest sites around the world, with thousands of CPUs and petabytes of storage to handle the influx of data. Typically there is no more than one Tier 1 per country per experiment. Tier 2s in this nomenclature are generally university sites that have O(100) CPUs and O(100) TB of disk.
  • It won't be a parallel internet until it too is saturated with porn.

    (Unless it's like the parallel Goatee Universe in ST:TOS. In which case all the women will be dressed opaquely from head to toe? Or they will all have beards?)
  • Besides the obvious cool factor (I recall back when earning my undergrad how a fellow student was so excited he could compile Firefox in under 10 hours by using a grid he set up in one of the labs) of being able to crunch massive amounts of data very, very quickly, I'm curious what sorts of applications could use this effectively? Will it be limited to strictly scientific research? Can some of those CPU cycles be sold off to for-profit corporations?

    Will pixar be able to render their movies overnight no
    • You are probably on to something here. I'm betting spam delivery is about to get 1000s of % better very soon. Either that or a CNN DDoS attack from the EU sponsored by particleH4X0R5smashers....

  • Yeah... (Score:3, Funny)

    by Uncle Focker ( 1277658 ) on Wednesday April 23, 2008 @12:55PM (#23174032)
    But how well does it play Crysis at full settings?
  • But does it run... (Score:5, Interesting)

    by RiotingPacifist ( 1228016 ) on Wednesday April 23, 2008 @12:56PM (#23174040)
    Oh wait, ofc it does. You've basically got science, which is fundamentally open source.
    Then you've got a bunch of scientists who are fundamentally geeks.
    And it's all being set up in Europe, which isn't as under the grip of MS.

    As a bonus:
    They need the ability to look back and explain all their analysis, which means they have to see the source.
    It costs a hell of a lot to get the data, so they don't want to lose any data anywhere.
    They have a lot of results to analyse, so they don't want to be waiting for the server to come back on-line.
    Could they have gone with BSD? Probably, but most science tools are developed for Linux.
    • If it produces a stable black hole, then yes, along with the rest of the planet. In the incredibly unlikely event that that does happen, I can only hope that one of the scientists' last words are "Hey, check this out!"
  • Hey, it's one of those good botnets we just heard about!
  • by OshMan ( 1246516 ) on Wednesday April 23, 2008 @01:22PM (#23174334)
    Perhaps we should give equal time to an alternate post about the Intelligent Design of the Internet.
  • You can help too (Score:5, Informative)

    by Danathar ( 267989 ) on Wednesday April 23, 2008 @01:29PM (#23174432) Journal
    What a lot of people don't know is that if you want to join a cluster to the Open Science Grid and you are a legit organization more than likely they would let you join. Just be sure you understand your responsibilities as it's more of an active participation. If you are a school or computer user group/club go to the open science grid website and start reading up.

    Warning: Although not for this crowd. Joining OSG (http://www.opensciencegrid.org/) is a bit more complicated than loading up BOINC or folding@home. It requires a stack of middleware that is distributed as part of OSG's software. Most of the sites I believe use Condor (http://www.cs.wisc.edu/condor/). If you would like to get Condor up and running quick the best way is using ROCKS (http://www.rocksclusters.org/wordpress/) with a Rocks Condor "Roll" (jargon for Rocks condor cluster). Then after getting your condor flock up and running you can load the Open Science Grid stuff on it.

    I'm currently running a small cluster of PC's that were destined to be excessed (P4's 3 or 4 years old) and have seen jobs come in and process on my computers! And...to boot you can configure BOINC to act as a backfill mechanism so that when the systems are not running jobs from OSG they can be running BOINC and whatever project you've joined through that project.

    BTW...all of the software mentioned is funded under grants from the National Science Foundation - primarily via the Office of CyberInfrastructure but some through other Directorates within NSF.
    • Re:You can help too (Score:4, Informative)

      by wart ( 89140 ) on Wednesday April 23, 2008 @02:35PM (#23175142) Homepage
      'active' is a bit of an understatement. You need to be willing to provide long term support for the resources that you volunteer to the OSG, including frequent upgrades of the OSG middleware. A resource that joins the OSG for 3 months and then leaves is not going to provide much benefit to the larger OSG community.

      It's also not for the faint of heart. While the OSG software installation process has gotten much better over the last couple of years, it still takes several hours for an experienced admin to get a new site up and running, and that's assuming you already have your cluster and batch system (such as Condor or PBS) already configured correctly. If you are new to the OSG, then it is likely to take a week or more before your site is ready for outside use.

      Our organization has found that it takes at least one full time admin to manage a medium-sized OSG cluster (~100 PCs), though you can probably get away with less effort for a smaller cluster.

      This isn't meant to be criticism against the OSG; I think they've done great work in building up a grid infrastructure in the US. I just want to emphasize that supporting an OSG cluster is a non-trivial effort.
      • 'active' is a bit of an understatement. You need to be willing to provide long term support for the resources that you volunteer to the OSG, including frequent upgrades of the OSG middleware. A resource that joins the OSG for 3 months and then leaves is not going to provide much benefit to the larger OSG community.

        It's also not for the faint of heart. While the OSG software installation process has gotten much better over the last couple of years, it still takes several hours for an experienced admin to get a new site up and running, and that's assuming you already have your cluster and batch system (such as Condor or PBS) already configured correctly. If you are new to the OSG, then it is likely to take a week or more before your site is ready for outside use.

        Our organization has found that it takes at least one full time admin to manage a medium-sized OSG cluster (~100 PCs), though you can probably get away with less effort for a smaller cluster.

        This isn't meant to be criticism against the OSG; I think they've done great work in building up a grid infrastructure in the US. I just want to emphasize that supporting an OSG cluster is a non-trivial effort.

        ABSOLUTELY.

        You could not have said it better. Much better than I did. Of course you don't necessarily have to run a BIG cluster. Even one with 10 or 20 processors can be of use to people.

  • by Anonymous Coward
    ... turn the Earth into a black hole.
  • ...the Open Science Grid, which oversees the U.S. infrastructure for the LHC network

    Wrong. Caltech oversees the infrastructure for the US LHC network. The OSG provides the middleware and grid operations center for the computing and storage resources in the US that are part of the LHC experiments. The OSG does not manage or oversee communications networks.
  • by xPsi ( 851544 ) on Wednesday April 23, 2008 @02:38PM (#23175174)
    Practically speaking, trickle-down technology of the sort mentioned in the article is one of the main reasons basic research on this massive scale even has a chance of getting funded with taxpayer dollars. Looking for the Higgs, supersymmetry, and a color glass condensate is cool (important!) scientifically, but it is hard to justify spending 10 billion dollars without some pragmatic output. I'm a high energy physicist by training and would like to think these projects could get funded on their own scientific merit, but I suspect funding agencies would disagree; regardless, technology offshoots of this sort are definitely a good thing.
  • The on-site data centres at CERN are actually terrible when it comes to cooling (at least they were when I went there). I was expecting the server rooms to be low-ceilinged rooms with AC units good enough to keep the rooms at least chilly, but they were actually swelteringly hot, and one of them seemed to be in an old warehouse with very high ceilings.
  • They're actually connecting to the fucking internet!?!?!

    who gave them their degrees?
