CERN Testing Cloud For Crunching the Universe's Secrets
Nerval's Lobster writes "The European Organization for Nuclear Research (known as CERN) requires truly epic hardware and software in order to analyze some of the most epic questions about the nature of the universe. While much of that computing power stems from a network of data centers, CERN is considering a more aggressive move to the cloud for its data-crunching needs. To that end, CERN has partnered with Rackspace on a hybrid cloud built atop OpenStack, an open-source Infrastructure-as-a-Service (IaaS) platform originally developed by Rackspace as part of a joint effort with NASA. Tim Bell, leader of CERN's OIS Group within its IT department, suggested in an interview with Slashdot that CERN and Rackspace will initially focus on simulations—which he characterized as 'putting into place the theory and then working out what the collision will have to look like.' CERN's private cloud will run 15,000 hypervisors and 150,000 virtual machines by 2015—any public cloud will likely need to handle similarly massive loads with a minimum of latency. 'I would expect that there would be investigations into data analysis in the cloud in the future but there is no timeframe for it at the moment,' Bell wrote in a follow-up email. 'The experiences running between the two CERN data centers in Geneva and Budapest will already give us early indications of the challenges of the more data intensive work.' CERN's physicists write their own research and analytics software, using a combination of C++ and Python running atop Linux. 'Complex physics frameworks and the fundamental nature of the research makes it difficult to use off-the-shelf [software] packages,' Bell added."
CPU vs GPU (Score:1)
Use of CPUs from cloud-based providers is not as efficient for computations as using multiple GPUs linked together on a custom built setup. Using hypervisors instead of bare metal for computational work further reduces efficiency by another 10-15%. This is a waste of money, and poorly done systems analysis.
Re: (Score:1)
GPUs linked together in a custom built setup is small-time thinking, and wastes a lot of valuable researcher time. Having computing resources managed in the "cloud" really does make sense, and is going to happen, because it will be easier to maintain and write code for in the long run. It seems like there is a vocal anti-cloud minority here on /. Get ready, though: it's the future, even if you like building your own mash-up of bargain-bin hardware. ;)
Re: CPU vs GPU (Score:2)
I assume you don't use the cell phone or PSTN networks for anything personal, either?
Re: (Score:2)
No it doesn't. 'Cloud' crap is for things that require variable processing power, so you can OCCASIONALLY spike to high loads, without having to build a massive infrastructure yourself for that 2 hours that the spike happens once a year.
CERN crunches massive amounts of data ALL THE TIME. There are no peaks and valleys, so there is no benefit to letting someone else charge you extra to run your software in a reserved hypervisor instance. It's the exact opposite of efficiency.
You do not use virtual machines fo
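The peaks-and-valleys argument above can be put into rough numbers. This is only a sketch: the rates and load profiles are made-up placeholders, not figures from CERN, Rackspace, or any real provider, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365

# Hypothetical prices in dollars per core-hour; rented capacity is assumed
# to cost twice as much per core-hour as owned capacity.
OWNED_RATE = 0.03
RENTED_RATE = 0.06


def owned_cost(load_profile, rate=OWNED_RATE):
    """Owned hardware is sized for the peak and paid for every hour."""
    return max(load_profile) * rate * len(load_profile)


def rented_cost(load_profile, rate=RENTED_RATE):
    """Rented capacity is billed only for the core-hours actually used."""
    return sum(load_profile) * rate


# Flat load: 1000 cores busy around the clock.
steady = [1000] * HOURS_PER_YEAR

# Spiky load: 100 cores normally, 1000 cores for two hours a year.
bursty = [100] * HOURS_PER_YEAR
bursty[0] = bursty[1] = 1000

for name, profile in (("steady", steady), ("bursty", bursty)):
    print(name, "owned:", round(owned_cost(profile)),
          "rented:", round(rented_cost(profile)))
```

Under these assumed prices, renting wins easily for the spiky profile and loses for the flat one. Which case CERN is actually in is exactly what the thread is arguing about.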
Re: (Score:1)
Since you are clearly more qualified to make development, porting and maintenance labor vs hardware cost trade-off decisions than the people involved at CERN, why don't you go help them out a bit?
While you are at it, feel free to train some of the world's top physicists to stop writing "their own research and analytics software, using a combination of C++ and Python" and have them learn to code for GPUs, and port all their existing code. Clearly that's the best use of their time. This is research: there's a
Re: (Score:1)
Aww, that's cute, you expected them to be competent. They write heavy computational problems in Python.
>using a combination of C++ and Python
Re:CPU vs GPU (Score:4, Informative)
Aww, that's cute, you expected them to be competent. They write heavy computational problems in Python.
>using a combination of C++ and Python
Python is used only for configuration, interfacing (as a glue), and job steering. We are not that incompetent you know ;).
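For readers unfamiliar with the "glue" role described above, here is a minimal sketch of Python doing configuration and job steering while a separate program does the work. Everything in it is hypothetical: the config keys are invented, and a child Python process stands in for the compiled C++ executable so the example runs anywhere. It is not how any CERN framework is actually wired up.

```python
import subprocess
import sys

# Hypothetical job configuration; real frameworks have far richer options.
JOB_CONFIG = {"n_events": 1000, "seed": 42}

# Stand-in for a compiled C++ worker: a tiny child process that just
# reports the arguments it was given.
WORKER_SOURCE = (
    "import sys; "
    "print('processed', sys.argv[1], 'events with seed', sys.argv[2])"
)


def build_command(config):
    """Translate the configuration into a command line for the worker."""
    return [
        sys.executable, "-c", WORKER_SOURCE,
        str(config["n_events"]), str(config["seed"]),
    ]


def run_job(config):
    """Launch the worker, wait for it, and return its output."""
    result = subprocess.run(
        build_command(config),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_job(JOB_CONFIG))
```

The Python side only decides what to run and with which parameters; the heavy lifting stays in whatever the command line points at.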
Re: (Score:2)
Python is used only for configuration, interfacing (as a glue), and job steering. We are not that incompetent you know ;).
Unless you're using PyPy, that's all that Python is used for anywhere, obviously.
Re: (Score:2)
So? That's called intelligent design. You have expensive developers write the processor-heavy code in a low-level language, which takes longer, then have someone else, who costs less, do the more 'visible' work faster in a high-level language.
I suppose you think the major AAA title games engines are written by incompetent developers too then, right?
Re: (Score:2)
I suppose you think the major AAA title games engines are written by incompetent developers too then, right?
Major AAA titles nowadays tend to be released on licensed engines written by competent people. But we do get a lot of hilariously badly written games like World Of Tanks (Python=single threaded, engine originally intended for Korean point and click MMORPGs), EVE Online (Python even server side = single threaded bottlenecks everywhere. Most recent "innovation" slows time to handle lag).
It's sad when places like Facebook have the best approach to solving computational problems (I especially like their disaggreg
My, aren't you special (Score:2, Funny)
My, aren't you special.
Telling the organization with a datacenter containing 65,000 cores and 30 petabytes of data, which also, incidentally, invented the Web, how to set up their computers.
Re: (Score:2)
Just because someone working for your organization 50 years ago did something great doesn't mean anything anyone is doing there now is impressive.
Not saying that CERN isn't doing impressive things, but you seem to not understand that organizations are not universally made up of the same people you read a news story about 20 years ago.
As a systems architect, you'll be hard pressed to convince me that moving 65k cores and 30 petabytes of data to the cloud is intelligent. You will still need the same number
Re: (Score:2)
Use of CPUs from cloud-based providers is not as efficient for computations as using multiple GPUs linked together on a custom built setup.
This assumes that GPUs are actually suitable for the task at hand. I work in a very different branch of the computational sciences, but I can testify that GPUs are near-useless for most of what we do. If a "systems analyst" gave us advice like yours, I'd be furious.
Re: (Score:2)
I work in a very different branch of the computational sciences, but I can testify that GPUs are near-useless for most of what we do.
What exactly is the problem in your application area?
Re: (Score:2)
Branching? You do realize GPUs absolutely suck ass at any sort of branch right? So ... say ... anything except raw number crunching, sucks on a GPU.
Go ahead and write a search algorithm that runs solely on GPUs ... then watch it get outperformed by an Arduino.
Re: (Score:2)
Branching? You do realize GPUs absolutely suck ass at any sort of branch right?
CPUs today also suck ass at any sort of branching. If branching is what you want, go for Forth chips. You can branch randomly every few clock cycles and not notice it.
Go ahead and write a search algorithm that runs solely on GPUs
How is *that* a problem, unless the instruction set is completely botched? You'd have much more trouble with the memory subsystem than with the processor's inability to branch, since your ordinary GPU memory shines at coherent access but sucks at latency.
Re: (Score:2)
What exactly is the problem in your application area?
The main problem is that there's no single bottleneck where parallelization really helps. We do a lot of FFTs, but those only account for maybe 25% of total runtime - and they're mixed in with a lot of other calculations (and yes, branch points), mostly called by the LBFGS minimizer. The memory transfer overhead makes it especially difficult. We could probably figure out a way to make it work, at enormous cost (for us) in terms of manpower, but there a
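The "maybe 25% of total runtime" figure above implies a hard ceiling on what offloading the FFTs could buy, which Amdahl's law makes explicit. A rough sketch, assuming the 25% estimate from the comment and a purely hypothetical 10x speedup of the FFT portion; memory transfer overhead is ignored here and would only make the result worse.

```python
def amdahl_speedup(accelerated_fraction, accel_factor):
    """Overall speedup when only part of the runtime is accelerated.

    accelerated_fraction: share of total runtime that gets faster (0..1)
    accel_factor: how many times faster that share becomes
    """
    untouched = 1.0 - accelerated_fraction
    return 1.0 / (untouched + accelerated_fraction / accel_factor)


FFT_FRACTION = 0.25   # "maybe 25%" from the comment above
GPU_FACTOR = 10.0     # hypothetical; not a measured number

print(f"10x faster FFTs: {amdahl_speedup(FFT_FRACTION, GPU_FACTOR):.2f}x overall")
print(f"upper bound:     {1.0 / (1.0 - FFT_FRACTION):.2f}x overall")
```

Even with infinitely fast FFTs the whole run could get at most about a third faster, since the other 75% is untouched.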
Re: (Score:2)
Re: (Score:2)
"Use of CPUs from cloud-based providers is not as efficient for computations as using multiple GPUs linked together on a custom built setup."
Per dollar spent? In a "pay as you go" fashion?
"This is a waste of money, and poorly done systems analysis"
Of course, yes. Because your silver bullet is the real silver bullet, of course.
Re: (Score:3)
But you're assuming CERN's going to be using 100% of capacity at all times. Which they're not, and their needs are going to change a lot as well. They probably have to have dedicated staff that just builds and maintains this shit all day long. If they can pay a SaaS provider to handle it all, yeah, it's less efficient, but it might be cheaper for them because the SaaS provider could use the same equipment to do work for cancer researchers when CERN isn't using it. If they can get a way to price it based on calc
Just a thought along the side-line (Score:4, Informative)
Re: (Score:2)
And it's not as efficient as C++.
Except that C++ is not actually all that efficient, unless you do a lot of tweaky stuff by hand in it. There are a lot of things you can do with dynamic compilers that you can't do with precompiled libraries. Deep inlining and extensive IPO/IMO comes to my mind. People have hacked it onto C++ but that's like bolting extra legs onto a dog to turn it into an octopus.
Add to that the fact that Julia is homoiconic and supports much more expressive, arbitrary compile-time transformations and you're in for a treat
Re: (Score:2)
C++ via GCC is inefficient, pretty much every other compiler I've ever worked with does well.
Stop using shitty compilers and you'll find the language not so inefficient.
Re: (Score:2)
CERN has invested in about 5 million lines of C++ code (google GEANT4 and ROOT) - there is no backing out of C++ now. Python is nice because it can sit on top of the C++ backend and provide less buggy UI. It is also becoming the de facto standard for scientific computing (not just in HEP).
rackspace?! (Score:5, Insightful)
Rackspace?!
Wait, what?!
Rackspace is the most *horribly* run hosting service of all time. I could go on for hours and hours and HOURS describing how inept and incapable they are.
From months to source SSDs, to providing horrible support, and utter incompetence on the part of their staff... I mean, they're HORRIBLE! Just plain horrible. If any of their automation breaks down? Well, good luck getting help FAST. I mean, if a VM move fails, well.. maybe you'll get help in 24 *hours*.
Maybe. If it's the weekend, well.. or at night... well, after all, people only use the internet during the day!
And if anything is even slightly outside of the box? Good luck with that!
No, no, no. Not to mention, expensive. I was saddled with these boneheads when a PHB decided they were a great idea! Meanwhile, they take MORE time out of your day, than just maintaining hardware servers in a data center, because if anything goes wrong?
Well, emails, calls, conferences, blah blah blah. In 1/10th of the time it would take for rackspace to fix ANYTHING, I could just tell a traditional data center to reboot my box, or install a new one.
Hell, I've had VMs@Rackspace that were HUNG, that would NOT respond to the web console reboot command. Time to get that fixed? HOURS. Christ, just GET IT FIXED.
And cost? COST! PHB made me use these boneheads. We leased two Dell R720s. For the cost of 3 MONTHS worth of the lease, I could have bought a better equipped R720! Or, hey, maybe TWO Supermicro servers!
Rackspace is a time sucking hole in the ground. Its "expert" admins will suck your time away. Hell, I had to put off dozens of projects, whilst I dealt with their constant and continual fuckups, the phone calls, the emails, the explaining to them how to fix simple things!
Heck, don't even get me started with Rackconnect, good god. Worse, buggy as hell as it is (or at least was), they had all sorts of problems with their automated iptables scripts. I snag it, debug it, and realise that some conehead there can't write simple bash...
Fix it...
Report the fix...
And am still stuck with months, I repeat MONTHS of their script being used on my boxes, with no way to replace it (it was scp'd in on boot), and therefore broken firewall rules all over the place. MONTHS, when I provided them with a fix! A ONE LINE FIX AT THAT!
No, no, no, no, NO they are horrible, stay away, run the other way, my god stay the hell away from Rackspace, the most useless company on the planet!
If any of you, I repeat ANY of you want more detailed info, please let me know.... I hope they burn in flames as they go down into a tarpit in hell!
Re: (Score:3)
The sad part is, I did hold back. Mostly, due to post length and the fact that I don't want to spend the next week writing it up.
Suffice it to say, that I have an archive of 100s of Rackspace emails, and 60 or 70 phone calls, all stored because we were positive we'd have to sue their ass.
Yes, they were that bad, and showed that much incompetence.
Re: (Score:2)
Not Rackspace.
There are lots of other fish out there.
Re: (Score:1)
Rackspace is the most *horribly* run hosting service of all time. I could go on for hours and hours and HOURS describing how inept and incapable they are.
I'll see your Rackspace and raise an Accenture.
All the competence of Rackspace for only 10x the cost!
since the NSA spies on everything (Score:2)
Re: (Score:2)
Re: (Score:2)
you can bet the cloud quickly being abandoned by almost everybody
Except that CERN probably isn't too worried about the NSA spying on their exciting particle detector analysis. Maybe if there was something extremely proprietary in there, they might care, but I suspect even most (American) companies won't give it a moment's thought. I hate to resort to the cliche "If you have nothing to hide, you shouldn't be afraid", but as far as scientific research is concerned this is largely true. I work for a governm
Re: (Score:2)
@home? (Score:2)
I wonder if there is any opportunity for public participation?
seti@home [berkeley.edu]
folding@home [stanford.edu]
GIMPS [mersenne.org]
cern@home ????
Cloud For Crunching the Universe's Secrets (Score:2)
I'll save them some time
42
IaaS (Score:2)
We used to call it "rental".
Gotta love "as a service" buzzwords. They have come full circle now :)
Virtual Universe (Score:2)
CERN IT is quite big... (Score:2)
The reason for using a cloud is consolidation of resources, manpower and experience. Most companies are better off outsourcing some things because they wouldn't utilise their on premises resources near 100 % (e.g. at night, in vacations). CERN can run simulations all of the time, so there is always demand, and they can hire many experts without them "idling" most of the time. I don't think public clouds are a must for them and I'm even skeptical of VM technology, because they are dealing with friendly code
Mo' computers, mo' problems (Score:2)
Re: (Score:2)
This term "The Cloud" makes me... (Score:1)
need some sort of radar to see where the hell I am. I recall a time before the bubble burst when it was being said tech start-ups on the internet had their heads in the sky, were not grounded in reality.... well, they still are, but now they can't even see the ground. And there are mountains around called patents.
Grid Computing (Score:2)
I'm curious, what does this mean for Grid Computing? I thought it was the principal solution for distributing the analysis of CERN data to participating institutions around the world.
http://home.web.cern.ch/about/computing/worldwide-lhc-computing-grid [web.cern.ch]