Ask Slashdot: Best Language To Learn For Scientific Computing?
New submitter longhunt writes "I just started my second year of grad school and I am working on a project that involves a computationally intensive data mining problem. I initially coded all of my routines in VBA because it 'was there'. They work, but run way too slow. I need to port to a faster language. I have acquired an older Xeon-based server and would like to be able to make use of all four CPU cores. I can load it with either Windows (XP) or Linux and am relatively comfortable with both. I did a fair amount of C and Octave programming as an undergrad. I also messed around with Fortran77 and several flavors of BASIC. Unfortunately, I haven't done ANY programming in about 12 years, so it would almost be like starting from scratch. I need a language I can pick up in a few weeks so I can get back to my research. I am not a CS major, so I care more about the answer than the code itself. What language suggestions or tips can you give me?"
Python (Score:5, Insightful)
I have a friend who works for a company that does gene sequencing and other genetic research and, from what he's told me, the whole industry uses mostly python. You probably don't have the hardware resources that they do, but I'd bet you also don't have data sets that are nearly as large as theirs are.
You might also get better results from something less general purpose like Julia [julialang.org], which is designed for number crunching.
Fortran (Score:2, Insightful)
sorry to say, but that is a fact
Re:Python (Score:5, Insightful)
the whole industry uses mostly python
This is certainly the way of the future, not just for gene sequencing but many other quantitative sciences, although a complete answer would be Python and C++, because numpy/scipy can't do everything and Python is still very slow for number-crunching. It's best to start with just Python, but eventually some C++ knowledge will be helpful. (Or just plain C, but I can't see any good reason to inflict that on myself or anyone else.)
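To illustrate the "Python is slow for number-crunching, so lean on numpy" point, here is a minimal sketch. The submitter never says what the data mining routine actually computes, so the pairwise squared-distance task and both function names below are assumptions chosen purely for illustration. The first version does every arithmetic step in the interpreter; the second hands the same loops to NumPy's compiled code through broadcasting.

```python
import numpy as np


def pairwise_sq_dists_loop(points):
    """Pure-Python double loop: every arithmetic step runs in the interpreter."""
    n = len(points)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[i][j] = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
    return out


def pairwise_sq_dists_numpy(points):
    """Same result, but the loops run inside NumPy via broadcasting."""
    x = np.asarray(points, dtype=float)
    # x[:, None, :] has shape (n, 1, d) and x[None, :, :] has shape (1, n, d),
    # so the difference broadcasts to shape (n, n, d).
    diff = x[:, None, :] - x[None, :, :]
    return (diff ** 2).sum(axis=-1)


if __name__ == "__main__":
    pts = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
    print(pairwise_sq_dists_numpy(pts))
```

The broadcast version builds an (n, n, d) temporary, so for large inputs memory becomes the limit. That is roughly the point at which the C++ knowledge mentioned above, or a purpose-built library, starts to pay off.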
FORTRAN (Score:3, Insightful)
Seriously consider FORTRAN
Re:Python (Score:4, Insightful)
what the rest of your team uses (Score:5, Insightful)
And if you are not a member of a team then I seriously question the quality of your graduate program.
Re:FORTRAN (Score:2, Insightful)
Yeah, sure.
So that no one can ever check your models or replicate your results even if you publish code and initial data.
Re:Python (Score:5, Insightful)
Python is VB done right.
Profile (Score:5, Insightful)
A lot of people will propose a language because it is their favorite. Others because they believe it is very easy to learn. I will give you a third line of thought.
I would not look for a language in this case; I would look for a library, then teach myself whatever language gives the easiest and quickest access to it. I would try to profile what you are building, figure out where the bottlenecks are likely to be (profiling your existing mockup can help here, but don't trust it entirely), and try to find the best stable, well-designed, high-performance library for that particular type of code.
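As a rough sketch of the profiling step, assuming the mockup has been ported to Python: the standard library's cProfile and pstats modules can rank functions by cumulative time. The functions load_points, slow_distance_sum and run_pipeline are hypothetical stand-ins for the real routines, not anything from the submitter's project.

```python
import cProfile
import io
import pstats


def load_points(n):
    """Hypothetical cheap setup step."""
    return [(float(i), float(i % 7)) for i in range(n)]


def slow_distance_sum(points):
    """Hypothetical hot spot: O(n^2) pairwise work in a plain loop."""
    total = 0.0
    for x1, y1 in points:
        for x2, y2 in points:
            total += ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
    return total


def run_pipeline(n=300):
    """Hypothetical end-to-end run: load the data, then do the heavy step."""
    points = load_points(n)
    return slow_distance_sum(points)


def profile_report(func, *args, top=5):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_report(run_pipeline)
    print(report)
```

Whichever function dominates the cumulative-time column is the one worth replacing with a library call. The rest of the code can usually stay in whatever language is most convenient.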
Re:R-language (Score:3, Insightful)
Re:Python (Score:3, Insightful)
The problem with using the mix (when you actually write the C++ code yourself) is that debugging it is a major pain in the ass
Only if you don't use the C/C++ code as an independent module, as it should be. If you *must* debug it in parallel, you're designing it wrong.
Re:Python (Score:3, Insightful)
Re:Python (Score:5, Insightful)
Perl is still in wide use.
Do not use Perl for this. I've been using Perl for 15-20 years, and I love it for "scripting", text processing, etc., but using it for scientific computing sounds like an exercise in masochism.
Re:Fortran (plus MPI and some CUDA) (Score:2, Insightful)
Fortran, and learn how to implement MPI and CUDA code if your work is parallelizable.
DO NOT USE CUDA
Use OpenCL
Re:Python (Score:5, Insightful)
Compared to C and C++, Fortran is actually more elegant for pure numerical computing.
Unsurprising - that's what Fortran was designed for...!
Re:Python (Score:4, Insightful)
No, it's a simple language that is easy for beginners to learn. But, unlike VB, it is not horribly designed, and is useful even once you grow out of the beginner phase.
C. Obviously. (Score:4, Insightful)
You know C. C is simple, as fast as any alternative, straightforward to optimize (aside from pointer abuse), and you always know what the compiler/runtime is doing. And parallel programming tools like pthreads or CUDA are best served via C/C++. Why use anything else?
Another thought: scientific libraries. If you need external services/algorithms, then your chosen language should support the libraries you need. C/C++ are well served by many fast machine learning libs such as FANN, LIBSVM, and OpenCV, not to mention CBLAS, LINPACK, etc.