
Interviews: Ask Author and Programmer Andy Nicholls About R

Andy Nicholls has been an R programmer and consultant for Mango Solutions since 2011 (where he currently manages the R consultancy team), after a long stint as a statistician in the pharmaceutical industry. He has a serious background in mathematics, too, with a master's in math and another in Statistics with Applications in Medicine. Andy has taught more than 50 on-site R training courses and has been involved in the development of more than 30 R packages; he's also a regular contributor to events at LondonR, the largest R user group in the UK. But since not everyone can get to London for a user group meeting, you can get some of the insights he's gained as an R expert in Sams Teach Yourself R In 24 Hours (available in print or at Safari), of which he is the lead author. Today, though, you can ask Andy directly about the much-lauded, statistics-oriented, free software (GPL) language: why to use it, how to get started, how to get things done, and where those intriguing release names come from. (The about page is helpful, too.) As usual, ask as many questions as you'd like, but please post one question per comment.
Note: Slashdot is always looking for interesting interview guests. Who do you want to ask? Let us know!
This discussion has been archived. No new comments can be posted.

  • R? (Score:5, Funny)

    by U2xhc2hkb3QgU3Vja3M ( 4212163 ) on Thursday February 11, 2016 @03:56PM (#51489361)

    Is that a pirates-only language?

  • Evolution of R (Score:4, Interesting)

    by patabongo ( 842730 ) on Thursday February 11, 2016 @04:17PM (#51489481) Homepage

    How has the way you use R changed over time? For myself, I don't think I've gone through an entire R session in the past six months without loading dplyr. Combine that with the pipeline operator and I think if you'd shown the R code I wrote yesterday to me of two years ago, I wouldn't have believed it was the same language.

  • by Anonymous Coward

    What's your take on the future of R? It used to be that it was a tool for statisticians, and now it's been discovered by programmers. As a statistician who's not a programmer, but who hangs out sometimes on slashdot and stackoverflow, it feels sometimes like it's in danger of becoming just another language for programmers, instead of a tool for statisticians. Should I be worried? Can it be both? Is this mass inflow of programmers going to change it somehow? Or am I just having a "get off my lawn" moment?


    • by Anonymous Coward

      The only reason any sane programmer uses it is because they have to write some stat code using some obscure test or analysis package only available in R.

    • by Pseudonym ( 62607 ) on Thursday February 11, 2016 @06:54PM (#51490875)

      As a statistician who's not a programmer, but who hangs out sometimes on slashdot and stackoverflow, it feels sometime like it's in danger of becoming just another language for programmers, instead of a tool for statisticians.

      As a programmer who used to research programming languages, there's no danger of that at all.

      It's not much of a stretch to say that no programmer really uses R. At most, programmers use the high-quality statistical libraries which only work with R. R is basically the best statistical packages ever written bound together by one of the worst programming languages ever developed.

      • by Anonymous Coward

        It's not much of a stretch to say that no programmer really uses R. At most, programmers use the high-quality statistical libraries which only work with R. R is basically the best statistical packages ever written bound together by one of the worst programming languages ever developed.

        This is it *exactly*!

      • I actually program exclusively in R and find it OK once you learn the quirks. Where it excels is in sort of "jotting" down thoughts about programs, e.g. you can define an S3 class and then make an object that only has a few of its properties, or claim your object is of a class it is not. This would drive any Java programmer bananas, but it's super nice for going fast and loose.

        Similarly, the fact that it can recover your call in addition to the arguments you passed makes several functions work much better when you have

        • I actually program exclusively in R and find it OK once you learn the quirks.

          I dunno -- there's an awful lot that's cumbersome about R and constantly does my head in. My pet bugbears:

          No native hash/dictionary construct (there is the third-party hash library, but that's not great for portability).
          It's not possible to define functions at the end of your code, making code difficult to read (or requiring you to source a separate script that contains your functions, but again, portability suffers).
          Variable scoping is ... odd (many people have written previously about R quirks in this re

  • Key advantages of R (Score:5, Interesting)

    by Compuser ( 14899 ) on Thursday February 11, 2016 @04:29PM (#51489549)

    In your view, what are the key advantages of R over other scientific computing languages, most notably Matlab (which has to be considered with its plethora of toolboxes of course)?

    • by TeknoHog ( 164938 ) on Thursday February 11, 2016 @07:52PM (#51491283) Homepage Journal

      In your view, what are the key advantages of R over other scientific computing languages, most notably Matlab (which has to be considered with its plethora of toolboxes of course)?

      Or Python with scipy/numpy, or Julia, given their open source nature in addition to the plethora of libraries.

    • While I am really only dipping my toe into R I decided to do some research on this question a while back.

      I have used python for a number of scientific applications and was attempting to determine if I should use Rpy2 (http://rpy2.bitbucket.org/). It initially made sense to keep all of the data retrieval, formatting and analysis in a few python scripts. However, it seems that the design of the R language intrinsically accounts for the problem solving methodology: "R is designed to operate the way that proble

  • by Anonymous Coward

    For those that are relatively new to R and hope to enter the field of statistics, where would you recommend focusing your R training efforts?

    For example, which programming concepts, or fields of application, or packages, etc. do you feel are especially worthy of attention?

    Similarly, what would you recommend we avoid?

    • Hoisting the AC for asking a good question.

      To add on: R is gaining massive traction in graduate programs but so many professors teach it like it's SPSS, almost as a cargo cult coding language, and so much of the documentation is written for people who are already experienced coders. Is there any decent introduction to R for someone that doesn't already know it (or another programming language) fluently?

      • I know it's not free, but Udemy has some truly excellent R courses. From the very basics progressing to actual data science.
  • by Shadow of Eternity ( 795165 ) on Thursday February 11, 2016 @05:24PM (#51489961)

    There's an entire book, the R Inferno, dedicated to R's many "quirks" and problems. Is there ever a plan to dedicate some time to focusing on cleaning up the language and making it less painful to use?

  • Harsh crowd (Score:2, Interesting)

    by Anonymous Coward

    In my experience (from searching for R advice online - I've never mailed the R discussion list myself) the R community is incredibly harsh and unforgiving of new users. Answers to beginners' questions are normally brusque - often extremely so. (I remember one exchange, where a user basically asked "I've read the documentation for par, and I don't understand ...", and the response was, in its entirety, "?par" -- which, for those unfamiliar with R, is the command to bring up the documentation for par.)

    On the

  • I encountered R via Johns Hopkins University's data science series of Coursera courses which I highly recommend. The first one is at https://www.coursera.org/learn... [coursera.org]

    As a mainly Python programmer, but someone with an eclectic interest in programming languages (I enjoy Prolog, Lisp, ML...), I've found R very intriguing: it's a very "functional" programming language, but also object oriented (using dollar signs instead of the customary dots). I've also found R to be incredibly quick -- provided you know and use

  • by Anonymous Coward

    I am myself an R aficionado, but what do you answer to someone who says that Python has gone a long way toward being a good contender for data analysis tasks (SciPy, Pandas, Scikit etc...)?

  • by twistedcubic ( 577194 ) on Friday February 12, 2016 @12:02AM (#51492355)
    What topic(s) in statistics do you think students can learn easier today using R than years ago when there was nothing like R widely available?
  • I think minitab is better. How would you convince me otherwise?

  • by GlobalEcho ( 26240 ) on Friday February 12, 2016 @10:23AM (#51493911)
    I feel that one of the weakest points of R is the error handling, reporting, and debugging available. Do you have advice on tools or techniques for people coding in R (aside from using RStudio)? Are there plans for improvements in this area? The current facilities are reminiscent, at least to me, of using gdb back in the 1990s.

    I have in mind cases like the following, in which a confusion about list access using the [ operator (when the [[ should have been used) provides a cryptic error message with no traceback available.

    > symlog_scaler <- list(linear_to=2.5,  abscissa=2.0,
    +    scaling_function=function(x,linear_to=2.5,abscissa=2.0){
    +        y <- x; linear_to = abs(linear_to); big_ix = (linear_to<x)
    +        y[big_ix] = linear_to + log(1+(x[big_ix] - linear_to), base=abscissa)
    +        small_ix = (-linear_to>x)
    +        y[small_ix] = -(linear_to + log(1+(-x[small_ix] - linear_to),base=abscissa))
    +        y})
    > symlog_scaler$scaling_function(-5:5)
    [1] -4.307355 -3.821928 -3.084963 -2.000000 -1.000000  0.000000  1.000000  2.000000  3.084963
    [10]  3.821928  4.307355
    > symlog_scaler['scaling_function'](-5:5)
    Error: attempt to apply non-function
    > traceback()
    No traceback available
    >
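
    For readers unfamiliar with the distinction behind that error, here is a minimal sketch (not the poster's code; the cut-down `scaler` list and the `tryCatch()` wrapper are assumptions made for illustration). It shows that `[` returns a one-element sub-list, which cannot be called, while `[[` (like `$`) returns the function itself.

    ```r
    # Hypothetical, cut-down stand-in for the list in the post above.
    scaler <- list(linear_to = 2.5, scaling_function = function(x) abs(x))

    # `[` keeps the container: the result is a list of length one, not a function.
    class(scaler["scaling_function"])      # "list"

    # `[[` (or `$`) extracts the element itself, which here is callable.
    class(scaler[["scaling_function"]])    # "function"
    scaler[["scaling_function"]](-2:2)     # 2 1 0 1 2

    # Assumption: one way to at least capture the message programmatically
    # is to wrap the faulty call in tryCatch().
    msg <- tryCatch(scaler["scaling_function"](-2:2),
                    error = function(e) conditionMessage(e))
    msg                                    # "attempt to apply non-function"
    ```

    This does not address the missing traceback the question raises; it only makes the cause of the message visible.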
  • I have been impressed with the strong community surrounding R, and the excellent third-party libraries that are available on CRAN.

    What is your view on the various third party GUIs that exist for R, such as RStudio, Tinn-R and RExcel? Do you use or recommend any of them?
