R support group and the benefits of applied statistics
post by sixes_and_sevens · 2014-06-26T14:11:07.888Z · LW · GW · Legacy · 8 comments
Following the interest in this proposal a couple of weeks ago, I've set up a Google Group for the purpose of giving people a venue to discuss R, talk about their projects, seek advice, share resources, and provide a social motivator to hone their skills. Having done this, I'd now like to bullet-point a few reasons for learning applied statistical skills in general, and R in particular:
The General Case:
- Statistics seems to be a subject where it's easy to delude yourself into thinking you know a lot about it. This is visibly apparent on Less Wrong. Although there are many subject experts on here, there are also a lot of people making bold pronouncements about Bayesian inference who wouldn't recognise a beta distribution if it sat on them. Don't be that person! It's hard to fool yourself into thinking you know something when you have to practically apply it.
- Whenever you think "I wonder what kind of relationship exists between [x] and [y]", it's within your power to investigate this.
- Statistics has a rich conceptual vocabulary for reasoning about how observations generalise, and how useful those generalisations might be when making inferences about future observations. These are the sorts of skills we want to be practising as aspiring rationalists.
- Scientific literature becomes a lot more readable when you appreciate the methods behind it. You'll have a much greater understanding of scientific findings if you know what a finding means in the context of statistical inference, rather than going off whatever paraphrased upshot is given in the abstract.
- Statistical techniques make use of fundamental mathematical methods in an applicable way. If you're learning linear algebra, for example, and you want an intuitive understanding of eigenvectors, you could do a lot worse than learning about principal component analysis.
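To make the eigenvector point concrete, here is a minimal sketch of principal component analysis on two variables, written in plain Python so it runs without any packages. The data are made up for illustration (y is roughly 2x plus noise), and the function names `covariance_matrix` and `leading_eigenpair` are my own, not from any library. In R you would more likely reach for the built-in `prcomp`. The first principal component is the eigenvector of the covariance matrix with the largest eigenvalue, found here by power iteration.

```python
import math

def covariance_matrix(xs, ys):
    """Return the 2x2 sample covariance matrix of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def mat_vec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def leading_eigenpair(m, iterations=200):
    """Power iteration: apply m repeatedly and renormalise.

    Converges to the eigenvector with the largest eigenvalue, provided the
    starting vector is not orthogonal to it.
    """
    v = [1.0, 1.0]
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.hypot(w[0], w[1])
        v = [w[0] / norm, w[1] / norm]
    # Rayleigh quotient v . (m v) gives the eigenvalue for a unit vector v.
    mv = mat_vec(m, v)
    eigenvalue = v[0] * mv[0] + v[1] * mv[1]
    return eigenvalue, v

# Made-up illustrative data: y is roughly 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

cov = covariance_matrix(xs, ys)
eigenvalue, direction = leading_eigenpair(cov)

# Total variance is the trace of the covariance matrix.
explained = eigenvalue / (cov[0][0] + cov[1][1])

print("first principal component direction:", direction)
print("share of variance explained:", explained)
```

The direction that comes out has a slope close to 2, which is the line the points cluster around, and the eigenvalue tells you how much of the total variance lies along that line. That is the intuition: eigenvectors of the covariance matrix are the axes along which the data vary independently, and eigenvalues measure how much.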
R in particular:
- It's non-proprietary (read "free"). Many competing products are ridiculously expensive to license.
- Since it's common in academia, newer or more exotic statistical tools and procedures are more likely to have been implemented and made available in R than in proprietary statistical packages or other software libraries.
- R skills are a strong signal of technical competence that will distinguish you from SPSS mouse-jockeys.
- There are many out-of-the-box packages for carrying out statistical procedures that you'd probably have to cobble together yourself if you were working in Python or Java.
- Having said that, popular languages such as Python and Java have libraries for interfacing with R.
- There's a discussion / support group for R with Less Wrong users in it. :-)
8 comments
Comments sorted by top scores.
comment by ShardPhoenix · 2014-06-27T01:21:07.350Z · LW(p) · GW(p)
There is also a 4-week Coursera course in R starting on July 7th (with several other sessions throughout the year): https://www.coursera.org/course/rprog
Replies from: coyotespike↑ comment by coyotespike · 2014-07-03T04:02:16.836Z · LW(p) · GW(p)
I'll be taking the Coursera course, since I have no experience in R. It's part of the Data Science Specialization. Reviews online suggest the course may be somewhat disorganized, and also say Code School's introduction is easier. So I might check out Code School as well.
The Coursera course will repeat, as ShardPhoenix said, so I'll report back when finished.
Replies from: coyotespike↑ comment by coyotespike · 2014-08-02T03:22:00.084Z · LW(p) · GW(p)
Okay, here's a preliminary update. I dropped the R Programming course on Coursera because after a basic introduction to R, the first substantive assignment jumped a couple levels in difficulty. In other words, there was a gap between the instruction and the assignment. This was frustrating. So be aware that you will need a bit of extra time to invest in order to get past this gap, either before or during the course. (I contrast this with the Introduction to Programming with Python course I'm taking on EdX from MIT, which is simply a flawless course, with a smooth and sure conceptual slope.)
comment by Emily · 2014-06-26T15:05:12.480Z · LW(p) · GW(p)
I signed up - thanks for setting this up! I'm an R user and definite fan of the software but don't have massive amounts of experience and I suspect I still sometimes make a bit of a hash of quite basic things. I've also not really tried to do anything Bayesian in R yet. (I've mainly used it for linear mixed effects modelling - it would be cool to explore the model comparison component of that.)
comment by [deleted] · 2015-07-12T18:13:47.738Z · LW(p) · GW(p)
I have about $100 of free Microsoft Azure credit this month. Anything I can do for someone here? I have it registered in my real-life name and I don't want to link it to this account yet, so factor that in. Every application will probably be served until the credit runs out.
comment by buybuydandavis · 2014-06-27T10:03:09.959Z · LW(p) · GW(p)
Msoft's new Azure Machine Learning cloud data analytics solution is based on an R scripting front end. I think it's certain that many corporate ties will have a buzzwordgasm and be looking to "do something" with all their Big Data.
Get the trendy new thing while it's hot.
I'm currently a vendor at Msoft, and have been using Azure ML a little for the last few months. I used R for years prior to this, but I can see this making ML, and probably more importantly, just some decent data analytics language, part of mainstream IT.
Replies from: IlyaShpitser↑ comment by IlyaShpitser · 2014-06-27T17:06:20.764Z · LW(p) · GW(p)
גם זה יעבור
Replies from: Dr_Manhattan↑ comment by Dr_Manhattan · 2014-06-27T19:55:06.642Z · LW(p) · GW(p)
"This too shall pass"