UPDATE: I’ve modified the title of this post a bit to clarify what I was really thinking when I wrote it: which programming language, out of the very limited set I know, should I choose to teach some fellow researchers the absolute basics of programming? The tasks they need to do require only a minimal understanding of programming, and of R, so many of the issues that can crop up won’t even arise for them. To put things into context, it took me only two days to work out, from scratch, how to do everything I need to do in R, so it’s not as if I’m writing packages or doing anything particularly fancy myself, and the people I will be teaching will be doing less complicated things than I needed to do.
That being said, I’d like to thank those who commented for pointing out why R isn’t a great language for people starting out with programming. I’m still new to R, so I obviously don’t have the depth of experience with potential problems that others do, and it’s helpful to learn from others’ experience (or should that be “misery”?)! Python, Pascal and Ruby all sound like great options for getting into programming. I’m going to leave my initial post, with all its inaccuracies, intact below: first, because I think it’s good to have a record of what I have said so that I can look back in the future at how daft I was, and second, because people took the time to post comments and I don’t want the time they spent commenting and correcting me to go to waste. If I deleted most of what I said or removed the post, their comments would seem odd or incorrect.
I’m helping some colleagues learn programming, starting from zero experience with it in any shape or form. It’s quite a daunting task in some senses, because, well, it may not be easy! They are researchers, so they’ll need it for processing data and generating output, and perhaps for processing BIG DATA at some point too. After some debate about the best way to go ahead, I’ve settled on R as my weapon of choice for training these lucky individuals. The choices were as follows (note that I don’t know that many programming languages, so it’s not a huge list). I thought it would be worth sharing the pros and cons of each.

PHP

Pros: Dead easy to use. Nice and easy integration with databases, which can be used to deal with data processing. Can be extended to, for example, generate images (a plus for these people, who study visual cognition and so often need to make pretty pictures to show to participants in experiments). There’s also an immense number of tutorials and guides on the net, and people who aren’t into research can help you out just by knowing their PHP.

Cons: Probably overkill. Running a webserver all the time can be a pain, even if XAMPP is used. It’s not easy (or even possible, as far as I am aware) to run statistical tests using PHP or any classes that can be added in.

Python

Pros: Forces users to write clean code, and again it’s very easy to use. Possible to integrate with databases to churn through datasets. Like PHP, it can be used to generate images for use in experiments (pygame), and again there are plenty of examples and tutorials. Plenty of extensions to do stats and plot graphs (NumPy and Matplotlib). Oh, and it’s named after Monty Python. Ni.

Cons: Not really intended for churning through big datasets and the kind of things I have in mind. Quite a few of the decent libraries out there need to be paid for before they can be used.

R

Pros: The syntax is very simple, with few of the gotchas present in other languages (e.g., ending lines with a semicolon or forcing tabs in lines and so on). It’s loosely typed, which can be both a blessing and a curse. It’s a blessing because users don’t have to worry about declaring variables. It’s a curse because they can slip into bad habits and not understand variable types properly. Oh, and I don’t need to say that it can work with all sorts of databases, churn through data very rapidly, generate images, run statistical tests and plot graphs of publication quality.

Cons: I had to really think about this, but I guess that R is a nightmare to google for any kind of help when you’re stuck. I think it’s a fundamental issue: naming something after a letter of the alphabet probably doesn’t help SEO rankings all that much. The official documentation would benefit from being a bit more like the PHP documentation (though maybe there is a site like that for R and I’ve just not found it), with users able to comment and give better examples than those provided initially. That being said, there are more blogs on R than you can shake even a very large proverbial stick at, which more than makes up for it. I now always try the legendary R-bloggers.com search box before googling anything to do with R, and I’ve never had to look any further than that.

Is R an ideal language for teaching the fundamentals of programming to beginners? I think the answer is “yes”. The beginners I have in mind are researchers with specific needs regarding data processing, and it would benefit them to learn how to run stats in R, which opens up future possibilities as well (e.g., LMEs).
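To make the “blessing and curse” of loose typing a bit more concrete, here is a minimal sketch. It is written in Python rather than R purely for illustration, since the same trade-off shows up in any language where variables aren’t declared. The data and variable names are made up, and the details differ in R (which, for instance, quietly turns a whole vector into text if one element is quoted).

```python
# Hypothetical reaction times; the last one arrived as text, e.g. from a CSV file.
reaction_times = [512, 498, "530"]

# Blessing: no declarations, and one name can hold any type.
participant = 7
participant = "P07"

# Curse: the stray string is only noticed when arithmetic is attempted.
try:
    total = sum(reaction_times)
except TypeError:
    total = None  # adding a number and a string is not defined in Python

# Converting explicitly makes the intended type obvious.
clean_times = [int(t) for t in reaction_times]
mean_time = sum(clean_times) / len(clean_times)

# Another quiet surprise: "multiplying" text repeats it instead of doing maths.
doubled = "530" * 2
```

Nothing here needs a type declaration, which is nice for beginners, but it also means a mix-up between numbers and text can sit unnoticed until the moment a calculation is run.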
I’ve not mentioned Matlab, which I know is a favourite for researchers, because (1) it’s a gigantic monster to download and install, (2) I don’t know it that well and (3) it’s prohibitively expensive. I was also tempted to evaluate the use of LOLCODE to see if there was any mileage in using it (“IM IN YR LOOP UPPIN YR VAR TIL BOTH SAEM VAR”).
I myself first dabbled in programming back in the old days, when I had a Sinclair and we did some very basic BASIC at primary school. Later on, I used BASIC to make emulators that mimicked my friends’ phrases and behaviour. Some of them were spot on! I guess I’ve always been trying to model human behaviour. I’ll post the material I use to teach my colleagues, both to help them out and to keep a permanent copy of what we go through.
That’s it for now. Please feel free to share any other languages you have found to be good for beginners. I’m sure there are some things that I have missed.