All posts for the month April, 2011

Data Aggregation in R: plyr, sqldf and data.table

I’ve previously put up a couple of posts about aggregating data in R. In this post, I’m going to try some alternative methods for aggregating the dataset. Before I begin, I’d like to thank Matthew Dowle for highlighting these to me. It’s a bit daunting at first, deciding which method of aggregating data is best. […]

Further Adventures in Visualisation with ggplot2

So I previously took a look at some data of player performance from a computer game. In this post, I’m going to do some further visualisations using ggplot2. The data consists of different types of player character, different roles for those characters, and their overall damage output (the unit here is damage per second, or […]

Sexy, Geeky Graphs using ggplot2 in R

So I’ve been looking for some data to play with while learning R, other than the data I’m analysing for various experiments and papers I’m working on. I thought to myself, “Hey, this R stuff is pretty geeky. Can I engage in a higher level of geekiness?” And I think I’ve found a way: using […]

Aggregate Function in R: Making your life easier, one mean at a time

I previously posted about calculating medians using R. I used tapply to do it, but I’ve since found something that feels easier to use (at least to me).

    aggregated_output = aggregate(DV ~ IV1 * IV2, data=data_to_aggregate, FUN=median)
    aggregated_output

The above code saves an aggregated dataset to aggregated_output and gives you the median in […]
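To see what that call does on something concrete, here is a minimal sketch. The toy data frame is made up purely for illustration; only the names DV, IV1, IV2, data_to_aggregate and aggregated_output come from the excerpt above.

```r
# Hypothetical toy data: two grouping variables (IV1, IV2) and one
# numeric outcome (DV). The values are invented for illustration.
data_to_aggregate <- data.frame(
  IV1 = c("a", "a", "a", "b", "b", "b"),
  IV2 = c("x", "x", "y", "x", "y", "y"),
  DV  = c(1, 3, 10, 4, 6, 8)
)

# One row per IV1/IV2 combination, holding the median of DV for that cell.
aggregated_output <- aggregate(DV ~ IV1 * IV2, data = data_to_aggregate, FUN = median)
aggregated_output
```

The result is an ordinary data frame, so it can be passed straight on to plotting or further analysis. Swapping FUN = median for mean or sd gives other summaries with the same call.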

RStudio, Revolution Analytics and Deducer: A Tale of Three GUIs

I’m in the process of moving from SPSS to R at the moment. It’s not been the easiest of rides, but then learning how to do a core part of your job never really should be. It’s been fun, though – don’t get me wrong – it’s definitely been an adventure!! Here I’m going to […]

Pivot Tables and Medians in R

Pivot Tables are a useful way of aggregating data into the format that you’re after. In this example, I’m going to be using R to pivot some data and calculate medians for me. This is useful because Excel can calculate medians (the =MEDIAN(values) function), but what it can’t do is calculate medians for Pivot Tables. […]
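As a rough idea of what a pivot table of medians can look like in base R, here is a small sketch using tapply. The data frame and its column names (group, condition, value) are hypothetical, and this is not necessarily the code used in the full post.

```r
# Hypothetical example data: the column names are assumptions for this sketch.
scores <- data.frame(
  group     = c("a", "a", "a", "b", "b"),
  condition = c("x", "x", "y", "x", "y"),
  value     = c(1, 3, 10, 4, 6)
)

# Cross-tabulate value by group (rows) and condition (columns),
# taking the median within each cell, much like a pivot table.
median_table <- tapply(scores$value, list(scores$group, scores$condition), median)
median_table
```

The result is a matrix with one row per group and one column per condition, which is roughly the layout an Excel Pivot Table would give.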