Long and wide matrices

A quick lesson on representing data and reshaping matrices

There are two broad ways data can be stored in a matrix. In "wide" format, everything for a given subject / record is on a single line / row. For example:

patientId   control_cells   control_temp   drugA_cells   drugA_temp   drugB_cells   drugB_temp
1           ...
2           ...

The "long" format, by contrast, has multiple rows per subject, one for each condition. For example:

patientId   treatment   cells   temp
1           control     ...
1           drugA       ...
1           drugB       ...
2           control     ...

Wide is generally more useful than long. (It's a good rule of thumb to have a record per line, although what you consider to be a record may vary.) So how do you convert between the two? Easy. First, let's make a long matrix::
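
As a minimal sketch of such a long data frame, the snippet below builds one in base R with the same columns as the long layout shown earlier. The name long_df matches the code that follows; the cell counts and temperatures are invented purely for illustration.

```r
# Hypothetical example data: two patients, three treatments each.
# All measurement values are made up for illustration.
long_df <- data.frame(
  patientId = c(1, 1, 1, 2, 2, 2),
  treatment = rep(c("control", "drugA", "drugB"), times = 2),
  cells     = c(120, 95, 80, 130, 100, 90),
  temp      = c(36.8, 37.1, 37.4, 36.6, 37.0, 37.2)
)

print(long_df)
```

Here patientId starts out numeric, which is why the factor conversion in the next step is needed.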

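Note that the row / subject id has to be a factor. When no id columns are named explicitly, reshape2 treats factor and character columns as ids and everything else as measurements::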

long_df$patientId <- factor (long_df$patientId)

Then use "recast" from the "reshape2" package. Its parameters are:

  • the data frame (I don't know if this works with a matrix; you may have to convert it first)
  • a formula showing how the data is grouped, where '...' means 'everything else'

This assumes that everything else is a measurement column. The new columns are named by joining the treatment and the measurement name with an underscore (for example, control_cells):

# install.packages ("reshape2")  # run once if the package is not installed
library (reshape2)
wide_df <- recast (long_df, patientId ~ treatment + ...)
print (wide_df)
print (colnames (wide_df))
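
To make the role of the formula more concrete, here is a sketch of the two steps that recast combines: melt stacks the measurement columns into variable / value pairs, and dcast spreads them back out according to the formula. It assumes reshape2 is installed, and it rebuilds a small long_df with invented values so that it runs on its own.

```r
library(reshape2)

# Hypothetical example data (values are made up for illustration).
long_df <- data.frame(
  patientId = factor(c(1, 1, 1, 2, 2, 2)),
  treatment = rep(c("control", "drugA", "drugB"), times = 2),
  cells     = c(120, 95, 80, 130, 100, 90),
  temp      = c(36.8, 37.1, 37.4, 36.6, 37.0, 37.2)
)

# Step 1: melt. Every measurement becomes its own row, with the
# measurement name in "variable" and the number in "value".
molten <- melt(long_df, id.vars = c("patientId", "treatment"))

# Step 2: cast. One row per patientId, one column for each
# treatment / variable combination.
wide_df <- dcast(molten, patientId ~ treatment + variable, value.var = "value")

print(colnames(wide_df))
```

Naming the id columns explicitly in melt, as done here, removes the dependence on patientId being a factor.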

This is just the bare bones of recast; there are many more options and possibilities. Several other packages (notably tidyr) can also make this conversion and are arguably more powerful. However, to my eye they're much more complicated. For simple cases, the above works just fine. Note that many of the examples scattered across the web assume there's only one measurement column.
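
For comparison, a rough tidyr equivalent with two measurement columns might look like the sketch below. It assumes tidyr (version 1.0 or later, which provides pivot_wider) is installed. The data are the same invented values as before, and the names_glue pattern is one way to get treatment-first column names like those recast produces.

```r
library(tidyr)

# Hypothetical example data (values are made up for illustration).
long_df <- data.frame(
  patientId = factor(c(1, 1, 1, 2, 2, 2)),
  treatment = rep(c("control", "drugA", "drugB"), times = 2),
  cells     = c(120, 95, 80, 130, 100, 90),
  temp      = c(36.8, 37.1, 37.4, 36.6, 37.0, 37.2)
)

# Spread both measurement columns at once. names_glue builds each new
# column name from the treatment and the measurement (".value").
wide_tbl <- pivot_wider(
  long_df,
  id_cols     = patientId,
  names_from  = treatment,
  values_from = c(cells, temp),
  names_glue  = "{treatment}_{.value}"
)

print(colnames(wide_tbl))
```

The result is a tibble rather than a plain data frame, and the column order may differ from the recast output.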