Long and wide matrices
There are two broad ways data can be stored in a matrix. In "wide" format, everything for a given subject / record is on a single line / row. For example:
    patientId  control_cells  control_temp  drugA_cells  drugA_temp  drugB_cells  drugB_temp
    1          ...
    2          ...
    ...
The "long" format, by contrast, has multiple rows per subject, one for each condition. For example:
    patientId  treatment  cells  temp
    1          control    ...
    1          drugA      ...
    1          drugB      ...
    2          control    ...
    ...
Wide is generally more useful than long. (It's a good rule of thumb to have a record per line, although what you consider to be a record may vary.) So how do you convert between the two? Easy. First, let's make a long matrix::
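A minimal sketch of such a data frame, assuming two patients and three treatments. The measurement values are invented purely for illustration; the column names follow the long-format example above.

```r
# Hypothetical example data: every number below is made up for illustration.
long_df <- data.frame(
  patientId = c(1, 1, 1, 2, 2, 2),
  treatment = rep(c("control", "drugA", "drugB"), times = 2),
  cells     = c(120, 95, 80, 130, 100, 90),
  temp      = c(36.8, 37.1, 37.4, 36.6, 37.0, 37.2)
)

# One row per patient / treatment combination.
print(long_df)
```

At this point patientId is still numeric.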
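Note that the row / subject id has to be a factor, because the conversion below treats factor and character columns as identifiers and every other column as a measurement::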
long_df$patientId <- factor (long_df$patientId)
Then use "recast" from the "reshape2" package. The parameters are:
- the data frame (I haven't checked whether this works with a matrix; you may have to convert it first)
- a formula showing how the data is grouped, where '...' means 'everything else'
This assumes that everything else is a measurement column. The new columns are named by joining each treatment to each measurement name, for example control_cells:
library (reshape2)
wide_df <- recast (long_df, patientId ~ treatment + ...)
print (wide_df)
print (colnames (wide_df))
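To see what recast is doing, the same conversion can be split into the two reshape2 steps it wraps, melt and dcast. This is only a sketch: it assumes reshape2 is installed, uses invented data, and passes the id columns to melt explicitly rather than letting them be guessed.

```r
library(reshape2)

# Hypothetical example data: every number below is made up for illustration.
long_df <- data.frame(
  patientId = factor(c(1, 1, 1, 2, 2, 2)),
  treatment = rep(c("control", "drugA", "drugB"), times = 2),
  cells     = c(120, 95, 80, 130, 100, 90),
  temp      = c(36.8, 37.1, 37.4, 36.6, 37.0, 37.2)
)

# Step 1: melt to one row per (patientId, treatment, measurement).
# The measurement name goes into "variable" and its number into "value".
molten <- melt(long_df, id.vars = c("patientId", "treatment"))

# Step 2: cast to one row per patient, with one column for each
# treatment / measurement combination.
wide_df <- dcast(molten, patientId ~ treatment + variable)

print(colnames(wide_df))
```

Here the '...' from the recast formula is written out as the "variable" column that melt creates.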
Now this is just the bare bones of recast, and there are lots of options and possibilities. Several other libraries (notably tidyr) can also make this conversion and are arguably more powerful. However, to my eye they're much more complicated. For simple cases, the above works just fine. Note that many of the examples scattered across the web also assume there's only one measurement column.
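For comparison, a rough tidyr equivalent with two measurement columns might look like the sketch below. It assumes tidyr 1.0.0 or later (for pivot_wider) and again uses invented data. The names_glue argument is set so that the columns come out as treatment_measurement, matching the wide example at the top.

```r
library(tidyr)

# Hypothetical example data: every number below is made up for illustration.
long_df <- data.frame(
  patientId = factor(c(1, 1, 1, 2, 2, 2)),
  treatment = rep(c("control", "drugA", "drugB"), times = 2),
  cells     = c(120, 95, 80, 130, 100, 90),
  temp      = c(36.8, 37.1, 37.4, 36.6, 37.0, 37.2)
)

# Spread both measurement columns across the treatments in one call.
wide_tidy <- pivot_wider(
  long_df,
  names_from  = treatment,
  values_from = c(cells, temp),
  names_glue  = "{treatment}_{.value}"
)

print(colnames(wide_tidy))
```

Unlike the recast call, the measurement columns have to be listed explicitly here.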