I will try to answer:
Q1) If you can run all samples in one go (and have an equal number of samples in each organ), I would do that. Otherwise you can follow the approach used in the recent CHESS paper, where the authors of StringTie first merged within each organ and then merged across organs (due to computational limits).
Q2) I interpret this as asking why you would want to merge. You merge because, in the end, you want the same set of transcripts/genes quantified in all your samples. This is necessary for two reasons: 1) otherwise you do not know which transcript in one sample corresponds to which transcript in another sample, so you cannot compare the two samples; 2) if the set of transcripts quantified is not the same, you introduce a systematic bias in both samples.
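To make the two-stage merge in Q1 concrete, here is a minimal sketch in Python that only builds the `stringtie --merge` command lines (first per organ, then across organs) without running them. The helper names, the file names and the organ layout are all made up for illustration, and I have not checked this against the exact procedure used in the CHESS paper, so treat it as an outline rather than a recipe.

```python
import shlex


def merge_command(gtfs, out_gtf, reference_gtf=None):
    """Build one `stringtie --merge` call (hypothetical helper)."""
    cmd = ["stringtie", "--merge"]
    if reference_gtf:
        # -G supplies a reference annotation to guide the merge
        cmd += ["-G", reference_gtf]
    cmd += ["-o", out_gtf]
    cmd += list(gtfs)
    return cmd


def two_stage_merge_plan(gtfs_by_organ, reference_gtf=None,
                         final_gtf="all_organs.merged.gtf"):
    """Return the merge commands: one per organ, then one across organs."""
    commands = []
    organ_level = []
    for organ in sorted(gtfs_by_organ):
        out = f"{organ}.merged.gtf"  # assumed naming scheme
        commands.append(merge_command(gtfs_by_organ[organ], out, reference_gtf))
        organ_level.append(out)
    # Second stage: merge the per-organ transcriptomes into one
    commands.append(merge_command(organ_level, final_gtf, reference_gtf))
    return commands


# Example layout (file names are placeholders)
example = {
    "liver": ["liver_1.gtf", "liver_2.gtf"],
    "brain": ["brain_1.gtf", "brain_2.gtf"],
}
plan = two_stage_merge_plan(example, reference_gtf="annotation.gtf")
for cmd in plan:
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or each list can be passed to `subprocess.run(cmd, check=True)` if StringTie is on your PATH. If you can merge everything in one go, a single call to `merge_command` with all the GTF files does the same job.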
Q3) You need to re-quantify all your samples using the combined transcriptome. Follow the instructions in the guide here and you should be fine.
- If you want to do differential expression, I suggest using tximport to get the data into R rather than the script StringTie supplies via its homepage. Using tximport, you can follow this DE analysis guide.
- If you are interested, this data can also be used to identify and analyze isoform switches with predicted functional consequences using my R package IsoformSwitchAnalyzeR. For examples of what type of analysis can be done, take a look at this section of the vignette.
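For the re-quantification in Q3, the idea can be sketched like this: every sample is run through StringTie again with `-e` (only estimate the transcripts given with `-G`) against the same merged GTF, so all samples end up with the same transcript set. I add `-B` so the Ballgown-style table files are written next to each output GTF, which as far as I know is what tximport reads for StringTie output. The helper name, directory layout and file names below are assumptions for illustration only.

```python
import shlex
from pathlib import Path


def requant_command(bam, merged_gtf, out_dir, threads=1):
    """Build one StringTie re-quantification call (hypothetical helper)."""
    sample = Path(bam).stem                 # e.g. "liver_1" from "liver_1.bam"
    sample_dir = Path(out_dir) / sample     # one output folder per sample
    return [
        "stringtie",
        "-e",                               # quantify only transcripts in -G
        "-B",                               # also write Ballgown table files
        "-p", str(threads),
        "-G", str(merged_gtf),              # the combined transcriptome
        "-o", (sample_dir / f"{sample}.gtf").as_posix(),
        str(bam),
    ]


# Placeholder alignment files; every sample uses the same merged GTF
bams = ["aln/liver_1.bam", "aln/brain_1.bam"]
requant_plan = [
    requant_command(b, "all_organs.merged.gtf", "quant", threads=4)
    for b in bams
]
for cmd in requant_plan:
    print(shlex.join(cmd))
```

The important point is that the `-G` argument is identical for every sample; the per-sample folders under `quant/` are then what you would point tximport at.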