Question: How best to use the transcript merge mode of tools such as StringTie?
3 months ago, jaidev53ster wrote:

As per the manual, StringTie can be run with --merge to generate a non-redundant set of transcripts observed in all the RNA-Seq samples assembled previously. The stringtie --merge mode takes as input a list of all the assembled transcript files (in GTF format) previously obtained for each sample, as well as a reference annotation file (-G option) if available. I have a few questions about this.

  1. I have 60 samples from rat: 10 different organs, with 6 technical replicates for each organ. What is the best way to --merge: all 60 samples at once, or a separate --merge for each organ?
  2. What is the significance of --merge? An example would help.
  3. What is the next step after merging?


Tags: rna-seq, next-gen
3 months ago, kristoffer.vittingseerup (European Union) wrote:

I will try to answer:

Q1) If you can run all samples in one go (and have an equal number of samples for each organ), I would do that. Otherwise you can follow the approach used in the recent CHESS paper, where the authors of StringTie first merged within each organ and then merged across organs (because of computational limits).
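
To make the two options concrete, here is a minimal sketch in Python that only builds the `stringtie --merge` command lines and prints them (it does not run anything). The file names, the organ labels, the reference annotation path and the helper `merge_cmd` are all hypothetical placeholders; only `--merge`, `-G` and `-o` come from the StringTie manual.

```python
import shlex

def merge_cmd(gtfs, out_gtf, ref_gtf=None):
    """Build a `stringtie --merge` command line (hypothetical helper)."""
    cmd = ["stringtie", "--merge"]
    if ref_gtf is not None:
        cmd += ["-G", ref_gtf]      # reference annotation, if available
    cmd += ["-o", out_gtf]          # merged GTF to write
    cmd += list(gtfs)               # per-sample assembled GTFs
    return cmd

# Assumed layout: one assembled GTF per sample, named <organ>_rep<N>.gtf.
organs = ["liver", "heart"]         # placeholders; the real data has 10 organs
samples = {o: [f"{o}_rep{i}.gtf" for i in range(1, 7)] for o in organs}
ref = "reference.gtf"               # assumed reference annotation

# Option A: merge all samples in one go.
all_gtfs = [g for o in organs for g in samples[o]]
one_pass = merge_cmd(all_gtfs, "merged_all.gtf", ref)

# Option B: merge within each organ first, then merge the per-organ results.
per_organ = {o: merge_cmd(samples[o], f"{o}_merged.gtf", ref) for o in organs}
across = merge_cmd([f"{o}_merged.gtf" for o in organs], "merged_all.gtf", ref)

for cmd in [one_pass, *per_organ.values(), across]:
    print(shlex.join(cmd))
```

Either route ends with a single merged GTF (`merged_all.gtf` here), which is what the later steps need.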

Q2) I interpret this as "why would you want to merge?". You merge because, in the end, you want the same set of transcripts/genes quantified in all your samples. This is necessary for two reasons: 1) otherwise you do not know which transcript in one sample corresponds to which transcript in another, so you cannot compare the two samples; 2) if the set of quantified transcripts differs between samples, you introduce a systematic bias in both samples.

Q3) You need to re-quantify all your samples using the merged transcriptome. Follow the instructions/guide here and you should be fine.
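
As a rough illustration of the re-quantification step, the sketch below builds one `stringtie` command per sample against the merged GTF, again only printing the commands. It assumes one aligned BAM per sample; the BAM names, the output directory `quant` and the helper `requant_cmd` are hypothetical. The options used are `-e` (estimate abundances only for the transcripts given with `-G`), `-B` (also write Ballgown table files next to the output GTF), `-G` and `-o`.

```python
import shlex
from pathlib import Path

def requant_cmd(bam, merged_gtf, out_dir):
    """Build a StringTie re-quantification command (hypothetical helper)."""
    sample = Path(bam).stem
    # One sub-directory per sample, so the -B table files do not overwrite each other.
    out_gtf = Path(out_dir) / sample / f"{sample}.gtf"
    return ["stringtie", "-e", "-B", "-G", merged_gtf,
            "-o", out_gtf.as_posix(), bam]

# Assumed inputs: aligned, sorted BAM files named <organ>_rep<N>.bam.
bams = [f"liver_rep{i}.bam" for i in range(1, 4)]

commands = [requant_cmd(b, "merged_all.gtf", "quant") for b in bams]
for cmd in commands:
    print(shlex.join(cmd))
```

Because every sample is quantified against the same `merged_all.gtf`, the transcript IDs match across samples, which is the point made under Q2.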

Note that:

  1. If you want to do differential expression, I suggest using tximport to get the data into R rather than the script the StringTie authors supply on their homepage. Using tximport you can follow this DE analysis guide.
  2. If you are interested, this data can also be used to identify and analyze isoform switches with predicted functional consequences using my R package IsoformSwitchAnalyzeR. For examples of the type of analysis that can be done, take a look at this section of the vignette.

Thank you for this helpful explanation and suggestion too.

Reply written 3 months ago by jaidev53ster

No problem. If you like it, you can always give it a thumbs up :-)

Reply written 3 months ago by kristoffer.vittingseerup

Sorry, I just forgot, I would love to do that. Thank you.

Reply written 3 months ago by jaidev53ster