I am currently working on RNA-seq data, and my goal is to identify novel transcripts arising from the perturbation of a splicing factor. Several of my questions remain unanswered after reading papers and forum posts, so I am asking here:
1) STAR can be run in a per-sample or a multi-sample way. Its author said he would run the multi-sample analysis with all conditions together. Is it appropriate to pool all my samples when one of the treatments disturbs splicing? Or should I instead run junction discovery and mapping per sample, to avoid multimapping of reads caused by new splice junctions that (I guess) will mostly be a consequence of my splicing-perturbing conditions?
2) From what I have read, StringTie is commonly used as follows: a per-sample assembly, then a merging step across all samples. These two questions are closely related: could I simply rely on a multi-sample 2-pass mapping with STAR and then directly assemble the all-in-one multi.bam it generates?
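To make question 1 concrete, here is a minimal Python sketch of how I picture the pooling of first-pass junctions, either per condition or across all samples. It assumes the standard STAR `SJ.out.tab` column layout (chromosome, intron start, intron end, strand code, motif, annotated flag, unique reads, multimapping reads, max overhang). The function name `pool_novel_junctions`, the sample names, and both thresholds are hypothetical choices of mine, not STAR defaults.

```python
from collections import defaultdict


def pool_novel_junctions(sj_tables, min_unique_reads=3, min_samples=2):
    """Pool unannotated junctions from several first-pass SJ.out.tab tables.

    sj_tables maps a sample name to an iterable of SJ.out.tab lines.
    A junction is kept if it is unannotated (column 6 == 0) and has at
    least min_unique_reads uniquely mapped reads (column 7) in at least
    min_samples samples. Both thresholds are illustrative assumptions.
    """
    support = defaultdict(set)
    for sample, lines in sj_tables.items():
        for line in lines:
            fields = line.split()
            if len(fields) < 9:
                continue  # skip blank or malformed lines
            chrom, start, end, strand = fields[0], int(fields[1]), int(fields[2]), fields[3]
            annotated, unique_reads = fields[5], int(fields[6])
            if annotated == "0" and unique_reads >= min_unique_reads:
                support[(chrom, start, end, strand)].add(sample)
    return sorted(j for j, samples in support.items() if len(samples) >= min_samples)


# Toy first-pass output for two control and two perturbed (hypothetical) samples.
tables = {
    "ctrl1": ["chr1\t100\t200\t1\t1\t0\t5\t0\t30"],
    "ctrl2": ["chr1\t100\t200\t1\t1\t0\t4\t0\t28"],
    "kd1": ["chr1\t100\t200\t1\t1\t0\t6\t0\t31", "chr1\t300\t400\t1\t1\t0\t10\t1\t40"],
    "kd2": ["chr1\t300\t400\t1\t1\t0\t8\t0\t35", "chr1\t500\t600\t1\t1\t1\t50\t2\t45"],
}

control_only = pool_novel_junctions({k: v for k, v in tables.items() if k.startswith("ctrl")})
perturbed_only = pool_novel_junctions({k: v for k, v in tables.items() if k.startswith("kd")})
all_samples = pool_novel_junctions(tables)

print("control pool:  ", control_only)
print("perturbed pool:", perturbed_only)
print("all samples:   ", all_samples)
```

As far as I understand, the pooled list would then be given to the second mapping pass through `--sjdbFileChrStartEnd`, so the choice between per-condition and all-sample pooling determines which junctions every sample is realigned against.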
To summarize, I don't understand the criteria for choosing a per-sample or a multi-sample procedure, and I am interested in the justification behind each.
In my view, the better approach would be to separate the splicing-perturbing conditions from the others and run the same procedure on each group in parallel, up to a GFF comparison that determines which novel transcripts arise from "normal" conditions and which from splicing-perturbing conditions.
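To illustrate the final comparison step I have in mind, here is a minimal Python sketch that reduces each assembled GTF to its set of intron chains and reports the chains seen only in the perturbed assembly. The function names and the toy records are hypothetical, and mono-exonic transcripts are ignored. In practice a dedicated tool such as gffcompare would likely be more robust; this is only meant to show the logic.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def intron_chains(gtf_lines):
    """Return the set of (chrom, strand, introns) chains from GTF exon records.

    introns is a tuple of (start, end) pairs in 1-based inclusive coordinates.
    Transcripts with a single exon have no introns and are skipped.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        key = (fields[0], fields[6], match.group(1))  # chrom, strand, transcript
        exons[key].append((int(fields[3]), int(fields[4])))

    chains = set()
    for (chrom, strand, _tid), coords in exons.items():
        coords.sort()
        # An intron spans from the base after one exon to the base before the next.
        introns = tuple(
            (coords[i][1] + 1, coords[i + 1][0] - 1) for i in range(len(coords) - 1)
        )
        if introns:
            chains.add((chrom, strand, introns))
    return chains


def exon(tid, start, end):
    """Build one toy GTF exon line (hypothetical helper for the example)."""
    return f'chr1\tStringTie\texon\t{start}\t{end}\t.\t+\t.\tgene_id "G1"; transcript_id "{tid}";'


control_gtf = [exon("T1", 100, 200), exon("T1", 300, 400)]
perturbed_gtf = [
    exon("T1", 100, 200), exon("T1", 300, 400),
    exon("T2", 100, 200), exon("T2", 500, 600),
]

# Chains present in the perturbed assembly but absent from the control one.
perturbed_specific = intron_chains(perturbed_gtf) - intron_chains(control_gtf)
print(perturbed_specific)
```

The same set difference against the reference annotation would separate novel chains from known ones before comparing the two groups.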