I'm doing a two-pass mapping of RNA-seq reads to an mm10 index using HISAT2 in a scripted shell loop. My question is about combining the --novel-splicesite-outfile splice site lists from several biological replicates into a single --novel-splicesite-infile that is used for all of the replicates in the second pass.
I've read somewhere that it is useful to combine the --novel-splicesite-outfile files written for the entire dataset during the first pass, remove duplicates, and use this expanded splice site file as the --novel-splicesite-infile for every replicate in the second pass.
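To make it concrete, here is a minimal sketch of the merge step I have in mind. It assumes each first-pass run wrote its own --novel-splicesite-outfile as plain text with one splice site per line. The function name merge_splicesites and all file names are placeholders of mine, not anything that comes with HISAT2.

```shell
# Merge per-replicate splice site lists into one de-duplicated file.
# Usage: merge_splicesites <merged_output> <pass1_file> [<pass1_file> ...]
# Assumption: every input has one splice site per line, so identical
# lines are identical sites and whole-line de-duplication is enough.
merge_splicesites() {
    out=$1
    shift
    # LC_ALL=C gives a byte-wise, locale-independent sort order;
    # -u keeps a single copy of each distinct line.
    LC_ALL=C sort -u "$@" > "$out"
}

# Hypothetical use between the two passes (file names are made up):
#   merge_splicesites all_novel_ss.txt pass1/*.novel_ss.txt
#   hisat2 -x mm10_index -U rep1.fastq.gz \
#          --novel-splicesite-infile all_novel_ss.txt -S rep1.pass2.sam
```

The merged file would then be given to every second-pass run in the loop in place of the per-replicate lists.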
The manual doesn't state it clearly (to me at least), but I think this pooling of splice sites happens automatically when you feed all of the replicates into a single hisat2 run as one multi-file read input. Am I right?
best regards, K