Clumpify then Salmon?
11 weeks ago
JMB ▴ 20

Hi,

I have a dataset that I would like to analyze using Salmon. The data is paired-end Illumina and has a very high duplication rate of 90-95%, as determined by MarkDuplicates in Picard (small genome, oversequenced). I would like to remove duplicates with Clumpify before running Salmon. I know Clumpify does some sorting to reduce file sizes, and I want to make sure this will not interfere with Salmon, which I know does not want data sorted by coordinates. I'm assuming Clumpify is not sorting this way, but I want to make sure I'm not missing anything. Thanks!

Salmon RNA-seq Clumpify • 290 views
11 weeks ago
GenoMax 141k

No, Clumpify does not sort by coordinates. It is an alignment-free deduplication program, so it looks only at the read sequences and never at mapping positions. Reordering of the clumps is off by default (reorder=f), and turning it on only helps compression.
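
As a rough sketch of the dedupe-then-quantify workflow discussed here, the Python helper below only builds the two command lines. It assumes clumpify.sh (BBTools) and salmon are on the PATH. All file names and the index path are hypothetical placeholders, and the flags should be checked against the help text of the installed versions before running anything.

```python
import shlex


def clumpify_dedupe_cmd(r1, r2, out1, out2):
    """Build a clumpify.sh call that removes duplicate read pairs.

    The reorder option is deliberately not set, so it stays at its default.
    """
    return [
        "clumpify.sh",
        f"in={r1}",
        f"in2={r2}",
        f"out={out1}",
        f"out2={out2}",
        "dedupe=t",
    ]


def salmon_quant_cmd(index, r1, r2, outdir):
    """Build a salmon quant call for paired-end reads with automatic library type."""
    return ["salmon", "quant", "-i", index, "-l", "A", "-1", r1, "-2", r2, "-o", outdir]


if __name__ == "__main__":
    # Hypothetical file names and index path, for illustration only.
    dedupe = clumpify_dedupe_cmd(
        "sample_R1.fastq.gz", "sample_R2.fastq.gz",
        "dedup_R1.fastq.gz", "dedup_R2.fastq.gz",
    )
    quant = salmon_quant_cmd(
        "salmon_index", "dedup_R1.fastq.gz", "dedup_R2.fastq.gz", "salmon_out",
    )
    # Print the commands instead of running them, so they can be reviewed first.
    print(shlex.join(dedupe))
    print(shlex.join(quant))
```

The commands are printed rather than executed so they can be inspected first; passing each list to subprocess.run(cmd, check=True) would run them in order.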
