Should we remove duplicated reads from metagenomic data?
6.1 years ago
liuyifan2014 ▴ 80

Hi everyone, do you usually filter out duplicates when preprocessing your raw metagenomic data? Since some organisms carry multiple copies of the same gene (like 16S), I am afraid that removing duplicates will lose some information.


sequencing next-gen gene PRINSEQ duplicates • 2.2k views

As you have correctly identified, there are arguments for and against removing dupes in metagenomic sequencing. Whichever path you choose, you're going to hear all about the other path from reviewers, lab members, etc. The best advice is to just look at both.
Personally, I would also look at the data both with and without a filter for properly-mapped reads. Depending on the tools you use, this shouldn't even increase the processing time significantly :) There is also no need to treat every set the same after filtering: if the insert length without duplicates differs from the insert length with duplicates, fine. You're taking different views of the same data, not running different experiments, so it's OK that their analyses differ slightly. The point isn't to compare the views against each other, after all.
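
To make "look at both" concrete, here is a minimal Python sketch that keeps the full read set and an exact-duplicate-free read set side by side and reports how much of the data is duplicated. It is only an illustration under some assumptions: duplicates are defined as reads with identical sequence (tools such as PRINSEQ offer other definitions), the input is plain 4-line FASTQ, and the function names `read_fastq` and `split_duplicates` are made up for this example.

```python
from collections import Counter
from io import StringIO


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (a blank header line also stops parsing)
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def split_duplicates(records):
    """Return (all_records, unique_records, counts).

    Assumption: a duplicate is a read whose sequence exactly matches
    an earlier read; the first occurrence is the one that is kept.
    """
    all_records = list(records)
    counts = Counter(seq for _, seq, _ in all_records)
    seen = set()
    unique_records = []
    for record in all_records:
        if record[1] not in seen:
            seen.add(record[1])
            unique_records.append(record)
    return all_records, unique_records, counts


# Tiny in-memory example: r1 and r2 share the same sequence.
example = StringIO(
    "@r1\nACGTACGT\n+\nIIIIIIII\n"
    "@r2\nACGTACGT\n+\nIIIIIIII\n"
    "@r3\nTTGGCCAA\n+\nIIIIIIII\n"
)
all_reads, unique_reads, counts = split_duplicates(read_fastq(example))
duplicate_fraction = 1 - len(unique_reads) / len(all_reads)
print(len(all_reads), len(unique_reads), round(duplicate_fraction, 3))
```

Both `all_reads` and `unique_reads` can then go through the same downstream steps, and `counts` shows which sequences were collapsed, which helps when checking whether multi-copy genes such as 16S are the ones being affected.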

