Question: Filtering human reads in metagenomics: should supplementary reads be removed?
ciemanek (The Netherlands/Amsterdam) wrote, 14 months ago:

I want to filter out human 'contaminants' from my metagenomic sample. I mapped my reads to a human genome and am filtering them with samtools. So far, I only filter out reads that are flagged as mapped; should I also add a flag to remove supplementary reads when filtering? I can't wrap my head around what supplementary reads are and what they actually mean in terms of filtering.


It is always a good idea to remove reads that may map to the host(s), e.g. human, before performing de novo assembly in a metagenomic analysis. However, I am not sure what you mean by "supplementary reads". Can you explain what you mean by that?

written 14 months ago by Sej Modha

If it refers to "What's Supplementary Reads?", then they should be removed.

ciemanek: Have you looked at the bbsplit.sh/removehuman.sh tools from the BBMap suite?

written 14 months ago by genomax

What I mean are reads that bwa mem marks as supplementary. samtools flagstat lists them as shown below:

4086053 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
482807 + 0 supplementary
0 + 0 duplicates
964819 + 0 mapped (23.61% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

What I wonder is whether the supplementary flag tells me anything about whether a read should be removed. As I understand it, these are reads that do not align in full to a single region of the reference; instead, different parts of the read map to different positions. In that case, can we say that the read really comes from human DNA? And aren't reads flagged as 'supplementary' already counted as 'mapped'? I am not sure which flag I should use to filter out reads mapping to human.

I have not looked at BBMap, as we want to use bwa + samtools, which are already in our pipeline.

written 14 months ago by ciemanek
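
On the question of whether 'supplementary' is already part of 'mapped': a small arithmetic sketch using the flagstat counts above may help. It assumes (as samtools flagstat is generally understood to work) that both the "in total" and "mapped" lines count alignment records rather than reads, so every supplementary record is included in both, and that each unpaired read has exactly one primary record. The variable names are illustrative only.

```python
# Counts copied from the flagstat output quoted above.
total_records = 4086053   # "in total"
secondary = 0             # "secondary"
supplementary = 482807    # "supplementary"
mapped_records = 964819   # "mapped"

# Assumption: "in total" and "mapped" count alignment records, so
# supplementary (and secondary) records are included in both numbers.
primary_records = total_records - secondary - supplementary
mapped_primary = mapped_records - secondary - supplementary

# Percentage reported by flagstat (per record) versus per primary read.
pct_mapped_records = 100 * mapped_records / total_records
pct_mapped_reads = 100 * mapped_primary / primary_records

print(f"primary records (reads): {primary_records}")
print(f"mapped primary records:  {mapped_primary}")
print(f"mapped, per record:      {pct_mapped_records:.2f}%")
print(f"mapped, per read:        {pct_mapped_reads:.2f}%")
```

Under these assumptions, the 23.61% reported by flagstat is a per-record figure; the share of distinct reads with a human alignment is lower, because a single chimeric read can contribute one primary and several supplementary records.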

If it maps to the human genome, then it should be removed.

written 14 months ago by genomax
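
To make the flag logic concrete, here is a minimal sketch of how the relevant SAM flag bits interact. The bit values (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM specification; the function names and the example flag values are hypothetical. The point it illustrates is that a supplementary record is itself an alignment, so it never carries the unmapped bit. Keeping only unmapped records, for example with something like `samtools view -f 4` on unpaired data, would therefore already drop supplementary records along with the mapped primary record of the same read, without a separate supplementary filter.

```python
# SAM flag bits as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def describe(flag: int) -> str:
    """Return a short label for an alignment record's flag (illustrative helper)."""
    if flag & UNMAPPED:
        return "unmapped"
    if flag & SUPPLEMENTARY:
        return "mapped (supplementary)"
    if flag & SECONDARY:
        return "mapped (secondary)"
    return "mapped (primary)"


def keep_as_non_host(flag: int) -> bool:
    """Keep a record only if it did not align to the host reference."""
    return bool(flag & UNMAPPED)


# Hypothetical flags for unpaired reads: unmapped, forward primary,
# reverse primary, forward supplementary, reverse supplementary.
example_flags = [4, 0, 16, 2048, 2064]

for flag in example_flags:
    print(flag, describe(flag), "keep" if keep_as_non_host(flag) else "drop")

kept = [flag for flag in example_flags if keep_as_non_host(flag)]
```

For paired-end data the choice is less clear-cut (for instance, whether to require both mates to be unmapped), so this sketch only covers the unpaired case shown in the flagstat output.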

Thanks a lot for the answer :)

written 14 months ago by ciemanek