Question: Filtering human reads in metagenomics: should supplementary reads be removed?
Written 4 months ago by ciemanek (130), The Netherlands/Amsterdam

I want to filter out human 'contaminants' from my metagenomic sample. I mapped my reads to a human genome and I am filtering them with samtools. So far, I only filter out reads flagged as mapped, but should I also add a flag to remove supplementary reads while filtering? I can't wrap my head around what supplementary reads are and what they mean for filtering.


It is always a good idea to remove reads that may map to the host(s), e.g. human, before performing de novo assembly in a metagenomic analysis. However, I am not sure what you mean by "supplementary reads". Can you explain what you mean by that?

modified 4 months ago • written 4 months ago by Sej Modha (4.5k)

If it refers to "What's Supplementary Reads?" then they should be removed.

ciemanek: Have you looked at the bbsplit.sh/removehuman.sh tools from the BBMap suite?

written 4 months ago by genomax (74k)

What I mean are reads that bwa mem marks as supplementary. samtools flagstat lists them as below:

4086053 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
482807 + 0 supplementary
0 + 0 duplicates
964819 + 0 mapped (23.61% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

What I wonder is what the fact that a read is supplementary tells me about whether it should be removed. I understand that these are reads that do not align in full to a single region of the reference; instead, different parts of the read map to different positions. Can we, in such a case, say that the read really comes from human DNA? And aren't reads flagged as 'supplementary' already counted as 'mapped'? I am not sure which flag I should use to filter out reads mapping to human.

I have not looked at BBMap, as we wanted to use bwa + samtools since we already have them in our pipeline.

modified 4 months ago • written 4 months ago by ciemanek (130)
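
On the question of whether supplementary alignments are already part of 'mapped', the flagstat numbers above can be checked with a little arithmetic. This is only a sketch under stated assumptions: that flagstat counts alignment records rather than reads, that its 'mapped' line includes supplementary records (as in samtools versions that print no separate 'primary' lines), and that the data are single-end, so each read has exactly one primary record. The function name `primary_counts` is made up for illustration.

```python
# Counts copied from the flagstat output quoted in the thread.
TOTAL_RECORDS = 4086053
SECONDARY = 0
SUPPLEMENTARY = 482807
MAPPED_RECORDS = 964819


def primary_counts(total, secondary, supplementary, mapped):
    """Return (reads, mapped_reads, unmapped_reads) for single-end data.

    Assumes every secondary/supplementary record is itself counted as
    mapped, so subtracting them leaves one primary record per read.
    """
    extra = secondary + supplementary
    reads = total - extra            # one primary record per read
    mapped_reads = mapped - extra    # primary records that are mapped
    return reads, mapped_reads, reads - mapped_reads


reads, mapped_reads, unmapped_reads = primary_counts(
    TOTAL_RECORDS, SECONDARY, SUPPLEMENTARY, MAPPED_RECORDS
)
print("reads:", reads)
print("reads mapped to human:", mapped_reads)
print("reads not mapped:", unmapped_reads)
```

Under those assumptions, 3603246 reads went into the mapping, 482012 of them have a primary alignment to human, and 3121234 are unmapped. The 23.61% reported by flagstat is a fraction of alignment records, not of reads.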

If it maps to the human genome, then it should be removed.

written 4 months ago by genomax (74k)
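
To make the flag logic behind that advice concrete, here is a minimal sketch in Python, not the pipeline anyone in this thread actually runs. It relies on the SAM flag bits 0x4 (read unmapped) and 0x800 (supplementary alignment). A supplementary record is by definition an aligned piece of a read, so it never carries the unmapped bit; keeping only records with 0x4 set therefore drops supplementary records along with every other human hit. The function `keep_non_human` and the toy records are hypothetical.

```python
UNMAPPED = 0x4          # SAM flag bit: read did not align
SUPPLEMENTARY = 0x800   # SAM flag bit: supplementary (chimeric) alignment


def keep_non_human(sam_lines):
    """Yield SAM records whose read did not map to the human reference."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        flag = int(line.split("\t")[1])
        if flag & UNMAPPED:
            yield line


# Toy single-end records (hypothetical): r1 is split across two loci,
# r2 is unmapped, r3 maps in full on the reverse strand.
records = [
    "r1\t0\tchr1\t100\t60\t6M4S\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r1\t2048\tchr2\t500\t60\t6H4M\t*\t0\t0\tGTAC\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTGGGGCC\tIIIIIIIIII",
    "r3\t16\tchr3\t42\t30\t10M\t*\t0\t0\tCCCCAAAATT\tIIIIIIIIII",
]

kept = [line.split("\t")[0] for line in keep_non_human(records)]
print(kept)
```

With samtools itself, `samtools view -f 4` selects the same records for single-end data like this. Paired-end data would need a decision about pairs in which only one mate maps, which this sketch does not cover.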

Thanks a lot for the answer :)

written 4 months ago by ciemanek (130)