Question: Remove contamination before assembly
2.1 years ago by
bruseq40
bruseq40 wrote:

Hello everyone,

I am trying to assemble a draft fungal genome, but the raw paired-end reads (R1 and R2) contain contamination. Before assembly, I mapped the raw reads with bowtie2 against the RefSeq bacterial and mitochondrial databases, then extracted the unmapped reads with samtools (I expected the unmapped reads to be contamination-free). I then assembled them with MaSuRCA and got the final.genome.sacffolds.fasta file. However, when I ran NCBI online BLAST against the NR database with the assembled FASTA file, the hits showed approx. 93% similarity with bacteria.

So, please explain why the assembled FASTA still shows similarity with bacteria after I removed the bacterial reads in the previous mapping step using the RefSeq database.

Is there any other way to remove contamination from raw reads before assembly? Please guide me.

Thank you, Divya

2.1 years ago by
genomax92k
United States
genomax92k wrote:

"But again I did ncbi online blast against NR db with assembled fasta file, they show approx. 93% similarity with bacteria."

Did you by chance remove the reads you need and leave the bacterial reads in?
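
One way to check which side of the split was kept is to look at the SAM flag of each read. The sketch below is only an illustration, not the asker's actual command: it encodes the rule that a pair counts as "unmapped" when both the read and its mate failed to align, which corresponds to the commonly used `samtools view -f 12 -F 256` filter. The function name `pair_is_unmapped` is made up for this example; the bit values come from the SAM specification.

```python
# SAM flag bits, as defined in the SAM specification.
READ_UNMAPPED = 0x4    # this read did not align
MATE_UNMAPPED = 0x8    # its mate did not align
SECONDARY = 0x100      # secondary alignment record


def pair_is_unmapped(flag: int) -> bool:
    """Return True when both mates are unmapped and the record is primary.

    This mirrors the selection `-f 12 -F 256`: require bits 4 and 8,
    reject bit 256. (Hypothetical helper, for illustration only.)
    """
    both = READ_UNMAPPED | MATE_UNMAPPED
    return (flag & both) == both and not (flag & SECONDARY)


# Flags as they might appear in the second column of a SAM file.
for flag in (77, 141, 99, 73, 69):
    print(flag, pair_is_unmapped(flag))
```

If the reads passed to the assembler are the ones for which this returns False (the mapped pairs), then the bacterial reads were kept and the fungal reads were discarded, which would explain the BLAST result.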

Is your data so badly contaminated that you feel the need to align against the entire RefSeq bacterial DB? Can't you use a specific bacterium (or two)? Otherwise, use a related fungal genome to try to capture the reads you need and leave the rest behind. I suggest using bbsplit.sh from the BBMap suite for that purpose.
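
As a rough sketch of what such a bbsplit.sh call could look like, the snippet below only builds the command line and prints it; nothing is executed. All file names are placeholders, and the helper `bbsplit_command` is hypothetical. The parameters (`in1`, `in2`, `ref`, `basename`, `outu1`, `outu2`) follow the usage example in the BBMap documentation, so check them against your installed version before running.

```python
import shlex


def bbsplit_command(reads1, reads2, references, clean1, clean2,
                    basename="out_%.fq"):
    """Build a bbsplit.sh argument list (hypothetical helper, runs nothing).

    `references` is a list of FASTA files; bbsplit.sh bins each read pair
    to the reference it matches best. `%` in `basename` is replaced by the
    reference name; reads matching no reference go to outu1/outu2.
    """
    return [
        "bbsplit.sh",
        f"in1={reads1}",
        f"in2={reads2}",
        "ref=" + ",".join(references),
        f"basename={basename}",
        f"outu1={clean1}",
        f"outu2={clean2}",
    ]


# Placeholder file names, assumed for illustration.
cmd = bbsplit_command(
    "R1.fastq.gz", "R2.fastq.gz",
    ["related_fungus.fa", "suspect_bacterium.fa"],
    "unassigned_R1.fq", "unassigned_R2.fq",
)
print(shlex.join(cmd))
```

With these inputs, the reads of interest would be the ones binned to the related fungal reference. The unassigned reads may still contain fungal sequence that is absent from the related genome, so they are worth inspecting rather than discarding outright.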

2.1 years ago by
h.mon31k
Brazil
h.mon31k wrote:

Did you try assembling and then removing the contaminant contigs? BlobTools is a good tool for this.
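
To illustrate the idea of filtering after assembly, here is a minimal sketch that drops contigs assigned to an unwanted taxon. It does not call BlobTools; it assumes you already have a contig-to-taxon mapping (for example, exported from a taxonomy-annotated table), and the contig names, taxa, and function names are invented for the example.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_contaminants(records, taxon_by_contig, unwanted=frozenset({"Bacteria"})):
    """Keep contigs whose assigned taxon is not in `unwanted`.

    Contigs without any assignment are kept, on the assumption that
    "no hit" is not evidence of contamination.
    """
    kept = []
    for header, seq in records:
        contig_id = header.split()[0]
        if taxon_by_contig.get(contig_id, "no-hit") not in unwanted:
            kept.append((header, seq))
    return kept


# Toy assembly and toy taxon assignments (both made up).
fasta = """>scf1 len=8
ACGTACGT
>scf2 len=4
GGCC
>scf3 len=4
TTAA
""".splitlines()
assignments = {"scf1": "Eukaryota", "scf2": "Bacteria"}

clean = drop_contaminants(read_fasta(fasta), assignments)
print([header.split()[0] for header, _ in clean])  # ['scf1', 'scf3']
```

In practice the decision is usually made from several signals together (taxonomic hit, coverage, GC content) rather than from the taxon label alone, which is what the BlobTools plots are meant to help with.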
