Question: What are the methods and resources for filtering contigs after de novo genome assembly?
3.4 years ago by
faizansaleem1992 wrote:

Hello,

I have a plant chloroplast genome that I assembled de novo using Velvet. Now I want to filter the contigs.

Please suggest which methods are best for this purpose and which tools I could use that are not too complicated.

Thank you.

Best Regards, Faizan Saleem

assembly genome • 1.1k views
3.4 years ago by
k.kathirvel93 (India) wrote:

Running BLASTn against nt is very helpful. Download the nt database from NCBI and BLAST your contig FASTA file against it. Here is the command:

blastn -query /home/assembly.fa -db nt -max_target_seqs 1 -outfmt '6 qseqid pident evalue staxids sscinames scomnames sskingdoms stitle' -out outblast.txt

All the best.
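As a rough illustration of what to do with outblast.txt afterwards, here is a minimal Python sketch that reads the tab-separated table and marks each contig by whether its first listed hit looks like a plastid sequence. The function name `classify_hits`, the keyword list, and the sample rows are assumptions made up for this example, not part of BLAST; a keyword match on the hit title is only a crude first pass, and the flagged contigs should still be inspected by hand.

```python
import io

# Column order matches the -outfmt string used in the blastn command.
COLUMNS = ["qseqid", "pident", "evalue", "staxids", "sscinames",
           "scomnames", "sskingdoms", "stitle"]

# Assumption: titles of organellar hits mention one of these words.
PLASTID_KEYWORDS = ("chloroplast", "plastid")


def classify_hits(handle):
    """Return {contig_id: bool}, True if the first hit looks plastid-derived."""
    verdicts = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        hit = dict(zip(COLUMNS, line.split("\t")))
        contig = hit["qseqid"]
        if contig in verdicts:
            continue  # only the first hit listed for each contig is used
        title = hit.get("stitle", "").lower()
        verdicts[contig] = any(word in title for word in PLASTID_KEYWORDS)
    return verdicts


if __name__ == "__main__":
    # Made-up rows, for illustration only; in practice use open("outblast.txt").
    sample = (
        "contig_1\t99.5\t0.0\t3702\tArabidopsis thaliana\tthale cress\t"
        "Eukaryota\tArabidopsis thaliana chloroplast, complete genome\n"
        "contig_2\t98.0\t1e-50\t562\tEscherichia coli\tE. coli\t"
        "Bacteria\tEscherichia coli strain K-12 chromosome\n"
    )
    for contig, is_plastid in classify_hits(io.StringIO(sample)).items():
        print(contig, "keep" if is_plastid else "inspect")
```

Contigs with no BLAST hit at all do not appear in the table, so they need to be checked separately against the list of contig names in the assembly.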


Thanks a lot. Someone told me that I have to filter my contigs according to a DNA threshold value or something like that, and I don't understand how to do it. Can you help me, please?

Reply written 3.4 years ago by faizansaleem1992
3.4 years ago by
Brian Bushnell (Walnut Creek, USA) wrote:

Typically, I'd suggest BLASTing them against nt or similar, and removing the ones that hit suspicious things. RefSeq also has a plastid dataset; you could align your contigs to that also.

Furthermore, the chloroplast contigs should have similar coverage. Once you know the chloroplast coverage, you can usually just throw away contigs with very different coverage. For example, if the chloroplast is 500x on average, and you get some contigs with 100x coverage, those are probably something else (like the plant main genome). You can determine coverage by mapping the reads to the assembly (e.g. with BBMap: bbmap.sh in=reads.fq ref=assembly.fa covstats=covstats.txt) or usually by looking at the contig names (Velvet writes a k-mer coverage value into each name, e.g. NODE_1_length_..._cov_...), though that's less accurate.
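The coverage filter described above could be sketched in Python roughly as follows. This assumes that covstats.txt is tab-separated with a header line starting with "#" and that its first three columns are the contig ID, average fold coverage, and length; check the header of your own file before relying on that. The function names, the `tolerance` parameter, and its default of 0.5 are hypothetical choices for this example. The target coverage is something you supply after looking at the data, not something the script decides for you.

```python
import io


def read_covstats(handle):
    """Parse covstats text into {contig_id: (avg_fold, length)}.

    Assumes columns 1-3 are ID, average fold coverage, and length.
    """
    coverage = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip the header and blank lines
        fields = line.rstrip("\n").split("\t")
        coverage[fields[0]] = (float(fields[1]), int(fields[2]))
    return coverage


def split_by_coverage(coverage, target, tolerance=0.5):
    """Split contigs into (keep, drop) by distance from the target coverage.

    A contig is kept if its coverage lies within target * (1 +/- tolerance).
    The tolerance is an arbitrary starting point, not a recommended value.
    """
    low, high = target * (1 - tolerance), target * (1 + tolerance)
    keep, drop = [], []
    for contig, (fold, _length) in coverage.items():
        (keep if low <= fold <= high else drop).append(contig)
    return keep, drop


if __name__ == "__main__":
    # Made-up numbers mirroring the 500x vs 100x example in the answer.
    sample = (
        "#ID\tAvg_fold\tLength\n"
        "contig_1\t500.0\t80000\n"
        "contig_2\t480.0\t25000\n"
        "contig_3\t100.0\t3000\n"
    )
    keep, drop = split_by_coverage(read_covstats(io.StringIO(sample)), target=500)
    print("keep:", keep)
    print("drop:", drop)
```

One thing to watch for: the chloroplast inverted repeat can collapse into a single contig at roughly twice the single-copy coverage, so a contig near 2x the target is not necessarily contamination.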


Thanks a lot. Someone told me that I have to filter my contigs according to a DNA threshold value or something like that, and I don't understand how to do it. Can you help me, please?

Reply written 3.4 years ago by faizansaleem1992

There is no recipe for decontamination. You need to use your judgement.

Reply written 3.4 years ago by Brian Bushnell