N50 value of bacterial assembly is not half of total assembly size
6 months ago
analyst ▴ 30

Dear scientists,

I have performed de novo assembly of bacterial WGS data using SPAdes and ABySS. However, I am not sure which assembly to go with, because the N50 value is not half (or more than half) of the total assembly length.

All reads passed QC (no adapter or primer sequences were found, and base quality is above 30).

Please also suggest how many contigs I should keep for further analysis. The reference genome is 5,240,075 bp long.

I am attaching the assembly reports here; please share your suggestions and guidance.

Assembly with abyss [image: assembly with abyss]

Assembly with spades [image: spades without reference]

Assembly with spades (using --trusted-contigs flag where reference genome was used to guide assembly) [image: spades with --trusted-contigs]

Assembly with unicycler [image: assembly with unicycler]


Assembly with spades (using --trusted-contigs flag where reference genome was used to guide assembly)

It looks like you pasted the stats for the plain SPAdes assembly twice, or the --trusted-contigs stats twice, since the last two sets of stats are identical.


Thanks for pointing that out, GenoMax. I mistakenly pasted the stats of the SPAdes assembly (with the --trusted-contigs option) twice. I have edited the post.


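N50 is not half of the assembly size. Sort the contigs from longest to shortest and add up their lengths in that order: N50 is the length of the contig that takes the running total past 50% of the total assembly length.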

http://jermdemo.blogspot.com/2008/11/calculating-n50-from-velvet-output.html
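To make the definition concrete, here is a minimal Python sketch of the calculation. The function name `n50` and the toy contig lengths are made up for illustration; they are not taken from the assembly reports in this thread, and assemblers or QC tools may differ slightly in how they handle the exact 50% boundary.

```python
def n50(contig_lengths):
    """Return the N50 of an iterable of contig lengths (in bp)."""
    # Sort contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2

    # Walk down the sorted list until the running sum reaches
    # half of the total assembly length.
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("cannot compute N50 without contigs")


# Toy example (illustrative lengths only): total = 300, half = 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 80 + 70 = 150 reaches the halfway mark, so N50 = 70
```

In the toy example the N50 (70) is far smaller than half of the total length (150), which is the normal situation for any assembly with more than one contig.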


Thank you so much, Jeremy, for sharing the useful link.

6 months ago
Mensur Dlakic ★ 27k

Your SPAdes assembly looks fine. N50 is not supposed to be half a genome. I would use >5000 bp as the contig length cutoff; the total length of the contigs you keep should then be similar to your reference genome. This can be confirmed by mapping. You can lower the threshold to maybe 2000 bp, but the result will hardly differ from the 5 kb cutoff.
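A rough way to check this before mapping is to apply the cutoff and compare the retained length with the reference length. The sketch below is plain Python with no external libraries; the helper names `read_fasta_lengths` and `summarize_cutoff` and the tiny in-memory FASTA are hypothetical, and only the reference length (5,240,075 bp) comes from the question.

```python
REFERENCE_LENGTH = 5_240_075  # bp, as stated in the question


def read_fasta_lengths(lines):
    """Yield (contig_name, length) for each record in FASTA-formatted lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            header = line[1:].split()
            name, length = (header[0] if header else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def summarize_cutoff(lengths, cutoff, reference_length=REFERENCE_LENGTH):
    """Summarize what remains after keeping only contigs longer than cutoff."""
    kept = [n for n in lengths if n > cutoff]
    total = sum(kept)
    return {
        "contigs_kept": len(kept),
        "total_bp": total,
        "fraction_of_reference": total / reference_length,
    }


# Toy demonstration with a made-up FASTA and a tiny cutoff.
toy_fasta = [">c1 example", "ACGT", "ACG", ">c2", "AC"]
toy_lengths = [length for _, length in read_fasta_lengths(toy_fasta)]
print(summarize_cutoff(toy_lengths, cutoff=5, reference_length=10))
```

For a real assembly, the same functions could be run on an open file handle for the contigs FASTA (the path depends on your assembler's output directory) with `cutoff=5000` and `cutoff=2000`. If `fraction_of_reference` is close to 1 at both cutoffs, the choice between them matters little. Mapping is still needed to confirm that the retained contigs actually cover the reference.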


Thank you, @Mensur Dlakic, for your valuable suggestion!

Please have another look at the stats: I had mistakenly pasted the stats of the SPAdes assembly with the --trusted-contigs flag twice. Adding the --trusted-contigs parameter changed both the total number of contigs and the N50 value.

Which SPAdes approach should I follow?


Not sure why you need me to answer this question, as the --trusted-contigs assembly is clearly the best. It has the fewest contigs, the largest N50, and the largest contig overall.


Thank you for your answer. Best,
