N50 for de novo assembly result is too short
3.2 years ago
nisrinalulu ▴ 10

Dear all,

I have sequences from a soil plantation sample. I have already assembled them with MEGAHIT, but the resulting N50 is too short, only around 500 bp. I have read in some papers that N50 is not the only parameter for judging whether an assembly is good, but I want to try my best to get a good N50 for my data. I have also tried other MEGAHIT options for the assembly, such as --kmin-1pass, --presets meta-large and --min-count 1, but the results still do not seem very good. Do you have any advice on how to run MEGAHIT to get a good assembly? Or can you recommend another assembler for de novo assembly of shotgun metagenomes from soil plantation samples?

Thank you so much for your help.
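
For readers unsure what the N50 figure above measures: it is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal sketch in Python, assuming the contig lengths have already been extracted from the assembly FASTA; the function name `n50` and the example lengths are made up for illustration and are not MEGAHIT output.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    # Walk through contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to at least
        # half of the total assembly size defines the N50.
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths, for illustration only.
example_lengths = [1200, 800, 500, 400, 300, 200]
print(n50(example_lengths))  # 800
```

Because the value depends on the whole length distribution, an assembly made up mostly of very short contigs will have a low N50 even if it contains a few long ones.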

Have you performed any QC on your data before assembly?

Yes, I have performed QC on my data using FastQC, and the results are good. This is one of my QC results. Almost all of my sequences give the same result as the one below.

https://photos.app.goo.gl/7Dduc4diXh2EGDoe7

How deep did you sequence? The soil microbiome is probably the most complicated environment. If it's anything less than two runs of HiSeq + mate-pair or Hi-C, the N50 you got is as good as you can get.

I ran my samples on an Illumina HiSeq with 150 bp paired-end reads. Besides the sequencing platform, how can I find out how deep my sequencing is?

How many reads did you get? How long?

These are the total numbers of sequences that I get from QC (F = forward, R = reverse):

1. F: 41,452,346 ; R: 41,452,346
2. F: 34,140,203 ; R: 34,140,203
3. F: 48,657,900 ; R: 48,657,900
4. F: 46,637,257 ; R: 46,637,257
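
As a rough way to put a number on "how deep", the read counts above can be converted into total sequenced bases. The sketch below assumes every read is still the full 150 bp after QC (trimmed reads would give a lower figure) and uses hypothetical sample labels. It reports yield in bases only; true coverage would also require the total genome size of the community, which is not known for a soil sample.

```python
READ_LENGTH = 150  # assumption: all reads are full-length 150 bp

# Read pairs per sample, taken from the counts listed above.
# The sample labels are hypothetical.
read_pairs = {
    "sample 1": 41_452_346,
    "sample 2": 34_140_203,
    "sample 3": 48_657_900,
    "sample 4": 46_637_257,
}


def total_bases(pairs, read_length=READ_LENGTH):
    """Bases sequenced for one sample: forward plus reverse reads."""
    return pairs * 2 * read_length


per_sample = {name: total_bases(pairs) for name, pairs in read_pairs.items()}
grand_total = sum(per_sample.values())

for name, bases in per_sample.items():
    print(f"{name}: {bases / 1e9:.1f} Gbp")
print(f"all samples: {grand_total / 1e9:.1f} Gbp")
```

Under these assumptions each sample comes to roughly 10 to 15 Gbp, and the four samples together to about 51 Gbp.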

OK, you need billions of reads in order to start getting something useful. You can only use this data for reference-based analysis.

Thank you so much for your reply. I'll look into reference-based analysis. Sorry, could you recommend a paper on how to perform reference-based analysis?

Sorry, if I use reference-based analysis, can I get functional annotation from the analysis?

Yes, you sure can. You can start by using MG-RAST; it will do the magic for you.

Good luck with your research, I hope you'll get useful results.

Thank you so much, Asaf.

Hello, I was wondering if you managed to improve your assembly somehow. I am also dealing with soil metagenomes and a small N50.