Relation of the number of contigs to the real DNA abundance before sequencing
2.7 years ago
P • 0

Hello everyone!

I have a problem with my work on contigs. I have searched the literature but have not found anything useful yet.

First, total DNA was isolated from a soil sample, then mechanically fragmented (sonication) and sequenced (Illumina). The raw reads were then filtered with fastp and assembled de novo with MEGAHIT.

The question is: what is the relationship between the number of contigs and the number of DNA sequences present in the sample before sequencing? I assume that the number of contigs does not equal the number of original sequences (e.g. because of DNA damaged during fragmentation, assembly problems, sequence similarity, etc.). Is it even possible to estimate this? I am concerned that I am losing important information about the relationships between bacterial taxa.

I have also run taxonomic classification (Kraken2) on both the raw reads and the contigs, and the results are very different!

Thank you in advance for your help.

contig • 668 views

relationship between the number of contigs and the number of sequences before sequencing?

Not sure what you mean by relationship, but if you are asking whether one can predict the number of contigs one would get, then the answer is no. Assembly depends on the quality and complexity of the libraries. For a metagenome sample there is no way to know the ground truth.
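
To illustrate why contig counts do not track abundance, here is a minimal sketch using the classical Lander-Waterman model, which is an assumption brought in for illustration and not something MEGAHIT computes. It treats a single genome of known length with uniformly random reads, which a soil metagenome does not satisfy, so read it only as a toy. The genome length, read length, and minimum overlap below are hypothetical values.

```python
import math


def expected_contigs(num_reads, read_len, genome_len, min_overlap):
    """Expected number of contigs under the idealized Lander-Waterman model.

    Assumes one genome of known length and uniformly random read starts.
    Neither assumption holds for a real metagenome.
    """
    coverage = num_reads * read_len / genome_len   # mean depth c = N * L / G
    sigma = 1.0 - min_overlap / read_len           # usable fraction of each read
    return num_reads * math.exp(-coverage * sigma)


# Hypothetical values chosen only for illustration.
GENOME_LEN = 5_000_000   # one bacterial-sized genome, in bp
READ_LEN = 150           # read length, in bp
MIN_OVERLAP = 30         # minimum overlap needed to join two reads, in bp

for coverage in (0.5, 1, 2, 5, 10, 20):
    num_reads = int(coverage * GENOME_LEN / READ_LEN)
    contigs = expected_contigs(num_reads, READ_LEN, GENOME_LEN, MIN_OVERLAP)
    print(f"coverage {coverage:>4}x  reads {num_reads:>7}  expected contigs {contigs:10.1f}")
```

Under this model the expected contig count first rises and then falls as coverage (a stand-in for abundance) increases, so the same genome can yield many contigs or few depending on depth. In a real sample, strain variation, repeats, and uneven coverage distort this further.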
