Hi,
I am new to sequencing and am trying to learn the basics for use on a MiSeq sequencer.
I understand that the read length is the number of bases sequenced consecutively, but I was hoping that someone could please explain what the number of reads is, and what the depth of sequencing means?
Pierre's link covered the depth question. Regarding the number of reads, the sequencing reaction occurs at many places on a "flowcell", each possibly with a different underlying DNA fragment. The number of these reactions is (approximately) the number of reads, since a camera images them and the sequence is determined for each one individually (technically it's looking at clusters, but that's beside the point). The image below may help. It's a rendering of a flowcell with attached adapters (pink and purple dots). Attached to some of those are different fragments of DNA. Those would be used to make clusters, which would then be sequenced in parallel (the process being monitored by a camera to determine the sequence).
The camera would then see something like this (thank you Google image search!), where the colors would be different bases in different clusters:
I should note that the image and part of my reply use terminology appropriate for Illumina-generated data, but other methods (nanopores, etc.) will follow the same general principles.
Thank you for your replies.
Am I now right in thinking that read depth = (total number of bases generated) / (size of genome sequenced), i.e. how many times the whole genome has been covered, but that coverage per base is worked out by the Lander/Waterman equation C = LN/G, where C is coverage (how many times a nucleotide has been sequenced in a run), L is the read length, N is the number of reads and G is the genome size?
So on the MiSeq the read length is 2x300 and the number of reads is 25M, so if you had a genome that is 2.5 Gb, would that equate to 6x for every base?
Thanks for your help :)
Depth and coverage are the same (the full term is "depth of coverage"). LN is the same as the total number of bases generated. Keep in mind that reads in a pair will often overlap, so these simple equations may overestimate the effective coverage (not to mention the effects of trimming or soft-clipping).
Edit: Yes, your example would have ~6x coverage, though likely less in practice.
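For anyone who wants to check the arithmetic, here is a minimal Python sketch of C = LN/G. It assumes that "25M" counts read pairs (clusters), so a 2x300 run yields 50M reads of 300 bases each. The function name `expected_coverage` is just made up for illustration, and the result is a theoretical upper bound that ignores pair overlap, trimming and soft-clipping.

```python
def expected_coverage(read_length, n_reads, genome_size):
    """Lander/Waterman estimate: C = L * N / G."""
    return read_length * n_reads / genome_size

# Assumption: 25M is the number of read PAIRS, so 2x300 means two
# 300-base reads per pair.
read_length = 300            # L, bases per read
n_pairs = 25_000_000         # clusters / read pairs
genome_size = 2_500_000_000  # G, bases (2.5 Gb)

coverage = expected_coverage(read_length, 2 * n_pairs, genome_size)
print(f"{coverage:.1f}x")    # prints "6.0x"
```

If 25M instead meant individual reads rather than pairs, the same formula would give 3x, so it is worth checking which number your run report gives you.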
Duplicate of "What is the sequencing 'depth'?"