I have only read about DNA sequencing and have never seen the actual results from a sequencing project. I'm wondering how heterozygotes and somatic mutations show up in sequencing results. This is my understanding of a sequencing project:
1) Extract DNA, typically from blood cells.
2) Make a clone library. There is a formula that works out how many clones you need to make sure all of the DNA of a heterozygous individual is represented in the library (by "all of the DNA" I mean both copies of each chromosome).
3) Sequence the clones.
The sequencing project has an overall coverage. Genome-wide, this means that, on average, each base has been sequenced a certain number of times (10X, 20X, ...). For a specific nucleotide, it is the number of reads that cover that nucleotide.
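For step 2, I believe the formula I have in mind is the Clarke-Carbon estimate, N = ln(1 - P) / ln(1 - f), where P is the desired probability that a given sequence is in the library and f is the fraction of the genome carried by one clone. Here is a rough sketch of how I understand it. The function name and the example numbers are my own, chosen only for illustration.

```python
import math

def clones_needed(insert_size_bp, genome_size_bp, probability):
    """Clarke-Carbon estimate: N = ln(1 - P) / ln(1 - f),
    where f = insert size / genome size."""
    f = insert_size_bp / genome_size_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - f))

# Hypothetical numbers: 40 kb inserts, a 3 Gb genome, and a 99% chance
# that any given sequence is present in at least one clone.
print(clones_needed(40_000, 3_000_000_000, 0.99))
```

As far as I can tell, this treats the genome as a single copy, so it says nothing on its own about whether both chromosome copies are sampled equally.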
If the individual is heterozygous at a locus, you will see two alleles at that position, and you would expect each allele in approximately 50% of the sequencing reads. However, is it correct that nothing stops the clone library from over-representing one chromosome, so that you do not get a 50:50 distribution of the two alleles?
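To put a number on the 50:50 expectation, here is a small sketch of how much the allele counts could wander by chance alone. It assumes every read is an independent draw with a 50% chance of coming from either chromosome, which is exactly the assumption I am unsure about. The function names and the 20X depth are mine, for illustration only.

```python
from math import comb

def allele_count_pmf(k, depth, p=0.5):
    """Binomial probability of seeing one allele in exactly k of depth reads."""
    return comb(depth, k) * p**k * (1 - p)**(depth - k)

def prob_skewed(depth, max_minor_reads):
    """Probability that the rarer allele shows up in at most
    max_minor_reads reads at a truly heterozygous site."""
    return sum(
        allele_count_pmf(k, depth)
        for k in range(depth + 1)
        if min(k, depth - k) <= max_minor_reads
    )

# Illustrative: at 20X, how often is one allele in 5 or fewer reads (25% or less)?
print(f"{prob_skewed(20, 5):.4f}")
```

If the clone library itself over-represents one chromosome, I would expect the real spread to be wider than this simple model suggests.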
Now consider somatic mutations. It is possible that one of your blood cells has a spontaneous mutation at a particular locus, and that the DNA fragment from this blood cell ends up in the clone library. I imagine this is very rare, but is it possible? How would it show up in the sequencing results? Let's say a locus has 25X coverage and only one of those reads shows a different allele from the others because of the somatic mutation. Would it be classed as a sequencing error, or would you class the locus as heterozygous? If that locus were already heterozygous, you could in theory see three alleles there, I presume?
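To make the 25X example concrete, this is how I would naively compare the two explanations for a single odd read. It is only a sketch: the 1% per-base error rate is an assumed value, and the model ignores base quality and treats every error as producing the same alternative allele.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

depth, alt_reads = 25, 1
error_rate = 0.01  # assumed per-base error rate, purely illustrative

# Chance of exactly 1 alternative read out of 25 under each explanation.
p_if_het = binom_pmf(alt_reads, depth, 0.5)           # true heterozygous site
p_if_error = binom_pmf(alt_reads, depth, error_rate)  # homozygous site plus errors

print(f"heterozygous: {p_if_het:.2e}")
print(f"sequencing error: {p_if_error:.2e}")
```

Under these assumptions a single odd read looks far more like an error than a heterozygous site, which is why I wonder whether a real somatic mutation at this frequency could be told apart from an error at all.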
Thanks a lot.