I have a set of samples of ancient DNA, highly degraded and with a lot of bacterial contamination.
We did shallow sequencing just to see if there is any human DNA in it.
Now we want to select the best samples to go with deep sequencing and probably whole genome enrichment.
The question is, by looking at my alignment results:
given how many sequences aligned (probably human) and to how many unique positions in the genome, can I come up with a rough estimate of how deep I should sequence a sample to get at least one read from each human molecule present in the library?
And the reason for doing this: if there are only this many human template molecules, there is no point going really deep with sequencing, since I would only add coverage to sites I have already sequenced.
So how deep to sequence to find all there is at least once?
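Under the (very crude) simplifying assumption that all template molecules are equally abundant, this reduces to a coupon-collector style calculation: sampling N reads from a library of C unique molecules is expected to hit C·(1 − e^(−N/C)) distinct molecules. A minimal sketch of that back-of-the-envelope estimate (the 50,000-molecule figure below is made up for illustration):

```python
import math

def expected_unique(n_reads, n_molecules):
    """Expected number of distinct molecules seen after drawing n_reads
    uniformly at random from a library of n_molecules equally abundant
    templates. Real aDNA libraries are far from uniform, so treat this
    as an optimistic lower bound on the sequencing needed."""
    return n_molecules * (1 - math.exp(-n_reads / n_molecules))

def reads_for_fraction(frac, n_molecules):
    """Invert the formula above: reads needed so that, in expectation,
    a fraction `frac` of the molecules has been seen at least once."""
    return -n_molecules * math.log(1 - frac)

# e.g. with a hypothetical 50,000 unique on-target templates,
# covering 95% of them takes roughly 3x that many on-target reads
print(round(reads_for_fraction(0.95, 50_000)))
```

Seeing literally every molecule at least once is much more expensive than seeing 95% of them, because the expected cost of the last few molecules grows like C·ln C; in practice aiming for a high fraction is the more useful target.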
And this is what I was thinking of till now:
I have my alignments, so I can count the unique alignment positions. I could randomly subsample the reads a few times at the same depth (the same number of randomly selected reads), map each subsample, and see how many distinct mapping positions each one yields.
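The subsampling idea above is essentially building a rarefaction (saturation) curve: unique positions found as a function of reads sampled. Here is a toy simulation of that procedure; on real data you would subsample the BAM with something like `samtools view -s` and count unique positions instead (the 2,000-molecule toy library is an assumption for illustration):

```python
import random

def rarefaction(positions, depths, n_rep=5, seed=0):
    """Subsample the per-read mapping positions (without replacement)
    at several depths, count distinct positions, and average over
    n_rep replicates. `positions` stands in for the mapped reads
    you would pull out of the real BAM."""
    rng = random.Random(seed)
    curve = []
    for d in depths:
        uniq = [len(set(rng.sample(positions, d))) for _ in range(n_rep)]
        curve.append((d, sum(uniq) / n_rep))
    return curve

# toy library: 2,000 template molecules sequenced to ~5x on average,
# so the curve should flatten well before the full read count
rng = random.Random(1)
reads = [rng.randrange(2_000) for _ in range(10_000)]
for depth, mean_uniq in rarefaction(reads, [1_000, 2_500, 5_000, 10_000]):
    print(depth, mean_uniq)
```

If the curve is still rising steeply at your current depth, deeper sequencing should keep finding new molecules; if it has plateaued, extra reads will mostly re-sequence what you already have, which is exactly the signal you want for triaging samples.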
Does it make any sense to look for an answer this way? Any suggestions and different approaches are warmly welcome!