Question: how to calculate genome coverage for integrated genomes?
3.1 years ago, marongiu.luigi (Germany, Mannheim, UMM) wrote:

Hello,

I am having trouble calculating the number of reads I should expect for a target sequence (for instance, a transposon) integrated into the human genome. That is: how many reads should I expect to map to my target sequence and confirm its presence?

Assuming 1) a pre-calculated coverage of 20x, 2) a target region of 1000 bp, and 3) a fixed read length of 150 bp, and using the formula C = NL/G (C: coverage, N: number of reads, L: read length, G: length of the region or genome), I get:

N = CG/L = 20 x 1000 / 150 ≈ 133 reads

This looks like too many reads. Or should I calculate using the whole human genome, since the target is integrated into it? In that case, I get:

N = 20 x 3,000,000,000 / 150 = 400,000,000 reads

That is clearly wrong.

My question is, therefore: how do I calculate the coverage in general and for integrated sequences in particular? Thank you
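
The two calculations above can be reproduced with a minimal sketch. The function name `reads_for_coverage` is made up for illustration, and the numbers are the ones from the question (20x, 150 bp reads, a 1000 bp target, a 3 Gbp genome). It also shows that the two results are not in conflict under a uniform-coverage assumption: 400,000,000 is the total number of reads needed to cover the whole genome at 20x, and the share of those reads expected to fall on the 1000 bp target is again about 133.

```python
def reads_for_coverage(coverage, region_bp, read_len_bp):
    """Solve C = N * L / G for N (hypothetical helper, not from any library)."""
    return coverage * region_bp / read_len_bp

COVERAGE = 20               # 20x, as stated in the question
READ_LEN = 150              # bp
TARGET_BP = 1_000           # transposon length
GENOME_BP = 3_000_000_000   # approximate human genome size used in the question

# Reads needed to cover only the target at 20x
target_reads = reads_for_coverage(COVERAGE, TARGET_BP, READ_LEN)

# Reads needed to cover the whole genome at 20x
genome_reads = reads_for_coverage(COVERAGE, GENOME_BP, READ_LEN)

# Assuming uniform coverage, the target receives a share of the genome-wide
# reads proportional to its length
expected_on_target = genome_reads * TARGET_BP / GENOME_BP

print(round(target_reads), round(genome_reads), round(expected_on_target))
```

Under these assumptions the first and third values agree, because the genome size cancels out of the calculation.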

sequencing assembly genome • 1.2k views

But where do you get this pre-calculated coverage of 20 from?

written 3.1 years ago by WouterDeCoster

Probably shooting for a 20x coverage.

written 3.1 years ago by GenoMax

The data were provided at this coverage, but calculated against the human genome. I would like to estimate how many reads I should expect for the transposon.

written 3.1 years ago by marongiu.luigi

Is the sequence for that transposon so specific that you don't expect to get any alignments outside that 1kb?

written 3.1 years ago by GenoMax

Well, the sequence is not human as such, but otherwise there is nothing special about the target; reads should cover the human genome and the transposon at roughly the same average depth. So should I expect 20x coverage for the transposon as well?

written 3.1 years ago by marongiu.luigi

If you are sure the transposon is only in that one location (seems a bit implausible), then that may be a reasonable assumption, as long as there is no strange bias in the transposon sequence compared to the human genome.

written 3.1 years ago by GenoMax

OK. Otherwise, is the formula correct? Should I expect 133 reads covering the transposon?

written 3.1 years ago by marongiu.luigi
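
As a rough check on the figure of 133, here is a small sketch that counts reads by how much of each read overlaps the target. It rests on assumptions not stated in the thread: read start positions are uniformly distributed, the transposon is a single-copy insertion present in every copy of the sequenced genome, and the target is at least as long as a read. The function name and the `min_overlap` parameter are made up for illustration. With C = NL/G, reads start at an average rate of C/L per base, and a read overlaps a target of length T by at least m bases for T + L - 2m + 1 start positions.

```python
def expected_overlapping_reads(coverage, target_bp, read_len_bp, min_overlap=1):
    """Expected number of reads sharing at least `min_overlap` bases with the target.

    Hypothetical helper; assumes uniformly distributed read starts and
    target_bp >= read_len_bp >= min_overlap >= 1.
    """
    if not (1 <= min_overlap <= read_len_bp <= target_bp):
        raise ValueError("requires 1 <= min_overlap <= read_len_bp <= target_bp")
    starts_per_base = coverage / read_len_bp   # N / G from C = N * L / G
    valid_starts = target_bp + read_len_bp - 2 * min_overlap + 1
    return starts_per_base * valid_starts

# Values from the question: 20x coverage, 1000 bp target, 150 bp reads
any_overlap = expected_overlapping_reads(20, 1_000, 150, min_overlap=1)
fully_inside = expected_overlapping_reads(20, 1_000, 150, min_overlap=150)

print(f"reads touching the target: {any_overlap:.1f}")
print(f"reads entirely inside the target: {fully_inside:.1f}")
```

Under these assumptions, the 133 from N = CG/L falls between the two values: somewhat fewer reads lie entirely within the transposon, and somewhat more touch it once reads spanning the integration junctions are counted.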