Question: Extract the overlap of well-covered regions across multiple samples
7 months ago, DVA (United States) wrote:

I am looking at somatic mutations across multiple samples, but coverage is uneven between the samples in many regions. Since we want to compare mutation counts between these samples, I need to account for the uneven coverage. For example, suppose sample A has 5X coverage at position 1 while sample B has 20X coverage at the same position. If my workflow filters on coverage and gets rid of <10X mutations, then even if sample A has a mutation at position 1, I would miss it, so the comparison would not be fair.

Now my question is: is there an easy and fast way to extract the regions that are well covered in all samples? These are all WGS data (BAM files of ~50-60 GB each), so I guess I could run bedtools on each of them and then overlap the results? Any other suggestions, please? Thank you.
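
The "per-sample regions, then overlap" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each sample's well-covered regions have already been exported as sorted, non-overlapping BED-style intervals (chrom, start, end; 0-based, half-open) by whichever depth tool is used, and all lists share the same lexicographic chromosome order. The names `intersect` and `shared_regions` are hypothetical helpers, not part of bedtools.

```python
from functools import reduce
from typing import List, Tuple

# (chrom, start, end), 0-based half-open, as in BED
Interval = Tuple[str, int, int]


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted, non-overlapping interval lists.

    Assumes both lists are sorted by (chrom, start) with chromosome
    names compared as plain strings.
    """
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        chrom_a, start_a, end_a = a[i]
        chrom_b, start_b, end_b = b[j]
        if chrom_a != chrom_b:
            # Advance the list that is still on the earlier chromosome.
            if chrom_a < chrom_b:
                i += 1
            else:
                j += 1
            continue
        start, end = max(start_a, start_b), min(end_a, end_b)
        if start < end:
            out.append((chrom_a, start, end))
        # Drop whichever interval ends first; the other may still overlap more.
        if end_a <= end_b:
            i += 1
        else:
            j += 1
    return out


def shared_regions(per_sample: List[List[Interval]]) -> List[Interval]:
    """Regions that are well covered in every sample."""
    return reduce(intersect, per_sample)


# Made-up intervals for three samples, purely for illustration.
sample_a = [("chr1", 0, 100), ("chr1", 200, 300), ("chr2", 0, 50)]
sample_b = [("chr1", 50, 250), ("chr2", 10, 20)]
sample_c = [("chr1", 60, 90), ("chr1", 210, 400), ("chr2", 0, 15)]

for region in shared_regions([sample_a, sample_b, sample_c]):
    print(*region, sep="\t")
```

Restricting the mutation comparison to the regions this returns would give every sample the same callable territory.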

coverage wgs • 268 views

"and get rid of <10X mutations, then even if sample A has mutation in position 1, I would miss it;"

If you're afraid of missing such a mutation, why do you need to extract well-covered regions?

modified 7 months ago • written 7 months ago by Pierre Lindenbaum (116k)

Most likely, you don't need to worry about coverage beforehand. VCF files include depth information, often per sample and/or per allele, so you can filter by coverage after variant calling.

written 7 months ago by h.mon (23k)
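
As a rough illustration of filtering by depth after variant calling, the sketch below keeps only records where every sample reaches a minimum depth. It assumes a multi-sample, tab-delimited VCF whose FORMAT column carries a per-sample `DP` key; whether `DP` is present depends on the caller. The function names and the example records are hypothetical, and the 5X versus 20X case mirrors the example in the question.

```python
from typing import Iterable, Iterator, List, Optional


def sample_depths(vcf_line: str) -> List[Optional[int]]:
    """Return the FORMAT/DP value of each sample in one VCF record.

    A sample gets None when DP is absent, missing ('.'), or dropped
    as a trailing field.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    samples = fields[9:]
    if "DP" not in format_keys:
        return [None] * len(samples)
    dp_index = format_keys.index("DP")
    depths: List[Optional[int]] = []
    for sample in samples:
        values = sample.split(":")
        raw = values[dp_index] if dp_index < len(values) else "."
        depths.append(int(raw) if raw.isdigit() else None)
    return depths


def comparable_sites(lines: Iterable[str], min_depth: int = 10) -> Iterator[str]:
    """Yield records where all samples have depth >= min_depth."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and the column header
        depths = sample_depths(line)
        if depths and all(d is not None and d >= min_depth for d in depths):
            yield line


# Made-up records: sample A has 5X and sample B has 20X at position 1.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tA\tB",
    "chr1\t1\t.\tA\tT\t50\tPASS\t.\tGT:DP\t0/1:5\t0/0:20",
    "chr1\t2\t.\tG\tC\t50\tPASS\t.\tGT:DP\t0/1:12\t0/0:30",
    "chr1\t3\t.\tG\tC\t50\tPASS\t.\tGT:DP\t0/1:12\t./.",
]

for record in comparable_sites(example, min_depth=10):
    print(record)
```

Note that this only sees positions that have a record in the VCF. Sites where no sample has a call do not appear, so it does not by itself tell you how much of the genome was comparable.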