Question: Extract the overlap of well-covered regions across multiple samples
13 months ago by
DVA530
United States
DVA530 wrote:

I am looking at somatic mutations across multiple samples, but coverage is uneven between the samples in many regions. Since we want to compare mutation counts between these samples, I need to account for the uneven coverage. For example, sample A has 5X coverage at position 1, while sample B has 20X coverage at the same position. If my workflow filters on coverage and discards mutations below 10X, then even if sample A has a mutation at position 1, I would miss it, so the comparison would not be fair.

Now my question is: is there an easy/fast way to extract the regions that are well covered in all samples? These are all WGS data (BAM files of ~50-60 GB each), so I guess I could run bedtools on each of them and then intersect the results? Any other suggestions? Thank you.
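To make the "overlap" step concrete, here is a minimal Python sketch. It assumes the well-covered regions of each sample have already been computed by some depth tool and loaded as BED-style half-open `(chrom, start, end)` tuples, sorted with Python's default tuple ordering and non-overlapping within a sample. The function names `intersect_two` and `shared_regions` are hypothetical, chosen only for this illustration.

```python
from functools import reduce


def intersect_two(a, b):
    """Intersect two sorted, non-overlapping lists of (chrom, start, end).

    Assumes both lists are sorted the same way (plain tuple ordering) and
    use half-open coordinates, as in BED files.
    """
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        chrom_a, start_a, end_a = a[i]
        chrom_b, start_b, end_b = b[j]
        if chrom_a != chrom_b:
            # Advance whichever list is still on the earlier chromosome.
            if chrom_a < chrom_b:
                i += 1
            else:
                j += 1
            continue
        start, end = max(start_a, start_b), min(end_a, end_b)
        if start < end:
            out.append((chrom_a, start, end))
        # Drop the interval that ends first; the other may still overlap more.
        if end_a <= end_b:
            i += 1
        else:
            j += 1
    return out


def shared_regions(per_sample_regions):
    """Regions that are well covered in every sample."""
    return reduce(intersect_two, per_sample_regions)


# Toy data standing in for three samples' well-covered regions.
sample_a = [("chr1", 0, 100), ("chr1", 200, 300), ("chr2", 0, 50)]
sample_b = [("chr1", 50, 250), ("chr2", 10, 20)]
sample_c = [("chr1", 60, 220), ("chr2", 0, 100)]
shared = shared_regions([sample_a, sample_b, sample_c])
```

For 50-60 GB WGS BAMs the expensive part is computing the per-sample coverage, not this intersection, so a dedicated interval tool is likely the more practical route for real data.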

coverage wgs • 372 views

and get rid of <10X mutations, then even if sample A has mutation in position 1, I would miss it;

If you're afraid of missing such a mutation, why do you need to extract well-covered regions?

modified 13 months ago • written 13 months ago by Pierre Lindenbaum

Most likely, you don't need to worry about coverage beforehand. VCF files include depth information, and often also per-sample and/or per-allele depth, so you can filter by coverage after variant calling.
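As a rough illustration of filtering after variant calling, the sketch below keeps only sites where every sample meets a minimum depth. It assumes a multi-sample, tab-delimited VCF whose FORMAT column carries a `DP` field; whether `DP` is present depends on the variant caller. The function names and the 10X threshold (taken from the question) are illustrative only.

```python
def sample_depths(vcf_line):
    """Return the per-sample DP values for one VCF data line.

    Missing values (".", absent field, or no DP key) are returned as None.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    samples = fields[9:]
    if "DP" not in format_keys:
        return [None] * len(samples)
    dp_index = format_keys.index("DP")
    depths = []
    for sample in samples:
        values = sample.split(":")
        # Trailing FORMAT fields may be omitted for a sample.
        raw = values[dp_index] if dp_index < len(values) else "."
        depths.append(int(raw) if raw.isdigit() else None)
    return depths


def well_covered_in_all(vcf_line, min_dp=10):
    """True if every sample has a known depth of at least min_dp."""
    depths = sample_depths(vcf_line)
    return bool(depths) and all(d is not None and d >= min_dp for d in depths)


def filter_vcf(lines, min_dp=10):
    """Yield header lines plus records well covered in all samples."""
    for line in lines:
        if line.startswith("#") or well_covered_in_all(line, min_dp):
            yield line
```

Note that this only sees sites that appear in the VCF. If each sample was called separately, a site missing from one sample's file says nothing about that sample's depth there.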

written 13 months ago by h.mon