Question: How to conclude that a decrease in coverage depth is significant
14 days ago by
France, Paris
maxime.policarpo40 wrote:


Here is my question:

I have several files in .bam format resulting from the alignment of reads from different individuals against a reference genome. I know that a particular region contains a large deletion, which can be either homozygous or heterozygous (some individuals have one chromosome with the large deletion and one with the intact region).

Homozygous deleted individuals are easily detected as there are 0 reads in this region.

I would like to detect heterozygous individuals using the mean coverage in this region. The idea is that an individual who is heterozygous for the deletion will show a significant decrease in read coverage in this region compared to the rest of the chromosome.

To do so, I used the samtools depth command. I first computed the coverage over the entire chromosome:

samtools depth -r Chromosome1 Bam_files/Ind6_vs_Genome_sorted.bam | awk '{sum+=$3} END { print "Average = ",sum/NR}'

Average = 12.1426

Then I computed the coverage over the region of interest:

samtools depth -r Chromosome1:5566-60000 Bam_files/Ind6_vs_Genome_sorted.bam | awk '{sum+=$3} END { print "Average = ",sum/NR}'

Average = 5.39289
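
One caveat with the two commands above: by default, samtools depth does not report positions with zero coverage (the -a option makes it report them), so sum/NR averages over covered positions only and can overestimate the mean depth. A minimal Python sketch of the alternative, dividing by the region length instead of the number of reported lines, is shown below. The function name and the toy input are hypothetical and only illustrate the idea.

```python
def mean_depth(depth_lines, start, end):
    """Mean depth over [start, end] (1-based, inclusive).

    depth_lines: lines in `samtools depth` format (chrom, pos, depth).
    Positions missing from the input are counted as depth 0.
    """
    total = 0
    for line in depth_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        pos, depth = int(fields[1]), int(fields[2])
        if start <= pos <= end:
            total += depth
    # Divide by the region length, not by the number of reported lines.
    return total / (end - start + 1)


# Toy input (made up): positions 3 and 4 have no reads and are not reported.
example = ["Chromosome1\t1\t4", "Chromosome1\t2\t6", "Chromosome1\t5\t2"]
print(mean_depth(example, 1, 5))  # 12 / 5 = 2.4, whereas sum/NR would give 4.0
```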

There does seem to be a decrease in coverage in this region, but how can I conclude that it is significant?

I hope this is clear,



ADD COMMENTlink modified 14 days ago by Ahill1.3k • written 14 days ago by maxime.policarpo40

Here are two solutions I propose, but I have no idea whether they are correct:

1 - Compare the two coverages (whole chromosome and region) between an individual known not to carry the deletion (homozygous non-deleted) and the individual under investigation, using a chi-squared test. If they differ significantly, conclude that this individual is heterozygous for the deletion.

2 - If the coverage of the region is between 40% and 60% of the whole-chromosome coverage, conclude that the individual is heterozygous.
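
Both proposals can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the 0.4 and 0.6 bounds are the ones from proposal 2, and the chi-squared test is run on a 2x2 table of read counts (reads inside vs. outside the region, control vs. tested individual) rather than on mean depths. With 1 degree of freedom the p-value can be obtained from the complementary error function, so no external library is needed.

```python
import math


def looks_heterozygous(region_mean, chrom_mean, low=0.4, high=0.6):
    """Proposal 2: is the region/chromosome depth ratio within [low, high]?"""
    ratio = region_mean / chrom_mean
    return low <= ratio <= high, ratio


def chi2_2x2(a, b, c, d):
    """Proposal 1: Pearson chi-squared test on the 2x2 table [[a, b], [c, d]].

    a, b: reads inside / outside the region for the control individual
    c, d: reads inside / outside the region for the tested individual
    Returns (statistic, p_value) with 1 degree of freedom, no continuity correction.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Mean depths reported in the question for Ind6.
flag, ratio = looks_heterozygous(5.39289, 12.1426)
print(flag, round(ratio, 2))  # True 0.44

# Read counts below are invented for illustration only.
stat, p = chi2_2x2(40, 60, 20, 80)
print(stat, p)
```

Note that with the very large read counts typical of sequencing data, a chi-squared test tends to flag even small differences as significant, so the ratio itself is worth reporting alongside any p-value.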

ADD REPLYlink written 14 days ago by maxime.policarpo40

Hi maxime.policarpo ,

I've moved this post to a comment, as it does not actually answer your original question. This way we can keep the forum structured and organised.

ADD REPLYlink written 14 days ago by lieven.sterck3.5k
14 days ago by
United States
Ahill1.3k wrote:

Comparing read depths across individuals might be affected by inter-sample differences in sequencing quality/yield. Tools for identifying structural variations like deletions from read depth are available - take a look at the list of software in Table 1 in this paper or this paper. Using one of those tools to identify copy-number variations may be the easiest and most accurate way to verify the presence of deletions in your region. If that doesn't suit you, you could try to formulate your own statistical model to identify regions with significantly reduced read depth: for example, by doing a randomization test using average read depths from randomly sampled segments of the same size as your target region (for more ideas see e.g. this thread and references therein).
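
The randomization test mentioned above could look roughly like the following Python sketch. It is one possible reading of the idea, not a reference implementation: it assumes a per-position depth list for the whole chromosome that includes zero-depth positions, draws random windows of the same size as the target region that do not overlap it, and reports the fraction of windows whose mean depth is as low as the target's. The function name, the number of windows and the toy data are all hypothetical.

```python
import random
from itertools import accumulate


def randomization_test(depths, start, end, n_windows=1000, seed=0):
    """Empirical p-value for low depth in [start, end] (1-based, inclusive).

    depths: per-position depths for the chromosome (index 0 = position 1).
    Returns (observed_mean, p_value).
    """
    size = end - start + 1
    first, last = start - 1, end - 1  # 0-based bounds of the target region
    if first < size and len(depths) - 1 - last < size:
        raise ValueError("no room for a non-overlapping window of this size")

    # Prefix sums give each window mean in constant time.
    prefix = [0] + list(accumulate(depths))

    def window_mean(s):
        return (prefix[s + size] - prefix[s]) / size

    observed = window_mean(first)
    rng = random.Random(seed)
    max_start = len(depths) - size
    hits = 0
    for _ in range(n_windows):
        s = rng.randint(0, max_start)
        # Redraw windows that overlap the target region.
        while s <= last and s + size - 1 >= first:
            s = rng.randint(0, max_start)
        if window_mean(s) <= observed:
            hits += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (hits + 1) / (n_windows + 1)


# Toy chromosome (made up): depth 12 everywhere except a 100 bp region at depth 6.
toy = [12] * 1000
toy[400:500] = [6] * 100
print(randomization_test(toy, 401, 500))
```

On real data, windows falling in repeats or unmappable regions also show unusual depth, so the background distribution is wider than in this toy case.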

ADD COMMENTlink modified 14 days ago • written 14 days ago by Ahill1.3k

Thanks a lot for your suggestions!


ADD REPLYlink written 14 days ago by maxime.policarpo40

