Question

How do I interpret a 200bp reduction of coverage in a gene exon?

0

Entering edit mode

3.2 years ago

francesca.genova90 • 0

Good morning everyone,

I hope someone can help me with this issue. In my lab, we performed a targeted sequencing with Illumina MiSeq. We sequenced one single gene (whole gene, both introns and exons) in 96 samples, belonging to two different experimental groups. After the alignment and the duplicates removal, I decided to visualize my bam files using the Golden Helix Genome Browser. I noticed a reduction of the coverage for a 200bp region in the only coding exon of my sequenced gene. This reduction occurs in all my samples but it is more evident in one group than the other (statistically determined using read depth values).

How should I interpret these data?

Bam visualization with Golden Helix Genome Browse. Image shows two samples, one for each experimental group

read-depth coverage golden-helix bam • 2.4k views

ADD COMMENT • link updated 16 months ago by Ram 44k • written 3.2 years ago by francesca.genova90 • 0

1

Entering edit mode

structural variance between alleles of the same gene?

ADD REPLY • link 3.2 years ago by lieven.sterck 15k

1

Entering edit mode

If you have a reference set available then you can try ExomeDepth (LINK) package to see if you have a CNV. There are other packages available as well.

ADD REPLY • link 3.2 years ago by GenoMax 144k

0

Entering edit mode

Thanks GenoMax for your answer. To my knowledge, CNV calling tools usually work on whole genome/whole exome sequencing data. I have sequenced only one gene. Therefore, I believe that CNV calling would not work properly.

ADD REPLY • link 3.2 years ago by francesca.genova90 • 0

1

Entering edit mode

How did you normalize the data and assess significance? Could this be a general issue due to GC bias and "sequencability" of that stretch of DNA, did you check public sequencing data, e.g. GTEx or any other human samples whether this reduction is a general phenomenon for that region?

Edit: If this is not human then any other public (e.g. WGS or exome) samples from the same species.

ADD REPLY • link 3.2 years ago by ATpoint 84k

0

Entering edit mode

Thanks ATpoint for your response. I extracted read depth values with "samtools depth" samtools depth manual. I calculated the read depth mean for the background coverage as well as for the 200bp region coverage for each sample. First I checked if the background coverage differed between samples of the two experimental groups (t test) and they were not significantly different. Then, I subtracted the average read depth of the 200bp region to the average background read depth, for each sample (normalization), to check how the depth of loss changed between our conditions. Finally, I performed the t test between the two experimental groups, using these differences, and I got a Pval = 0.03

I though this was the best way to evaluate the loss of coverage that I noticed with the visual inspection on Genome Browse

ADD REPLY • link 3.2 years ago by francesca.genova90 • 0

0

Entering edit mode

using PCR to validate the 200bp deletion if the quality of these reads is not very low.

ADD REPLY • link 3.2 years ago by Zhilong Jia ★ 2.2k

1

Entering edit mode

3.2 years ago

Jorge Amigo 14k

There are several reasons why this could be happening. One of the easiest and fastest ones to check is that maybe that coding exon has an homolog region in the genome, therefore some reads may be mapping with equal probability on both sites (this assigns those reads a mapping quality of "0"). Check if reads with mapping quality = 0 are being filtered (maybe only reads with mapping quality > 1 are being displayed by default) and, if so, just remove that threshold and check if the coverage on that region changes. If it does you'll know the reason why, and if doesn't you may continue thinking about other alternatives.

ADD COMMENT • link 3.2 years ago by Jorge Amigo 14k

0

Entering edit mode

Thanks Jorge Amigo ! I used BLAST to check whether that region has homologous along the genome and I did not find alignments other than the one on my gene.

ADD REPLY • link 3.2 years ago by francesca.genova90 • 0

score 2 · Accepted Answer · 2021-05-07

2

Entering edit mode

3.2 years ago

German.M.Demidov ★ 2.9k

It is a deletion!

(seriously, it is not a reduction of coverage, but a massive drop of 50% at exactly two bases, instead of gradual reduction)

I recommend to load it to IGV and inspect soft-clipped parts of reads and paired-end distances

ADD COMMENT • link 3.2 years ago by German.M.Demidov ★ 2.9k

0

Entering edit mode

Thanks for your reply! Do you have any suggestions about the biological meaning of having a different read depth in the same deletetion between the two conditions? Have you ever handled this kind of data?

ADD REPLY • link 3.2 years ago by francesca.genova90 • 0

0

Entering edit mode

Yup. It can be a polymorphism. You need to check the frequency of this deletion in some large cohort (such as 1000GP). This is how it started: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4405206/ , modern resources have much more details and study larger cohorts.

ADD REPLY • link 3.2 years ago by German.M.Demidov ★ 2.9k