Question: Targeted Sequencing - Calculating The Minimum Exon Coverage
1
7.2 years ago by
Luca Beltrame220
IRCCS Istituto di Ricerche Farmacologiche "Mario Negri", Milano, Italy
Luca Beltrame220 wrote:

I recently received a batch of data from a targeted sequencing (50 genes, human) experiment. The sequencing was targeted on all exons of these 50 genes. I was suggested to calculate the minimum coverage for each exon, in case the coverage for the reads is not uniform.

As I'm still inexperienced, I was wondering how to calculate this. My idea would be, given that I have a BED file with the sequenced regions:

1. Calculate per-base coverage for each BAM file using bedtools coverage and the BED file for the regions;
2. Group the result by exon coordinate, and choose the minimum value for each coordinate.

Is there a better way, or is my approach totally off? Thanks!

exon coverage sequencing • 4.1k views
ADD COMMENTlink
written 7.2 years ago by Luca Beltrame220
2
7.2 years ago by
Sean Davis25k
National Institutes of Health, Bethesda, MD
Sean Davis25k wrote:

Minimum exon coverage, though, is probably not the most actionable number to work with. What you really want to calculate is the number of "callable bases". You can do this per gene, per exon, or per sample. One simple but approximate way to look at this is the percentage of bases covered >X (ie., 82% of target bases covered at 30x) after removing duplicates and low quality bases/reads. Picard CalculateHsMetrics is useful for this kind of thing, but you can also write your own.

ADD COMMENTlink written 7.2 years ago by Sean Davis25k
1
7.2 years ago by
United States
Zev.Kronenberg11k wrote:

It is defiantly smart to take a look at coverage / depth of your exons.

I would suggest checking out BEDTOOLS. It has methods to answer these questions. I.E. Given my bam what is my % coverage....

ADD COMMENTlink written 7.2 years ago by Zev.Kronenberg11k

In fact I'm using bedtools coverage to get the fraction of exons covered. I was wondering if there was anything else I could do.

ADD REPLYlink written 7.2 years ago by Luca Beltrame220
0
7.2 years ago by
Leszek4.0k
IIMCB, Poland
Leszek4.0k wrote:

I came across TEQC - Bioconductor package for dealing with enrichment sequencing.
Easily, you can plot coverage distribution histograms (manual).
Give it a try!

ADD COMMENTlink written 7.2 years ago by Leszek4.0k

Unfortunately, like many Bioconductor packages, it fails with a completely nondescript error when using it, and it looks it doesn't support my aligned genome (the 1000 genomes reference) which uses chromosomes without "chr" prefix.

ADD REPLYlink modified 7.2 years ago • written 7.2 years ago by Luca Beltrame220

I can never understand why people write code where seqid must begin with chr...

ADD REPLYlink written 7.2 years ago by Zev.Kronenberg11k

I fixed it in the end, the BAM handling is broken, it works if I convert the BAM to a BED file.

ADD REPLYlink written 7.2 years ago by Luca Beltrame220

Hi Leszek,

I have used TEQC, but getting errors. I posted the details here: C: reading a bed file in R

Thanks!

ADD REPLYlink written 2.7 years ago by bioinfo8120
Please log in to add an answer.

Content
Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 795 users visited in the last hour