Question: Trying To Identify Low Coverage Regions (Bases) In My Sample From A Bed And Bam File.
pfeifferr2150 wrote, 6.6 years ago:

I have a BED file of my targets (start and stop coordinates of each exon of my genes of interest) and a BAM file generated from my Ion PGM run. I am trying to identify any location in my BED file where I have less than 20x coverage, so I can fill in these regions with traditional Sanger sequencing.

Basically, if any bases within an exon are covered at less than 20x, I want to know which ones. (I have heard 20x is a good threshold for germline variants; agreed?)

Also, if I add a descriptive column to the BED file containing a "gene-exon" label, having that label included in the output would be even more helpful.

I have been reading posts online for days, and I am lost here. Can anybody help, please?

Thanks

Pfeifferr


cross-posted: http://seqanswers.com/forums/showthread.php?t=41880

(comment by Pierre Lindenbaum, 6.6 years ago)
dariober (WCIP, Glasgow, UK) wrote, 6.6 years ago:

Something along these lines, using bedtools, might help, assuming your BAM file is already sorted by position.

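## For speed, sort the file of target regions if not already sorted:
sort -k1,1 -k2,2n -k3,3n targets.bed > targets2.bed

## -bga reports depth for every interval, including zero-coverage ones.
## Column 4 of that output is the depth, which awk filters on below.
genomeCoverageBed -bga -ibam myreads.bam -g genome.fasta \
| intersectBed -a - -b targets2.bed -sorted \
| awk '$4 < 20 {print $0}' > lowcov.bed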

lowcov.bed will be a bed file of intervals with coverage <20x.
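
The question also asks for a "gene-exon" label next to each low-coverage interval. One possible way to get this, as a sketch only, is to add -wb to intersectBed so the matching target line is appended to each interval, then keep the label in the awk step. This assumes targets2.bed has exactly four tab-separated columns (chrom, start, end, name), so the label ends up in column 8; the function name low_cov_named and the output file name are made up for illustration.

```shell
# Keep intervals below a depth threshold and carry over the target label.
# Input (stdin): tab-separated output of `intersectBed -wb`, i.e. the four
# coverage columns (chrom, start, end, depth) followed by the target's own
# columns. ASSUMPTION: the target BED has 4 columns, so the label is $8.
# Optional first argument: minimum acceptable depth (default 20).
low_cov_named() {
    awk -v min="${1:-20}" 'BEGIN { FS = OFS = "\t" }
        $4 < min { print $1, $2, $3, $4, $8 }'
}

# Intended use (needs bedtools; not run here):
#   genomeCoverageBed -bga -ibam myreads.bam -g genome.fasta \
#   | intersectBed -a - -b targets2.bed -sorted -wb \
#   | low_cov_named 20 > lowcov_named.bed
```

Each output line would then hold chrom, start, end, depth and the label of the exon the interval falls in. If the target file carries more columns, the `$8` index needs adjusting accordingly.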
