Rna-Seq : How To Check For Very High Coverage Region
4
2
Entering edit mode
10.8 years ago

Hi,

How can I check for very high coverage region in a bam file. I've several samples with good alignment rate (~90%) but when I count the reads for known genes (ensembl), I don't have good results compared to previous data. I think it's rRNA or other contamination. So how can I check for high coverage region in my data (more than 10000 reads per position) ?

Thanks a lot,

N.

rna-seq coverage • 3.0k views
ADD COMMENT
0
Entering edit mode
ADD REPLY
1
Entering edit mode
10.8 years ago
bruce.moran ▴ 960

Hi,

I was going to suggest 'samtools depth' but that doesn't appear in the manual anymore, might still be functioning though (works on my cluster)... Essentially, depth used to give number of reads aligning to a region. So you would give it a BAM and the region and reference.fa and get your (per base) 'depth', then awk out those higher than 10000.

Bruce.

ADD COMMENT
1
Entering edit mode
10.8 years ago
DG 7.3k

You could use the GATK DepthOfCoverage walker with a provided BED file. Usually these are per exon but any sort of intervals will do. I believe you can get per position coverage information from that tool as well. Otherwise for intervals you can set what sort of percentage/depth cutoffs you want to report.

ADD COMMENT
1
Entering edit mode
10.8 years ago

You can generate a mpileup output with samtools. Then use a script to scan through the file to get the top X bases with the highest read mapping. You can use this top X list to narrow down the regions with highest coverage.

To get counts for each base, you can simply count the number of '.' or ',' in the 4th column of the mpileup output. A '.' (period) stands for read mapping to forward strand and ',' (comma) stands for read mapping to reverse strand.

ADD COMMENT
1
Entering edit mode
10.8 years ago
Ian 6.0k

I would use bedtools coverage:

Summary: Returns the depth and breadth of coverage of features from A on the intervals in B.

bedtools coverage -abam reads.bam -b interesting_intervals.bed > output

other flags can help you get the format you want.

ADD COMMENT

Login before adding your answer.

Traffic: 2807 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6