How To Get The Read Depth?
5
7
Entering edit mode
11.7 years ago
madkitty ▴ 690

My boss said he wanted to get the read depth graph for each CNV we found. The only data I have is a file with the number of reads aligned to each position (let's say from position 1 to 500 on the genome) Let's say we have a CNV deletion identified from position 100 to 150 .. How am I suppose to output the read depth for that CNV ?

genome sequencing read depth-of-coverage • 16k views
ADD COMMENT
0
Entering edit mode

Thanks a lot for your answers guys! All of them are really useful :)

ADD REPLY
8
Entering edit mode
11.7 years ago

You'll probably want to plot the read depth in small bins, so that you can visualize the changes more clearly. I'd recommend grabbing the CNV region, plus some flanks with samtools, then writing a little script that parses the output and returns the average depth of coverage in 100bp bins. Plot these values for the tumor and the normal across the region, and if all goes well, you'll see something like this:

enter image description here

Normal is on top, tumor is on bottom, and you can clearly see the deletion in the tumor.

ADD COMMENT
0
Entering edit mode

that's really nice! Exactly what I need, but for the script can I just use a package that already exists in R ? Are you aware of any ?

ADD REPLY
2
Entering edit mode

For a simple implementation in python, try ngCGH. In R, see the GenomicRanges and Shortread packages and the coverage() method.

ADD REPLY
4
Entering edit mode
11.7 years ago

If you have a sorted and indexed bam file you could do

samtools view yourFile.bam chr1:1000-5000 | wc

this will give you the number of reads mapping over chromosome 1 from position 1000 to 5000. For comparison you should also do the same with the normal and, at least, normalise for total number of reads.

However, how did you figure out that you have a CNV in that region? What program did you use? Can't the program you used tell you?

also consider this is very crude, as the number of reads have noise.

ADD COMMENT
0
Entering edit mode

Indeed, as your CNV calling algorithm probably used read-depth as part of its protocol, you should really be sure you have a good handle on what it is doing as well as what you expect to plot. That read depth is usually on a per nucleotide basis but you can also use the number of reads that overlap with your called interval as a proxy.

ADD REPLY
0
Entering edit mode

Thanks for your answer. I used Breakdancer to find CNVs, Breakdancer already gives me the number of reads aligned to the CNV found, but my boss doesn't believe breakdancer and wants to see a figure similar to the one posted by Chris Miller below...

ADD REPLY
2
Entering edit mode
11.7 years ago
Nikolay Vyahhi ★ 1.3k

AFAIK read depth for CNV is coverage for this region in alignment.

E.g. if you identified that there is no gene/region (deletion), then it's depth = 0.

E.g. if you identified 2 copies of the same region, then depth will be about twice as usual depth (given uniformity of coverage).

ADD COMMENT
1
Entering edit mode
11.7 years ago

You might try:

samtools depth -r chr1:100-150 bamfile.bam

I agree with Stefano, though, that you probably need to understand what your CNV caller did before proceeding.

ADD COMMENT
1
Entering edit mode
11.7 years ago
Ian 6.0k

If you can get a BED file (chr start end) of CNV regions you can use bedtools to get coverage of reads for each CNV region.

bedtools coverage -abam reads.bam -b cnv_regions.bed
ADD COMMENT

Login before adding your answer.

Traffic: 2143 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6