My boss wants a read depth graph for each CNV we found. The only data I have is a file with the number of reads aligned to each position (say, from position 1 to 500 on the genome). Suppose we have a CNV deletion identified from position 100 to 150. How am I supposed to output the read depth for that CNV?
If you have a sorted and indexed BAM file, you could run
samtools view yourFile.bam chr1:1000-5000 | wc -l
This gives you the number of reads overlapping chromosome 1 from position 1000 to 5000. For comparison, you should do the same with the normal sample and, at the very least, normalise by the total number of reads in each sample.
However, how did you figure out that you have a CNV in that region? What program did you use? Can't the program you used tell you?
Also keep in mind that this is very crude, since read counts are noisy.
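To make the normalisation step concrete, here is a minimal Python sketch. It assumes you already have four numbers: the read count in the region and the total mapped read count, for both tumor and normal. The function name and the example counts are made up for illustration, not taken from any real data set.

```python
def normalised_depth_ratio(tumor_region_reads, tumor_total_reads,
                           normal_region_reads, normal_total_reads):
    """Tumor/normal ratio of the fraction of reads falling in a region."""
    if tumor_total_reads <= 0 or normal_total_reads <= 0:
        raise ValueError("total read counts must be positive")
    if normal_region_reads <= 0:
        raise ValueError("normal sample has no reads in the region")
    # Dividing by the library size makes the two samples comparable.
    tumor_fraction = tumor_region_reads / tumor_total_reads
    normal_fraction = normal_region_reads / normal_total_reads
    return tumor_fraction / normal_fraction


# Hypothetical counts, for illustration only.
ratio = normalised_depth_ratio(400, 20_000_000, 1000, 25_000_000)
print(round(ratio, 3))
```

A ratio well below 1 is consistent with a loss in the tumor and a ratio well above 1 with a gain, but with a single region and noisy counts this is only a rough indication.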
You'll probably want to plot the read depth in small bins, so that you can visualize the changes more clearly. I'd recommend grabbing the CNV region plus some flanking sequence with samtools, then writing a little script that parses the output and returns the average depth of coverage in 100 bp bins. Plot these values for the tumor and the normal across the region, and if all goes well, you'll see something like this:
Normal is on top, tumor is on bottom, and you can clearly see the deletion in the tumor.
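The "little script" could look roughly like the sketch below. It assumes the per-position depth is in three whitespace-separated columns (chromosome, 1-based position, depth), which is what `samtools depth` prints for a single BAM file, that there is no header line, and that the input has already been restricted to one chromosome. The function name and the toy input are hypothetical.

```python
from collections import defaultdict


def binned_mean_depth(lines, start, end, bin_size=100):
    """Mean depth per bin over [start, end] (1-based, inclusive).

    Positions missing from the input are counted as depth 0.
    Returns a list of (bin_start, bin_end, mean_depth) tuples.
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        pos, depth = int(fields[1]), int(fields[2])
        if start <= pos <= end:
            totals[(pos - start) // bin_size] += depth

    n_bins = (end - start) // bin_size + 1
    result = []
    for i in range(n_bins):
        bin_start = start + i * bin_size
        bin_end = min(bin_start + bin_size - 1, end)  # last bin may be short
        width = bin_end - bin_start + 1
        result.append((bin_start, bin_end, totals[i] / width))
    return result


# Toy input, for illustration only: depth drops from 10 to 2.
toy = ["chr1\t1\t10", "chr1\t2\t10", "chr1\t3\t2", "chr1\t4\t2"]
for row in binned_mean_depth(toy, start=1, end=4, bin_size=2):
    print(row)
```

Run it once on the tumor and once on the normal with the same start, end and bin size, then plot the two series one above the other. Dividing by the bin width rather than by the number of lines seen is a deliberate choice here, since a depth file may omit zero-coverage positions.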
AFAIK, the read depth for a CNV is simply the alignment coverage over that region.
E.g. if the region is completely deleted (no copies left), then its depth is 0.
E.g. if the region's copy number is doubled, then its depth will be about twice the usual depth (assuming uniform coverage).
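The proportionality described above can be written down as a small sketch. It assumes depth scales linearly with copy number and that the baseline is 2 copies (a diploid genome); both are simplifying assumptions, and the function names and numbers are made up for illustration.

```python
def expected_depth(baseline_depth, copy_number, baseline_copies=2):
    """Depth expected for a region, assuming depth is proportional to copy number."""
    return baseline_depth * copy_number / baseline_copies


def estimated_copy_number(observed_depth, baseline_depth, baseline_copies=2):
    """Invert the same relation to get a rough copy number from observed depth."""
    if baseline_depth <= 0:
        raise ValueError("baseline depth must be positive")
    return baseline_copies * observed_depth / baseline_depth


# Hypothetical baseline of 30x coverage.
print(expected_depth(30, 0))          # region fully deleted
print(expected_depth(30, 4))          # copy number doubled
print(estimated_copy_number(15, 30))  # half the usual depth
```

In real data the estimate will rarely land on a whole number, because coverage is not perfectly uniform.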