Entering edit mode
3.5 years ago
msening
•
0
Hi, I am using multiple Bam files to get the coverage for each of them.
file1 <- BamFile("bamfile.sorted.bam")
x<-readGAlignments(file1)
xcov<-coverage(x)
And then I wanted to draw the coverage plots:
lapply(xcov, function(x) {
plot(as.numeric(x), type="l")
})
But I have up to 3000 segments in xcov (the RleList object), so I would like to filter them and keep the ones that have more than 1 run. For example, I don't want these in my coverage plots:
RleList of length 3138
$seg1
integer-Rle of length 10731 with 1 run
Lengths: 10731
Values : 0
$seg2
integer-Rle of length 2789 with 1 run
Lengths: 2789
Values : 0
I tried to do this:
lapply(xcov, function(x) {
plot(as.numeric(x>0), type="l")
})
But it didn't do what I wanted it to do, is there any other way to fix this?
Hi, I tried it but it gave me an error like this:
I tried your xcov and it works, but my xcov is different, which is probably because I read it from a bam file. I just don't really understand the error and wonder if you could help me with it?
True. It is a package specific derivative data type. But actually it is even easier with this than a generic RLE list.
I am using
sapplyto apply a function to each element ofxcov.I am converting the RLE it back to the numeric vector (notice that the function used here isas.numeric, which only works because it is redefined by the package, usually you would need to useinverse.rlefor this). Then I calculate the sum, which is 0 for uncovered segments (having no runs) and a positive number for all others. This information needs to be translated to a logical TRUE (keep) and FALSE (delete) pattern.as.logicalconveniently does exactly this, but asum(as.numeric(x)) > 0would do exactly the same. The output of this is then directly used to subset the original dataxcov[...].