How to do coverage counting for each genomic position in R ?
2
3
Entering edit mode
6.5 years ago
MAPK ★ 1.9k

Hi Guys,
I am using pileup in Rsamtool from R package. I need to get  the total "count" for each position. For example, for chr11, position 1643082, there are a total of 1 count of A, 3 counts of C and 18 counts of T. I want them in this order:

chr11:1643082  A:1, C:3, T:18

and so forth for position chr11, 47663948 to be shown in the result. There are thousands of different positions and I need to calculate the counts for each nucleotide.Thanks!

table1

seqnames       pos strand nucleotide count               which_label
1     chr11   1643082      +          A     1     chr11:1643082-1643082
2     chr11   1643082      -          C     3     chr11:1643082-1643082
3     chr11   1643082      +          T    15     chr11:1643082-1643082
4     chr11   1643082      -          T     3     chr11:1643082-1643082
5     chr11  47663948      +          C    16   chr11:47663948-47663948
6     chr11  47663948      -          C    11   chr11:47663948-47663948
7     chr11  47663948      +          T     2   chr11:47663948-47663948
R • 2.1k views
9
Entering edit mode
6.5 years ago
PoGibas 5.0k

You can do this easily with data.table package.

Something like this:

df[, sum(count), by=list(seqnames,pos,nucleotide)]
0
Entering edit mode

7
Entering edit mode
6.5 years ago

The alternative using dplyr:

library(dplyr)
df %>%
group_by(seqnames,pos,nucleotide) %>%
summarise(count=sum(count))


Edit: to get the exact output you described, you have to:

> biost %>%
group_by(seqnames,pos,nucleotide) %>%
summarise(count=sum(count))%>%
mutate(mypos=paste(seqnames, pos, sep=':')) %>%
group_by(mypos) %>%
arrange(nucleotide) %>%   # Make sure nucleotides are always sorted the same way
summarise(counts=paste(nucleotide, count, sep=':',collapse=', '))

Source: local data frame [2 x 2]

mypos         counts
1  chr11:1643082 A:1, C:3, T:18
2 chr11:47663948      C:27, T:2
0
Entering edit mode

Thank you, Giovanni!

Traffic: 2285 users visited in the last hour
FAQ
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.