Subsetting a GRanges object based on its metadata
9.7 years ago

Hello

I have a GRanges object consisting of a list of single nucleotide positions. Each row has an associated variant ID which is stored in a data frame in elementMetadata(all.vcf) as such:

library(GenomicRanges)

all.vcf
GRanges with 6378 ranges and 1 metadata column:
         seqnames             ranges strand   |      ID_names
            <Rle>          <IRanges>  <Rle>   |   <character>
     [1]    CHR01   [ 16725,  16725]      *   |   CHR01.16725
     [2]    CHR01   [ 46270,  46270]      *   |   CHR01.46270
     [3]    CHR01   [ 46282,  46282]      *   |   CHR01.46282
     [4]    CHR01   [ 64420,  64420]      *   |   CHR01.64420
     [5]    CHR01   [109016, 109016]      *   |  CHR01.109016
     ...      ...                ...    ... ...           ...
  [6374]    CHR05 [2939237, 2939237]      *   | CHR05.2939237
  [6375]    CHR05 [2965552, 2965552]      *   | CHR05.2965552
  [6376]    CHR05 [2981136, 2981136]      *   | CHR05.2981136
  [6377]    CHR05 [3096084, 3096084]      *   | CHR05.3096084
  [6378]    CHR05 [3154633, 3154633]      *   | CHR05.3154633
  ---
  seqlengths:
     CHR01   CHR02   CHR03   CHR04   CHR05
   1325633 1428053 1944125 2519115 3319307

I am trying to subset this GRanges object based on the variant IDs stored in the elementMetadata data frame, using a character vector called filt.ids that contains only the variant IDs I'm interested in. I have tried:

all.vcf[which(elementMetadata(all.vcf)[,1] == filt.ids)]

But this returns an error because there are far fewer variants in filt.ids than in the GRanges object (longer object length is not a multiple of shorter object length).

Any ideas?

genome R • 14k views
9.7 years ago
Michael 54k
all.vcf[(elementMetadata(all.vcf)[,1] %in% filt.ids)]
______________________________________^^^^
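
To see why swapping == for %in% matters, here is a minimal base-R sketch. The vectors are toy stand-ins (hypothetical values, not the poster's data): == compares element by element and recycles the shorter vector, while %in% tests membership.

```r
# Toy stand-ins for the metadata column and the filter vector (hypothetical values)
ids <- c("CHR01.16725", "CHR01.46270", "CHR01.46282", "CHR01.64420", "CHR01.109016")
filt.ids <- c("CHR01.46282", "CHR01.16725")

# `==` compares position by position and recycles the shorter vector,
# so it warns about the length mismatch and can miss IDs at other positions
eq_hits <- suppressWarnings(which(ids == filt.ids))

# `%in%` asks, for each element of ids, whether it occurs anywhere in filt.ids
in_hits <- which(ids %in% filt.ids)
```

With these values, eq_hits only catches the ID that happens to line up with the recycled filter, whereas in_hits finds every ID present in filt.ids regardless of position.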

Thanks, that worked perfectly!

6.9 years ago
msameet ▴ 50

Another handy trick is to refer to the metadata column by name directly. For the object in the question, where the column is called ID_names:

all.vcf[elementMetadata(all.vcf)[, "ID_names"] %in% filt.ids]
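
For what it's worth, the same by-name filter can be written a few other ways. The sketch below builds a small toy GRanges (illustrative values only, assuming the GenomicRanges package is installed) and uses mcols(), the $ shortcut, and subset(); all three should select the same ranges.

```r
library(GenomicRanges)

# Small toy object mirroring the layout in the question (values are illustrative)
all.vcf <- GRanges(
  seqnames = c("CHR01", "CHR01", "CHR05"),
  ranges   = IRanges(start = c(16725, 46270, 2939237), width = 1),
  ID_names = c("CHR01.16725", "CHR01.46270", "CHR05.2939237")
)
filt.ids <- c("CHR01.46270", "CHR05.2939237")

# mcols() returns the metadata columns as a DataFrame
by_mcols <- all.vcf[mcols(all.vcf)$ID_names %in% filt.ids]

# `$` on a GRanges reaches straight into its metadata columns
by_dollar <- all.vcf[all.vcf$ID_names %in% filt.ids]

# subset() evaluates the condition with metadata columns in scope
by_subset <- subset(all.vcf, ID_names %in% filt.ids)
```

Which form to use is mostly a matter of taste; the $ version is the shortest, while subset() reads well when combining several conditions.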