Entering edit mode
8.3 years ago
Nicolas Rosewick
11k
Hi,
I've two GRanges as :
> gene.pos.gr
GRanges object with 63677 ranges and 1 metadata column:
seqnames ranges strand | ensembl_gene_id
<Rle> <IRanges> <Rle> | <character>
[1] HG991_PATCH [66119285, 66465398] + | ENSG00000261657
[2] 13 [23551994, 23552136] - | ENSG00000223116
[3] 13 [23708313, 23708703] + | ENSG00000233440
[4] 13 [23726725, 23726825] - | ENSG00000207157
[5] 13 [23743974, 23744736] - | ENSG00000229483
... ... ... ... . ...
[63673] HSCHR17_1_CTG1 [ 62293, 253878] - | ENSG00000262334
[63674] HSCHR17_1_CTG1 [189016, 190077] + | ENSG00000262737
[63675] HSCHR17_1_CTG1 [198829, 201112] + | ENSG00000263267
[63676] HSCHR17_1_CTG1 [220262, 225205] + | ENSG00000262336
[63677] HSCHR17_1_CTG1 [222361, 224506] + | ENSG00000262005
-------
seqinfo: 265 sequences from an unspecified genome; no seqlengths
and
> exon.pos.gr
GRanges object with 738009 ranges and 1 metadata column:
seqnames ranges strand | ensembl_gene_id
<Rle> <IRanges> <Rle> | <character>
[1] HG991_PATCH [66119285, 66119659] + | ENSG00000261657
[2] HG991_PATCH [66298434, 66298819] + | ENSG00000261657
[3] HG991_PATCH [66314236, 66314392] + | ENSG00000261657
[4] HG991_PATCH [66320895, 66321004] + | ENSG00000261657
[5] HG991_PATCH [66339743, 66339847] + | ENSG00000261657
... ... ... ... . ...
[738005] HSCHR17_1_CTG1 [200680, 201068] + | ENSG00000263267
[738006] HSCHR17_1_CTG1 [220262, 220486] + | ENSG00000262336
[738007] HSCHR17_1_CTG1 [224683, 225205] + | ENSG00000262336
[738008] HSCHR17_1_CTG1 [222361, 223093] + | ENSG00000262005
[738009] HSCHR17_1_CTG1 [224327, 224506] + | ENSG00000262005
-------
seqinfo: 265 sequences from an unspecified genome; no seqlengths
I want to do the difference between them so I used difference <- setdiffgene.pos.gr,exon.pos.gr)
It gives me :
> difference
GRanges object with 290454 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] HG991_PATCH [66119660, 66298433] +
[2] HG991_PATCH [66298820, 66314235] +
[3] HG991_PATCH [66314393, 66320894] +
[4] HG991_PATCH [66321005, 66339286] +
[5] HG991_PATCH [66339343, 66339742] +
... ... ... ...
[290450] HSCHR2_2_CTG12 [149868059, 149870556] +
[290451] HSCHR2_2_CTG12 [149870662, 149872728] +
[290452] HSCHR2_2_CTG12 [149872946, 149874163] +
[290453] HSCHR2_2_CTG12 [149874278, 149882769] +
[290454] HSCHR2_2_CTG12 [149882977, 149885671] +
-------
seqinfo: 265 sequences from an unspecified genome; no seqlengths
How to keep the extra columns from the gene.pos.gr GRanges, especially the ensembl_gene_id column
Thanks
I am not sure if you can do that directly using any of the set functions (intersect, union etc) of GRanges. The alternative is to
findOverlapbetween yourgene.pos.granddifference, and using that, get the required metadata and add it to thedifference. For example, see https://stat.ethz.ch/pipermail/bioconductor/2013-August/054486.html