Get sequence by genomic coordinates in R
3.6 years ago
a11msp ▴ 120

I've realised that, after many years of using R, I still don't know a good way to extract a sequence by genomic coordinates. I tried biomaRt, but getSequence() doesn't seem able to fetch an arbitrary region; it asks for an anchor such as a gene name. I would appreciate your advice!

R sequence • 4.5k views
3.6 years ago
ATpoint 82k

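In R, given a BSgenome object, you can use getSeq() from the Biostrings library. For example, to extract chr1:3000000-3000100 from the mouse mm10 genome:

# Load the genome package; this also makes BSgenome and Biostrings available
library(BSgenome.Mmusculus.UCSC.mm10)

# Extract the region and convert the DNAString to a plain character string
my.dnastring <- as.character(Biostrings::getSeq(BSgenome.Mmusculus.UCSC.mm10, "chr1", 3000000, 3000100))

my.dnastring
[1] "NTTCTGTTTCTATTTTGTGGTTACTTTGAGGAGAGTTGGAATTAGGTCTTCTTTGAAGGTCTGGTAGAACTCTGCATTAAACCCATCTGGTCCTGGGCTTT"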

This is great - thanks very much!


Can I somehow use this with a file that contains genomic coordinates? I have a data.frame and a GRanges object containing the coordinates of all the sequences I would like to retrieve.
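
One possible approach, sketched here under some assumptions: getSeq() also accepts a GRanges object and returns one sequence per range, so a data.frame of coordinates can be converted with makeGRangesFromDataFrame() first. The table `coords` and its column names (chr, start, end) are hypothetical placeholders, and the mm10 genome is assumed only because it is the one used in the answer above.

```r
library(GenomicRanges)
library(BSgenome.Mmusculus.UCSC.mm10)

# Hypothetical coordinates table; in practice this could come from read.table()
coords <- data.frame(
  chr   = c("chr1", "chr2"),
  start = c(3000000, 3100000),
  end   = c(3000100, 3100050)
)

# Convert to GRanges; the chr/start/end column names are detected automatically
gr <- makeGRangesFromDataFrame(coords)

# One DNAString per range, returned as a DNAStringSet
seqs <- getSeq(BSgenome.Mmusculus.UCSC.mm10, gr)

# Store the sequences next to the coordinates as plain character strings
coords$sequence <- as.character(seqs)
```

If you already have a GRanges object, the conversion step can be skipped and the object passed to getSeq() directly. Ranges without strand information are read from the forward strand; ranges on the "-" strand are returned reverse-complemented.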
