Can getSequence return the coding sequence and its upstream sequence in one entry?
4.1 years ago

I am trying to retrieve the sequences of a number of genes using getSequence. Specifically, I only want 500 bp upstream and 500 bp downstream of the start codon, and the same around the stop codon (1000 bp each). However, it seems that biomaRt does not allow retrieval of upstream or downstream sequence when seqType is "coding".

I tried the following (I haven't yet worked out how to trim the sequence to the desired length, or how to keep only one of the returned sequences):

library(biomaRt)
ensembl <- useMart("ensembl")
ex <- c("ACTN4")
mart <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
gene2sequence <- getSequence (id = ex, type = "external_gene_name", seqType = "coding", upstream = "500", mart = mart)
exportFASTA(gene2sequence, file="desktop/test.fasta")

But I get an error:

> Query ERROR: caught BioMart::Exception::Usage: Filter upstream_flank NOT FOUND

Do I have to get the coding, 5utr, and 3utr sequences separately and join them together? Or is there a way to work around this? Thanks!
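
For the trimming and selection step mentioned above, here is a minimal sketch in base R. It assumes the result is a data frame with one sequence column and one gene identifier column, which is the layout getSequence returns. The helper name `pick_and_trim` and the toy data are hypothetical, keeping the longest sequence per gene is only one possible selection rule, and the "Sequence unavailable" placeholder is an assumption about how missing sequences are reported.

```r
# Hypothetical helper: keep one sequence per gene (the longest) and trim it.
# `seq_col` and `id_col` name the sequence and identifier columns.
pick_and_trim <- function(df, seq_col, id_col, n_bp = 500) {
  # Drop placeholder rows (assumed marker for a missing sequence)
  df <- df[df[[seq_col]] != "Sequence unavailable", , drop = FALSE]
  # Sort so that the longest sequence of each gene comes first
  df <- df[order(df[[id_col]], -nchar(df[[seq_col]])), , drop = FALSE]
  # Keep only the first (longest) row per gene
  df <- df[!duplicated(df[[id_col]]), , drop = FALSE]
  # Trim to the first n_bp bases
  df[[seq_col]] <- substr(df[[seq_col]], 1, n_bp)
  rownames(df) <- NULL
  df
}

# Toy data standing in for a getSequence() result
toy <- data.frame(
  coding = c("ATGAAATAG", "ATGAAACCCGGGTAG", "Sequence unavailable"),
  external_gene_name = c("Actn4", "Actn4", "Actn4"),
  stringsAsFactors = FALSE
)
trimmed <- pick_and_trim(toy, "coding", "external_gene_name", n_bp = 6)
```

The trimmed data frame can then be passed to exportFASTA in the same way as the original result.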


The allowed seqType arguments are listed on page 10 of the biomaRt vignette. 3utr and 5utr are among the possible options for seqType.


I tried 3utr and 5utr, but I want to retrieve a 500 bp sequence around the stop codon, which would be part of the 5utr and coding. I'm not sure how to do it, though. Perhaps get the 5utr and coding sequences separately and combine them? That seems possible, but I am a beginner in bioinformatics, so it would be difficult for me.
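
Combining two separately retrieved sequences and cutting a window around their junction could look roughly like the sketch below. It is plain string handling in base R. The helper name `window_around_junction` and the toy sequences are hypothetical, and it assumes both inputs are already on the same strand and directly adjacent in the transcript.

```r
# Hypothetical helper: join an upstream and a downstream sequence and return
# up to `flank` bases on each side of the junction between them.
window_around_junction <- function(upstream_seq, downstream_seq, flank = 500) {
  up_len <- nchar(upstream_seq)
  # Last `flank` bases of the upstream part (all of it if it is shorter)
  up_part <- substr(upstream_seq, max(1, up_len - flank + 1), up_len)
  # First `flank` bases of the downstream part
  down_part <- substr(downstream_seq, 1, flank)
  paste0(up_part, down_part)
}

utr5 <- "GGGGCCCCTT"        # toy 5' UTR
cds  <- "ATGAAACCCGGGTAG"   # toy coding sequence
around_start <- window_around_junction(utr5, cds, flank = 4)
```

The same helper works for a coding sequence followed by a 3' UTR. Note that a UTR can be shorter than 500 bp, in which case the window is shorter too and a genomic flank would be needed to reach the full length.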


Does this cover your needs?

getSequence(chromosome, start, end, id, type, seqType, upstream, downstream, mart, verbose = FALSE)

upstream: To add the upstream sequence of a specified number of basepairs to the output.
downstream: To add the downstream sequence of a specified number of basepairs to the output.

I tried getSequence(id = ex, type = "external_gene_name", seqType = "5utr", downstream = "500", mart = mart), but I got an error:

Query ERROR: caught BioMart::Exception::Usage: Filter downstream_flank NOT FOUND

That's where I am stuck. I guess upstream and downstream can only be used with flank seqType values such as "coding_transcript_flank", which is mentioned here.
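
If that guess is right, one possible workaround is to request the flank with a flank seqType and fetch the coding sequence in a separate call. The sketch below is untested against Ensembl and rests on assumptions: that "coding_gene_flank" returns the region next to the coding sequence, and that a flank query accepts either upstream or downstream but not both. The wrapper name `get_coding_flank` and its `fetch` argument are hypothetical; `fetch` exists only so the function can be tried without a network connection.

```r
# Hypothetical wrapper around biomaRt::getSequence for flank queries.
# Exactly one of `upstream` / `downstream` is passed, depending on `side`.
get_coding_flank <- function(genes, mart,
                             side = c("upstream", "downstream"),
                             n_bp = 500,
                             fetch = biomaRt::getSequence) {
  side <- match.arg(side)
  args <- list(
    id      = genes,
    type    = "external_gene_name",
    seqType = "coding_gene_flank",  # assumed: flank of the coding region
    mart    = mart
  )
  args[[side]] <- n_bp              # numeric length of the flank in bp
  do.call(fetch, args)
}

# Usage (needs the biomaRt package and network access to Ensembl):
# mart <- biomaRt::useMart("ensembl", dataset = "mmusculus_gene_ensembl")
# up   <- get_coding_flank(ex, mart, side = "upstream")
# down <- get_coding_flank(ex, mart, side = "downstream")
```

The two flanks and a separate seqType = "coding" result can then be joined by gene identifier and pasted together.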
