Question: Can getSequence return a coding sequence and its upstream region in one entry?
8 months ago, oboeconcerto314 wrote:

I am trying to retrieve the sequences of a number of genes using getSequence, but I only wish to retrieve 500bp upstream and 500bp downstream of the start codon, and separately of the stop codon (1000bp each). However, it seems that biomaRt does not allow retrieval of upstream or downstream sequence when seqType is "coding".

I tried the following (I haven't yet worked out how to trim the sequence to the desired length, or how to choose just one of the returned sequences):

library(biomaRt)
ensembl <- useMart("ensembl")
ex <- c("ACTN4")
mart <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
gene2sequence <- getSequence (id = ex, type = "external_gene_name", seqType = "coding", upstream = "500", mart = mart)
exportFASTA(gene2sequence, file="desktop/test.fasta")

But I get an error:

> Query ERROR: caught BioMart::Exception::Usage: Filter upstream_flank NOT FOUND

Do I have to get the coding, 5utr and 3utr sequences separately and join them together? Or is there any way I can work around this? Thanks!
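
On the trimming and "choose one sequence" part of the question, here is a minimal base-R sketch. It assumes the result of getSequence is a data frame with one sequence column named after seqType and one identifier column named after type, and that missing sequences are reported as the string "Sequence unavailable"; both are worth checking against your own output. The helpers `pick_longest`, `trim_start` and `trim_end` are hypothetical names, not biomaRt functions, and the data below is made up.

```r
# Toy stand-in for a getSequence() result (made-up sequences, assumed layout):
# one sequence column named after seqType, one id column named after type.
seqs <- data.frame(
  coding = c("ATGAAACCCGGGTTTTAA", "ATGAAATAA", "Sequence unavailable"),
  external_gene_name = c("Actn4", "Actn4", "Actn4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the longest available sequence per identifier.
pick_longest <- function(df, seq_col, id_col) {
  # Drop rows where no sequence was returned (assumed placeholder string).
  df <- df[df[[seq_col]] != "Sequence unavailable", , drop = FALSE]
  # Sort by identifier, longest sequence first, then keep the first row per id.
  df <- df[order(df[[id_col]], -nchar(df[[seq_col]])), , drop = FALSE]
  df[!duplicated(df[[id_col]]), , drop = FALSE]
}

# Hypothetical helpers: first or last n bases of each sequence.
trim_start <- function(x, n) substr(x, 1, n)
trim_end <- function(x, n) substr(x, pmax(nchar(x) - n + 1, 1), nchar(x))

one <- pick_longest(seqs, "coding", "external_gene_name")
start_part <- trim_start(one$coding, 6)  # bases just after the start codon position
end_part <- trim_end(one$coding, 6)      # bases up to and including the stop codon
```

Picking the longest sequence is only one possible rule; selecting a specific transcript by its Ensembl transcript ID may be more appropriate depending on the analysis. For the real case, replace 6 with 500.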


The allowed seqType arguments are listed on page 10 of the biomaRt vignette. 3utr and 5utr are among the possible options for seqType.

Reply by GenoMax, written 8 months ago

I tried 3utr and 5utr, but I want to retrieve a 500bp sequence around the start codon, which would be part of the 5utr and part of the coding sequence. I'm not sure how to do it, though. Perhaps I could get the 5utr and coding sequences separately and combine them? That may be possible, but I am a beginner in bioinformatics, so it would be difficult.

Reply by oboeconcerto314, written 8 months ago

Does this cover your needs?

getSequence(chromosome, start, end, id, type, seqType,upstream, downstream, mart, verbose = FALSE)

upstream: To add the upstream sequence of a specified number of basepairs to the output.
downstream: To add the downstream sequence of a specified number of basepairs to the output.

Reply by GenoMax, written 7 months ago

I tried getSequence (id = ex, type = "external_gene_name", seqType = "5utr", downstream = "500", mart = mart), but I got an error:

Query ERROR: caught BioMart::Exception::Usage: Filter downstream_flank NOT FOUND

That's where I am stuck. I guess upstream and downstream can only be used with a flank seqType such as coding_transcript_flank, which is mentioned here.

Reply by oboeconcerto314, written 7 months ago
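
One possible workaround along the lines discussed above is to make two getSequence calls, one with a flank seqType plus `upstream` and one with seqType "coding", and then join the results per transcript. The sketch below is untested against a live mart and rests on several assumptions: that each result has a sequence column named after seqType and an id column named after type, that "ensembl_transcript_id" is accepted as `type`, and that the flank is reported on the gene's strand and ends immediately before the start codon. `join_flank_coding` and `fetch_start_region` are hypothetical helper names, and the toy data at the end is made up.

```r
# Hypothetical helper: paste the upstream flank in front of the first
# `n_coding` bases of the coding sequence, matched by identifier.
join_flank_coding <- function(flank_df, coding_df, id_col, n_coding = 500) {
  m <- merge(flank_df, coding_df, by = id_col)
  data.frame(
    id = m[[id_col]],
    sequence = paste0(m$coding_transcript_flank, substr(m$coding, 1, n_coding)),
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper, defined but not called here: it needs the biomaRt
# package and a network connection. Transcript IDs are used so that each
# flank is matched to exactly one coding sequence.
fetch_start_region <- function(ids, mart, n = 500) {
  flank <- biomaRt::getSequence(id = ids, type = "ensembl_transcript_id",
                                seqType = "coding_transcript_flank",
                                upstream = n, mart = mart)
  coding <- biomaRt::getSequence(id = ids, type = "ensembl_transcript_id",
                                 seqType = "coding", mart = mart)
  join_flank_coding(flank, coding, "ensembl_transcript_id", n_coding = n)
}

# Offline check with made-up data shaped like the two results.
flank <- data.frame(coding_transcript_flank = "GGGCCC",
                    ensembl_transcript_id = "T1", stringsAsFactors = FALSE)
coding <- data.frame(coding = "ATGAAATAA",
                     ensembl_transcript_id = "T1", stringsAsFactors = FALSE)
joined <- join_flank_coding(flank, coding, "ensembl_transcript_id", n_coding = 3)
```

The region around the stop codon could be built the same way by using `downstream` instead of `upstream` and taking the last bases of the coding sequence. It is worth checking the joined sequence for one well-known gene against the Ensembl browser before relying on it.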