I am trying to retrieve the sequences of a number of genes using getSequence, but I only want two windows per gene: 500 bp upstream plus 500 bp downstream of the start codon, and likewise 500 bp upstream plus 500 bp downstream of the stop codon (1000 bp each). However, it seems that biomaRt does not allow upstream or downstream sequence to be retrieved when seqType is "coding".
I tried the following (I haven't yet worked out how to trim the sequence to the desired length, or how to pick just one of the returned sequences):
library(biomaRt)
ensembl <- useMart("ensembl")
ex <- c("ACTN4")
mart <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
gene2sequence <- getSequence(id = ex, type = "external_gene_name", seqType = "coding", upstream = "500", mart = mart)
exportFASTA(gene2sequence, file = "desktop/test.fasta")
But I get an error:
> Query ERROR: caught BioMart::Exception::Usage: Filter upstream_flank NOT FOUND
Do I have to retrieve the coding, 5utr and 3utr sequences separately and join them together, or is there a way to work around this? Thanks!
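To make the target concrete, here is a minimal base-R sketch of the trimming step on a made-up toy sequence. It does not touch biomaRt at all: the helper `window_around`, the toy sequence and the codon positions are all hypothetical, and it assumes I already have one longer sequence per gene and know the 1-based positions of the start and stop codons within it.

```r
# Hypothetical helper (not part of biomaRt): return `flank` bases before
# a 1-based position plus `flank` bases starting at that position,
# clipped to the ends of the sequence.
window_around <- function(seq, pos, flank = 500) {
  start <- max(1, pos - flank)
  end <- min(nchar(seq), pos + flank - 1)
  substr(seq, start, end)
}

# Toy input: a random 3000 bp string, NOT real gene data.
set.seed(1)
toy <- paste(sample(c("A", "C", "G", "T"), 3000, replace = TRUE), collapse = "")

# Assumed positions of the first base of the start and stop codons.
start_codon_pos <- 1001
stop_codon_pos <- 2001

start_window <- window_around(toy, start_codon_pos)  # 500 bp either side of the start codon
stop_window <- window_around(toy, stop_codon_pos)    # 500 bp either side of the stop codon
```

Each window here is 1000 bp long, with the codon's first base at position 501 of the window. What I am missing is how to get a sequence with enough flank from getSequence in the first place.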