Question: Retrieving sequences from Ensembl Archive
biostR0 wrote:

Hi,

I am looking for a way to retrieve DNA sequences from the Ensembl May 2017 archive based on genomic coordinates. I thought the biomaRt package would be useful for this, but it did not work. Apparently a sequence type (the seqType and type arguments) is required to obtain a sequence with the getSequence function.

For example:

    library(biomaRt)
    ensembl <- useMart(host = "may2017.archive.ensembl.org",
                       biomart = "ENSEMBL_MART_ENSEMBL",
                       dataset = "hsapiens_gene_ensembl")
    seq <- biomaRt::getSequence(chromosome = "X", start = 100639991, end = 100644991, mart = ensembl)

This gives the following error:

Error in biomaRt::getSequence(chromosome = "X", start = 100639991, end = 100644991,  : 
  Please specify the type of sequence that needs to be retrieved when using biomaRt in web service mode.  Choose either gene_exon, transcript_exon,transcript_exon_intron, gene_exon_intron, cdna, coding,coding_transcript_flank,coding_gene_flank,transcript_flank,gene_flank,peptide, 3utr or 5utr

Is there a good way to get the DNA sequences for a large list of genomic coordinates?

Thank you very much.

modified 29 days ago by Emily_Ensembl • written 4 weeks ago by biostR

Did you check the documentation? Sequence type genomic is one of the allowed options.

modified 4 weeks ago • written 4 weeks ago by genomax

I have biomaRt v2.32.1 installed, which does not allow "genomic" as the seqType. If I try it, I get the following:

Error in biomaRt::getSequence(chromosome = "X", start = 100639991, end = 100644991,  : 
Please specify the type of sequence that needs to be retrieved when using biomaRt in web service mode. 
Choose either gene_exon, transcript_exon,transcript_exon_intron,
gene_exon_intron, cdna, coding,coding_transcript_flank,coding_gene_flank,transcript_flank,
gene_flank,peptide, 3utr or 5utr
modified 29 days ago • written 29 days ago by biostR

Tagging: Mike Smith to see if he can help.

written 29 days ago by genomax
Emily_Ensembl (EMBL-EBI) wrote:

BioMart is gene-centric; it cannot retrieve sequences for arbitrary genomic regions. The easiest way to get what you need is to use the REST API archive with the POST sequence/region endpoint. This lets you retrieve multiple sequences per request, and you can code around it in any language.
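
For illustration, here is a minimal sketch in R of how a call to the POST sequence/region endpoint might look. Several things here are assumptions rather than anything stated in this thread: the helper names make_regions and fetch_sequences are hypothetical, the server URL is the current public REST server (substitute whichever REST server matches the archive release you need), the batch size of 50 regions per request should be checked against the endpoint documentation, and the network call relies on the httr and jsonlite packages.

```r
# Sketch: fetch genomic sequences by coordinate from the Ensembl REST API.
# The httr and jsonlite packages are only needed for the network call.

# Format coordinates as "chr:start..end:strand" region strings
# (hypothetical helper; strand defaults to the forward strand).
make_regions <- function(chromosome, start, end, strand = 1) {
  stopifnot(length(chromosome) == length(start),
            length(start) == length(end))
  sprintf("%s:%d..%d:%d", chromosome,
          as.integer(start), as.integer(end), as.integer(strand))
}

# POST the regions in batches and combine the parsed JSON responses.
# ASSUMPTIONS: the server URL, the species name and the batch size.
fetch_sequences <- function(regions,
                            server = "https://rest.ensembl.org",
                            species = "human",
                            chunk_size = 50) {
  url <- paste0(server, "/sequence/region/", species)
  batches <- split(regions, ceiling(seq_along(regions) / chunk_size))
  results <- lapply(batches, function(batch) {
    # Request body has the form {"regions": ["X:1..100:1", ...]}
    body <- as.character(jsonlite::toJSON(list(regions = batch)))
    resp <- httr::POST(url, body = body,
                       httr::content_type("application/json"),
                       httr::accept("application/json"))
    httr::stop_for_status(resp)
    jsonlite::fromJSON(httr::content(resp, as = "text", encoding = "UTF-8"))
  })
  do.call(rbind, results)
}

# The region from the question, on the forward strand.
regions <- make_regions("X", 100639991, 100644991)
# seqs <- fetch_sequences(regions)  # uncomment to query the server
```

Keeping the region formatting separate from the HTTP call makes it easy to build the full list of region strings from a table of coordinates first and inspect it before sending any requests.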

written 29 days ago by Emily_Ensembl