Use biomart getSequence to obtain cDNA sequence based on exon coordinates obtained from DEXSeq
9 weeks ago
osiemen ▴ 30

Hi,

I need to obtain the DNA sequence for a set of exon coordinates from my DEXSeq analysis so that I can perform a motif analysis with, e.g., (X)STREME from the MEME Suite. I know that biomaRt offers the option to retrieve cDNA from Ensembl using chromosomal coordinates with getSequence(). Unfortunately, I have not been able to do this. Whenever I run getSequence(), I receive the following timeout error on port 443:

Error in curl::curl_fetch_memory(url, handle = handle) : 
  Timeout was reached: [dec2021.archive.ensembl.org:443] Operation timed out after 300000 milliseconds with 0 bytes received 

I use the following code for this:

library(biomaRt)
human.mart <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", dataset = "hsapiens_gene_ensembl", host = "https://dec2021.archive.ensembl.org")

Example coordinates:

getSequence(chromosome = 12, start = 54369133, end = 54391298, mart=human.mart,seqType = "cdna",type = "ensembl_gene_id") 

head(human_df,2)
    groupID_strip            groupID featureID genomicData.seqnames genomicData.start genomicData.end genomicData.width genomicData.strand HGNC.symbol
1 ENSG00000001497 ENSG00000001497.16      E002                 chrX          65512583        65512901               319                  -       LAS1L
2 ENSG00000004975 ENSG00000004975.11      E001                chr17           7225341         7225454               114                  -        DVL2
    Gene.stable.ID.1
1 ENSMUSG00000057421
2 ENSMUSG00000020888


getSequence(chromosome = human_df$genomicData.seqnames[1], 
            start = human_df$genomicData.start[1],
            end = human_df$genomicData.end[1],
            type="ensembl_gene_id",
            seqType="cdna",
            upstream=20, 
            mart=human.mart) 

For some reason I cannot retrieve the cDNA using the code above. However, if I use the Ensembl ID, I can obtain cDNA, but that's not what I want:

#Works
getSequence(id = human_df$groupID_strip[1], 
            type="ensembl_gene_id",
            seqType="cdna",
            upstream=20, 
            mart=human.mart)

Does anybody know how to solve this issue, or are there other ways to retrieve the cDNA using exon coordinates obtained from DEXSeq?

Thanks in advance!

Biomart Exon RNA-Seq DEXSeq • 348 views

Unfortunately, the other Ensembl servers don't work either.


"For some reason I cannot retrieve the cDNA using the code above. However, if I use the Ensembl ID, I can obtain cDNA, but that's not what I want"

This is confusing. Is your post about the curl timeout error, or is it about getting the sequences that you want?

If your #Works code block works in that it retrieves sequences, then the real issue isn't the timeout error. If so, what exactly is the issue with what is returned from your working getSequence example when using the Ensembl ID?
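
One thing that may be worth checking either way: the DEXSeq table uses UCSC-style sequence names ("chrX", "chr17"), while Ensembl names chromosomes without the "chr" prefix ("X", "17"). Below is a minimal base-R sketch of normalizing the names before passing them to getSequence(). The helper name to_ensembl_chr is hypothetical, the small data frame just repeats the two rows from the head(human_df, 2) output, and whether this has anything to do with the timeout is an assumption, not something confirmed in this thread.

```r
# Hypothetical helper: convert UCSC-style names ("chrX") to Ensembl-style ("X").
to_ensembl_chr <- function(seqnames) {
  out <- sub("^chr", "", as.character(seqnames))
  out[out == "M"] <- "MT"  # UCSC "chrM" corresponds to Ensembl "MT"
  out
}

# Stand-in for the two rows shown in head(human_df, 2).
exons <- data.frame(
  seqnames = c("chrX", "chr17"),
  start    = c(65512583L, 7225341L),
  end      = c(65512901L, 7225454L),
  stringsAsFactors = FALSE
)

exons$chromosome <- to_ensembl_chr(exons$seqnames)
print(exons)

# The normalized value would then replace human_df$genomicData.seqnames[1]
# in the original call (not run here, since it needs the Ensembl server):
# getSequence(chromosome = exons$chromosome[1], start = exons$start[1],
#             end = exons$end[1], type = "ensembl_gene_id",
#             seqType = "cdna", mart = human.mart)
```

As far as I understand, a region query with seqType = "cdna" returns cDNA for the genes overlapping the region rather than only the bases inside the exon, so the result may still need trimming before motif analysis.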
