Use biomaRt getSequence() to obtain cDNA sequence based on exon coordinates obtained from DEXSeq
9 weeks ago
osiemen ▴ 30

Hi,

I need to obtain the DNA sequence for a set of exon coordinates I obtained from my DEXSeq analysis so that I can perform a motif analysis using e.g. (X)STREME or the MEME Suite. I know that biomaRt gives us the option to retrieve cDNA from Ensembl using chromosomal coordinates combined with getSequence(). Unfortunately, I have not been able to do this. Whenever I run getSequence(), I receive the following timeout error (on port 443):

Error in curl::curl_fetch_memory(url, handle = handle) :
Timeout was reached: [dec2021.archive.ensembl.org:443] Operation timed out after 300000 milliseconds with 0 bytes received


I use the following code for this:

human.mart <- useMart(host="https://dec2021.archive.ensembl.org", "ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl")


# Example coordinates:

getSequence(chromosome = 12, start = 54369133, end = 54391298, mart=human.mart,seqType = "cdna",type = "ensembl_gene_id")

groupID_strip            groupID featureID genomicData.seqnames genomicData.start genomicData.end genomicData.width genomicData.strand HGNC.symbol
1 ENSG00000001497 ENSG00000001497.16      E002                 chrX          65512583        65512901               319                  -       LAS1L
2 ENSG00000004975 ENSG00000004975.11      E001                chr17           7225341         7225454               114                  -        DVL2
Gene.stable.ID.1
1 ENSMUSG00000057421
2 ENSMUSG00000020888

getSequence(chromosome = human_df$genomicData.seqnames[1], start = human_df$genomicData.start[1],
end = human_df$genomicData.end[1], type="ensembl_gene_id", seqType="cdna", upstream=20, mart=human.mart)

For some reason I cannot retrieve the cDNA using the code above. However, if I use the Ensembl ID, I can obtain the cDNA, but that's not what I want:

# Works
getSequence(id = human_df$groupID_strip[1],
type="ensembl_gene_id",
seqType="cdna",
upstream=20,
mart=human.mart)


Does anybody know how to solve this issue, or are there maybe other ways to retrieve the cDNA using exon coordinates obtained from DEXSeq?
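Separately from the timeout, one thing that may be worth checking: the DEXSeq table above uses UCSC-style sequence names ("chrX", "chr17"), whereas Ensembl names its chromosomes without the "chr" prefix ("X", "17"). The sketch below only shows how the names could be converted before a coordinate-based query. The helper to_ensembl_chr() and the column ensembl_chr are hypothetical names introduced for illustration, and whether this conversion has any bearing on the error reported above is an assumption that has not been tested.

```r
# Sample rows copied from the DEXSeq table above (only the columns needed here).
human_df <- data.frame(
  groupID_strip        = c("ENSG00000001497", "ENSG00000004975"),
  genomicData.seqnames = c("chrX", "chr17"),
  genomicData.start    = c(65512583, 7225341),
  genomicData.end      = c(65512901, 7225454),
  stringsAsFactors     = FALSE
)

# Hypothetical helper: drop a leading "chr" so that "chrX" becomes "X".
# Names without the prefix are returned unchanged.
# Caveat: "chrM" would become "M", while Ensembl uses "MT"; not handled here.
to_ensembl_chr <- function(seqnames) {
  sub("^chr", "", as.character(seqnames))
}

human_df$ensembl_chr <- to_ensembl_chr(human_df$genomicData.seqnames)
print(human_df[, c("genomicData.seqnames", "ensembl_chr")])

# The converted name could then be passed to the coordinate-based call
# from the post (left commented out because it needs a live mart connection):
# getSequence(chromosome = human_df$ensembl_chr[1],
#             start = human_df$genomicData.start[1],
#             end   = human_df$genomicData.end[1],
#             type = "ensembl_gene_id", seqType = "cdna", mart = human.mart)
```

The first example in the post already passes chromosome = 12 without a prefix and still times out, so this conversion alone is unlikely to explain the timeout.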

Biomart Exon RNA-Seq DEXSeq • 349 views

Unfortunately, the other Ensembl servers don't work either.


For some reason I cannot retrieve the cDNA using the code above. However, if I use the Ensembl ID, I can obtain the cDNA, but that's not what I want

This is confusing. Is your post about the curl timeout error, or is it about getting the sequences that you want?

If your #Works code block works in that it retrieves sequences, then the real issue isn't the timeout error. If so, what exactly is the issue with what is returned from your working getSequence() example when using the Ensembl ID?