I need to obtain the DNA sequences for a set of exon coordinates from my DEXSeq analysis so that I can perform a motif analysis with, e.g., STREME or XSTREME from the MEME Suite.
I know that biomaRt can retrieve cDNA from Ensembl when chromosomal coordinates are passed to
getSequence(). Unfortunately, I have not been able to get this to work. Whenever I run getSequence() with coordinates, the request to the Ensembl archive (port 443) times out with the following error:
Error in curl::curl_fetch_memory(url, handle = handle) : Timeout was reached: [dec2021.archive.ensembl.org:443] Operation timed out after 300000 milliseconds with 0 bytes received
I use the following code for this:
human.mart <- useMart(host="https://dec2021.archive.ensembl.org", "ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl")
getSequence(chromosome = 12, start = 54369133, end = 54391298, mart=human.mart,seqType = "cdna",type = "ensembl_gene_id")

head(human_df,2)
    groupID_strip            groupID featureID genomicData.seqnames genomicData.start genomicData.end genomicData.width genomicData.strand HGNC.symbol
1 ENSG00000001497 ENSG00000001497.16      E002                 chrX          65512583        65512901               319                  -       LAS1L
2 ENSG00000004975 ENSG00000004975.11      E001                chr17           7225341         7225454               114                  -        DVL2
    Gene.stable.ID.1
1 ENSMUSG00000057421
2 ENSMUSG00000020888

getSequence(chromosome = human_df$genomicData.seqnames, start = human_df$genomicData.start, end = human_df$genomicData.end, type="ensembl_gene_id", seqType="cdna", upstream=20, mart=human.mart)
For some reason I cannot retrieve the cDNA using the code above. If I use the Ensembl gene IDs instead, I do get cDNA back, but that is not what I want:
# Works
getSequence(id = human_df$groupID_strip, type="ensembl_gene_id", seqType="cdna", upstream=20, mart=human.mart)
Does anybody know how to solve this issue, or are there other ways to retrieve the sequences using exon coordinates obtained from DEXSeq?
Thanks in advance!