How to get the 3' UTR given gene ID and transcript ID in Ensembl using Python?
10.1 years ago
Sam ▴ 70

Hello Everyone,

I have to do data collection for my current project, where I need to collect all the 3' UTR sequences for a given Ensembl gene ID and Ensembl transcript ID. I am new to this field and was wondering whether there is an easier way to do this than getting each 3' UTR manually. I have a list of gene IDs and transcript IDs.

I did explore Ensembl BioMart, but I could not figure out exactly how to input both the gene ID and the transcript ID.

Also, would there be a way to incorporate this into Python? I know SQL at a beginner-to-intermediate level.

Thank you very much in advance.

ensembl utr biomart api • 7.6k views
5.2 years ago

A possibly more convenient solution: https://github.com/hammerlab/pyensembl

10.1 years ago
Darked89 4.2k

Biomart: http://www.ensembl.org/biomart/martview

Database: Ensembl Genes 61

Dataset: Homo sapiens genes

Filters: tick the check box next to "ID list limit". There is a text area where you can paste Ensembl gene IDs, e.g. ENSG00000139618

Attributes: Sequences

Then below select: 3' UTR

Click count (top left of the page) just for checking, then go for Results.

10.1 years ago

You do not need to input both the gene ID and the transcript ID: the transcript ID alone is sufficient because it is unique. A gene can have multiple transcripts, but a transcript should only be assigned to one gene, so once you have the transcript ID the gene ID is redundant.

BioMart has some built-in tools to automate queries. Here is an exported URL to a BioMart query like the one Darked89 described, but with transcript IDs as the filter instead of Ensembl gene IDs. If you follow the link, you have a query you can start with. You can also get this query as XML or directly as a Perl script. If you really must use Python, there is more documentation on how to use the REST or SOAP interface.
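For Python, a query of this kind can be sent as XML to the BioMart `martservice` endpoint. What follows is a minimal sketch, not the exact query behind the exported URL: it assumes the `hsapiens_gene_ensembl` dataset, the `ensembl_transcript_id` filter and the `3utr` sequence attribute, and the helper names `build_utr_query` and `fetch_utr_fasta` are made up for illustration. It uses POST rather than a long GET URL so that a large ID list is less likely to run into URL length limits.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
from xml.sax.saxutils import quoteattr

# Assumed endpoint of the public Ensembl BioMart service.
MARTSERVICE_URL = "http://www.ensembl.org/biomart/martservice"


def build_utr_query(transcript_ids, dataset="hsapiens_gene_ensembl"):
    """Return BioMart query XML asking for 3' UTR sequences as FASTA."""
    # BioMart expects all IDs of one filter as a single comma-separated value.
    id_list = ",".join(transcript_ids)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="FASTA" header="0" '
        'uniqueRows="0" datasetConfigVersion="0.6">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        f'<Filter name="ensembl_transcript_id" value={quoteattr(id_list)}/>'
        '<Attribute name="ensembl_gene_id"/>'
        '<Attribute name="ensembl_transcript_id"/>'
        '<Attribute name="3utr"/>'
        "</Dataset>"
        "</Query>"
    )


def fetch_utr_fasta(transcript_ids):
    """POST the query to BioMart and return the response text (needs network)."""
    payload = urlencode({"query": build_utr_query(transcript_ids)}).encode("utf-8")
    with urlopen(MARTSERVICE_URL, data=payload) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_utr_fasta(my_ids)` with your own list of transcript IDs should return the same kind of multi-FASTA text that the web interface produces, provided the dataset and attribute names above match the current BioMart release.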

10.1 years ago
Sam ▴ 70

Darked89 and Michael Dondrup: thank you for answering my question.

Michael: I tried it the way you suggested. However, I only get back one result after inputting all my transcript IDs. Example link with multiple IDs, a bit shorter.

Here is the file with all the transcript IDs: http://www.filedropper.com/transid

Thank you for your time.


Argh! I realise your reply was too large to paste in as a comment, but please do not add an answer if you're trying to address comments. If you want to paste a large amount of text into a comment on someone else's reply, please use something like http://pastebin.com/ and point a link to the output you would like us to look at!


No, your link works fine, the result is a multi-fasta file, scroll down ;)
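Once the multi-FASTA result is saved, it can be split into one record per header in Python. This is a minimal sketch using only the standard library; the function name `parse_fasta` and the placeholder headers in the usage note are illustrative assumptions, since the actual header fields depend on which attributes were selected in BioMart.

```python
def parse_fasta(text):
    """Map each FASTA header (without the leading '>') to its full sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = []  # a repeated header overwrites the earlier record
        elif header is not None:
            records[header].append(line)  # sequences may wrap over several lines
    return {name: "".join(parts) for name, parts in records.items()}
```

For example, `parse_fasta(">GENE_A|TRANSCRIPT_1\nACGT\nTTGA\n>GENE_A|TRANSCRIPT_2\nGGCC\n")` gives a dictionary with two entries, the first holding the joined sequence `ACGTTTGA`.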


He tried to do it right, but the very long URL broke the formatting. I replaced it with a shorter URL.


Daniel Swan: I'm sorry about posting this as a new answer. I am new to this website and did not want to paste the huge link into the comment section, so I thought I would post it separately. I will keep this in mind in future. Thank you.


Michael: Thank you very much. I realized after looking at your shortened link that I was not putting commas after every transcript ID, so it was not querying all the IDs. You put in 5 IDs separated by commas, so it queried and returned all of them. I put all my IDs in that format and got the answer! Thank you very much for all your time and effort. :)
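The comma-separated format is easy to produce from a plain list in Python. Here is a small sketch, assuming the IDs arrive as text separated by newlines, spaces or stray commas; `to_comma_separated` is a made-up helper name.

```python
import re


def to_comma_separated(raw_ids):
    """Turn IDs separated by whitespace, newlines or commas into 'id1,id2,...'."""
    ids = [token for token in re.split(r"[\s,]+", raw_ids) if token]
    # dict.fromkeys drops duplicate IDs while keeping their original order.
    return ",".join(dict.fromkeys(ids))
```

Reading the ID file with `open(path).read()` and passing the text to this function yields a single string that can be pasted into the BioMart filter or the query URL.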
