Retrieving ensembl protein length with biomart
6 weeks ago
iatz ▴ 30

I'd like to use BioMart to retrieve the protein length associated with a transcript, as shown in the screenshot below.

[screenshot]

I have retrieved the list of attributes available from BioMart but can't seem to find the right field. Can someone confirm whether this information is accessible programmatically, and if so, how?

Thanks,

Ensembl biomart • 380 views
6 weeks ago
Ben_Ensembl ★ 2.0k

Hi iatz,

You can't retrieve protein length directly from BioMart. However, you can retrieve the CDS length and divide by 3.
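The division can be sketched in Python as below. This is only an illustrative helper, not part of BioMart: the function name `protein_length_from_cds` and the `includes_stop` flag are hypothetical. It assumes the CDS length is a multiple of 3, and that for a complete CDS the reported length counts the stop codon, in which case the quotient is one more than the number of residues. It is worth checking that assumption against a transcript whose protein length you already know.

```python
def protein_length_from_cds(cds_length: int, includes_stop: bool = True) -> int:
    """Estimate protein length (in residues) from a CDS length (in nucleotides).

    includes_stop is an assumption about the input: when True, one codon is
    subtracted because the stop codon does not encode an amino acid.
    """
    if cds_length <= 0:
        raise ValueError("cds_length must be positive")
    if cds_length % 3 != 0:
        # Incomplete CDSs may not be a whole number of codons.
        raise ValueError("cds_length is not a multiple of 3")
    codons = cds_length // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # 1182 nt -> 394 codons -> 393 residues once the stop codon is excluded
    print(protein_length_from_cds(1182))
```

If you only need a relative measure of size, the CDS length itself may be enough and no conversion is needed.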


That's perfect. I can just as well use the CDS length directly for my purposes, thanks!

5 weeks ago

gget seq will return the nucleotide (option "gene") or amino acid (option "transcript") sequence and sequence length for any Ensembl transcript ID:


Alternatively, you can get the nucleotide sequence length from gget info by calculating the difference between the sequence "start" and "end" coordinates. (Note: Both gget info and gget seq return information regarding the genomic nucleotide sequence, including all exons and introns.)

All gget tools work from the command line and any Python environment, e.g. JupyterLab.
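The start/end calculation mentioned above can be sketched as follows. This assumes the coordinates are 1-based and inclusive, as Ensembl coordinates are, so the span is `end - start + 1` rather than the plain difference. The function name and the example coordinates are made up for illustration. Note that the result is a genomic span with introns included, so it is not a CDS or protein length.

```python
def genomic_span_length(start: int, end: int) -> int:
    """Length in nucleotides of a feature given 1-based inclusive coordinates."""
    if start < 1:
        raise ValueError("start must be >= 1 for 1-based coordinates")
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Both endpoints are part of the feature, hence the +1.
    return end - start + 1


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from any real gene
    print(genomic_span_length(1000, 1999))  # 1000
```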
