How to filter Ensembl to fetch higher-confidence transcripts
4.5 years ago

We recently noted that the current Ensembl data is full of very short isoforms that likely do not represent functional transcripts BUT lead to overestimating the diversity of the transcriptome (and may include intron-retention isoforms too).

I am looking for a programmatic way to reproduce a subset of Ensembl closer to what is provided in ENCODE, but one that would let me control the degree of evidence I wish to keep.

I found info in Ensembl about TSL (Transcript Support Level) but I do not find TSL exposed in BioMart, nor examples of [R] biomaRt commands applying this annotation to filtering (# although head(organismAttributes("Homo sapiens"), 20) returns transcript_tsl at position #15)

Does anybody have biomaRt code examples to fetch only human transcripts with the annotation 'GENCODE basic', as exemplified in the screenshot of the ABLIM1 gene?

Getting the full GENCODE build as a download is not what I need here; I want to create my own subsets.

Stephane

ensembl biomart transcripts
4.5 years ago

TSL and GENCODE Basic are both filters available in BioMart. They're listed under GENE.

Also, please do not cross-post to BioStars and to Ensembl helpdesk, as we monitor both of these channels. I will close your ticket on Ensembl helpdesk.


Thanks for this Emily,

Can you please be more explicit? I see neither GENCODE nor TSL in the drop-down under Gene in BioMart.

Also, what about my specific request to do this programmatically using biomaRt in R?

Best Stephane


Did you try scrolling down the page?


SHAME on ME :-) it is clearly Friday.

One last thing: what about doing this in R?

Pass either of these to getBM():
filters = 'transcript_tsl', values = TRUE
filters = 'transcript_gencode_basic', values = TRUE
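
For anyone wanting the full call, here is a minimal sketch of how those filters might be used from biomaRt. It is not tested against a specific Ensembl release. It assumes that the transcript_tsl attribute comes back as strings such as "tsl1" or "tsl2 (assigned to previous version 3)", with "tslNA" when no level is assigned. The helper names (parse_tsl, keep_supported, fetch_gencode_basic) are made up for illustration and are not part of biomaRt. The idea is to restrict the query to 'GENCODE basic' transcripts on the server side, then apply your own TSL cut-off locally so you control how much evidence to require.

```r
# Sketch only: assumes the biomaRt package is installed and Ensembl BioMart is reachable.

# Extract the numeric support level from a transcript_tsl string.
# Assumed formats: "tsl1", "tsl2 (assigned to previous version 3)", "tslNA", "".
parse_tsl <- function(tsl) {
  level <- rep(NA_integer_, length(tsl))
  hit <- grepl("^tsl[1-5]", tsl)
  level[hit] <- as.integer(substr(tsl[hit], 4, 4))
  level
}

# Keep only rows whose TSL is at or better than max_tsl (1 = best supported).
# Rows without an assigned level are dropped.
keep_supported <- function(transcripts, max_tsl = 1) {
  level <- parse_tsl(transcripts$transcript_tsl)
  transcripts[!is.na(level) & level <= max_tsl, , drop = FALSE]
}

# Query Ensembl for the GENCODE basic transcripts of one gene (hypothetical helper).
fetch_gencode_basic <- function(symbol = "ABLIM1") {
  mart <- biomaRt::useEnsembl(biomart = "ensembl",
                              dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "external_gene_name",
                   "ensembl_transcript_id", "transcript_biotype",
                   "transcript_tsl"),
    # With several filters, values must be a list in the same order.
    filters = c("external_gene_name", "transcript_gencode_basic"),
    values = list(symbol, TRUE),
    mart = mart
  )
}

# Example (needs network access):
# basic <- fetch_gencode_basic("ABLIM1")
# confident <- keep_supported(basic, max_tsl = 2)
```

Dropping the external_gene_name filter (and its entry in values) should return the GENCODE basic set for all genes, although that query is much larger. biomaRt::listFilters(mart) shows the filter names available for the release you are connected to.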