Question: How to download all transcripts from Mouse with exons as upper case and introns as lower case?
I was using UCSC's Table Browser before, but you have to input the transcript IDs, and most of the time the databases don't match up completely with Ensembl, so many transcripts are left out. Is there a way, in Ensembl or elsewhere, to download all of the sequences with exons in upper case and introns in lower case? I have been trying to do it from the GFF3 file, but these files are really big.

No, you don't have to input transcript IDs. In the Table Browser, just select the "Gencode VM11 (Ensembl 86)" track --> region "genome" --> output format "sequence", which for mm10 generates this summary:

item count  58,611
item bases  1,089,603,911 (41.07%)
item total  2,300,254,359 (86.71%)
smallest item   9
average item    39,246
biggest item    4,434,882
block count 452,643
block bases 84,734,202 (3.19%)
block total 132,515,435 (5.00%)
smallest block  1
average block   293
biggest block   122,802

On the next screen you can then select genomic sequence --> get exons in upper case and introns in lower case.

This will get you Ensembl IDs, if that is what you are looking for.
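
If you would rather apply the same casing yourself, for example from exon coordinates pulled out of the GFF3, the idea is simple enough to sketch in Python. This is only an illustration, not what the Table Browser does internally: the function name mask_introns is made up here, and it assumes you already have the genomic sequence of one transcript plus its exon intervals as 0-based, half-open offsets relative to that sequence.

```python
def mask_introns(seq, exons):
    """Return seq with exon bases in upper case and all other bases in lower case.

    seq   -- genomic sequence spanning one transcript (hypothetical input)
    exons -- iterable of (start, end) offsets into seq, 0-based, half-open
             (assumed convention; GFF3 itself is 1-based and inclusive)
    """
    # Start with everything in lower case, i.e. treat every base as intronic.
    out = list(seq.lower())
    for start, end in exons:
        if not 0 <= start <= end <= len(seq):
            raise ValueError(f"exon ({start}, {end}) falls outside the sequence")
        # Overwrite the exon span with its upper-case version.
        out[start:end] = seq[start:end].upper()
    return "".join(out)


if __name__ == "__main__":
    # Toy 12-base "transcript" with two exons separated by a 5-base intron.
    print(mask_introns("acgtacgtacgt", [(0, 3), (8, 12)]))  # ACGtacgtACGT
```

To use GFF3 coordinates with this, you would subtract the transcript start from each exon start (and keep the inclusive end as the half-open end), and reverse-complement minus-strand transcripts afterwards.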

modified 18 months ago • written 18 months ago by genomax (63k)
