3.8 years ago
Kai_Qi
I have a file containing the coordinates of alternatively spliced exons/introns. Now I want to find out where these introns/exons are located. To be specific: how many fall in the 5' UTR, the CDS, or the 3' UTR?
Any advice or recommended manual is appreciated.
Exons include UTRs and CDS. If you're looking for co-ordinates per feature, you may want to download the GTF file from GENCODE and look for those features:
See an example entry:
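As a rough illustration of pulling features out of a GTF, here is a minimal Python sketch. The three lines in `EXAMPLE_GTF` are made up for illustration (hypothetical IDs and coordinates, not a real GENCODE record), and `features_by_type` is a hypothetical helper name. It assumes the standard 9-column, tab-separated GTF layout, with the feature type in column 3 and 1-based inclusive start/end in columns 4 and 5. It also assumes UTRs are labelled simply `UTR`; check the feature names in your own file, since 5' versus 3' may have to be inferred from the position relative to the CDS and the strand.

```python
from collections import defaultdict

# Hypothetical GTF-style lines (tab-separated, 9 columns); IDs and coordinates are made up.
EXAMPLE_GTF = "\n".join([
    'chr1\tHAVANA\texon\t1000\t1500\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tHAVANA\tUTR\t1000\t1099\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tHAVANA\tCDS\t1100\t1500\t.\t+\t0\tgene_id "G1"; transcript_id "T1";',
])


def features_by_type(gtf_text, wanted=("UTR", "CDS")):
    """Group (chrom, start, end, strand) tuples by the feature type in column 3."""
    grouped = defaultdict(list)
    for line in gtf_text.splitlines():
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed lines and feature types we did not ask for.
        if len(cols) < 9 or cols[2] not in wanted:
            continue
        grouped[cols[2]].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return dict(grouped)


features = features_by_type(EXAMPLE_GTF)
print(features)
```

For a real file you would read the text from disk instead of using the inline string; the grouping logic stays the same.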
Thank you for your reply. For example, I have a list of introns that my analysis shows are retained in the transcript. Now I want to get the distribution of these introns among the 5' UTR/CDS/3' UTR.
From your reply, my understanding is that I should merge my files with the GTF file to find out where these coordinates are located. Is that right?
I would not call it "merge", it's more of a lookup operation, but yes, you will need to use the GTF dataset to get the information you need. The exact steps you will need to take will depend on your data and you are in the best position to figure out the specifics.
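To make the "lookup" idea concrete, here is a minimal Python sketch that labels each query interval with the annotation regions it overlaps and then tallies the labels. Everything in it is an assumption for illustration: the `REGIONS` and `QUERIES` values are made up, the helper names are hypothetical, and coordinates are taken to be 0-based half-open as in BED. A query that spans two regions is counted once for each.

```python
from collections import Counter

# Hypothetical annotation regions: (chrom, start, end, label), 0-based half-open.
REGIONS = [
    ("chr1", 100, 200, "5UTR"),
    ("chr1", 200, 800, "CDS"),
    ("chr1", 800, 950, "3UTR"),
]

# Hypothetical query intervals (e.g. retained introns), same coordinate convention.
QUERIES = [
    ("chr1", 150, 180),
    ("chr1", 300, 400),
    ("chr1", 780, 820),
    ("chr2", 10, 20),
]


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def classify(query, regions):
    """Return the set of region labels that a query interval overlaps."""
    chrom, start, end = query
    labels = {
        label
        for r_chrom, r_start, r_end, label in regions
        if r_chrom == chrom and overlaps(start, end, r_start, r_end)
    }
    return labels or {"unannotated"}


# Tally how many queries touch each region type.
counts = Counter(label for q in QUERIES for label in sorted(classify(q, REGIONS)))
print(counts)
```

This linear scan is only meant to show the logic; for genome-scale data an interval tool or index would be the more practical choice.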
I have generated BED files for the 5' UTR/3' UTR/CDS. It looks like they are all exon regions. Under these circumstances, although I have a BED file for my introns, how can I intersect it with them using bedtools intersect?
You could go with the approach that whatever's not an exon is an intron, and use some sort of complement option.
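The "whatever's not an exon is an intron" idea can be sketched per transcript in Python: sort and merge the exons, then take the gaps between consecutive ones. This is only an illustration under assumptions: the exon coordinates are made up, `introns_from_exons` is a hypothetical helper name, intervals are 0-based half-open, and all exons are taken to belong to a single transcript on one chromosome.

```python
def introns_from_exons(exons):
    """Return the gaps between sorted, merged exons of one transcript."""
    merged = []
    for start, end in sorted(exons):
        # Merge exons that overlap or touch the previous one.
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    # Each intron runs from the end of one exon to the start of the next.
    return [
        (left_end, right_start)
        for (_, left_end), (right_start, _) in zip(merged, merged[1:])
    ]


# Hypothetical exon coordinates, deliberately unsorted and partly overlapping.
EXONS = [(100, 200), (500, 650), (300, 400), (380, 420)]
introns = introns_from_exons(EXONS)
print(introns)
```

Restricting the complement to gaps within one transcript avoids labelling intergenic sequence as intronic, which a genome-wide complement of the exon set would do.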
Thanks! I think I can add 1 to my coordinates and then do the intersect.
See: Defining Precisely The Genomic Context Based On A Position