Hi all. I'm having trouble producing a file that contains only the coding exons, with the UTRs removed. I've obtained a file from the UCSC Table Browser that looks something like this:
#name cdsStart cdsEnd exonCount exonStarts exonEnds
NM_017436 43088895 43089957 3 43088126,43091496,43116802, 43090003,43091637,43116876,
NM_001173466 53701272 53715249 15 53701239,53701628,53701835,53702065,53702218,53702508,53702743,53702940,53703384,53708081,53708877,53709118,53709510,53714348,53715126, 53701497,53701713,53701917,53702133,53702312,53702599,53702804,53703065,53703505,53708225,53708924,53709210,53709566,53714476,53715412,
I have the cdsStart and cdsEnd, but what I want to do is incorporate them into exonStarts and exonEnds, so that each exon is trimmed to its coding portion and I can use this file for further analysis. For example, this is what I would want my output to look like:
#name cdsStart cdsEnd exonCount exonStarts exonEnds
NM_017436 43088895 43089957 3 43088895, 43089957,
For this example, the cdsStart and cdsEnd both fall within the first exon, so I only want that exon, trimmed to the coding region, to appear in my file. Is there an easy way to carry this out from the Table Browser, or do I need to modify the file myself? If so, any suggestions on how to do that?
Thank you!
This is the best solution if you don't need to bulk process data using MySQL.
Does this give you every individual exon for the region? Are they only coding exons?
There's an option to select "coding exons" instead of "exons plus X bases at each end".
Perfect. Thank you!
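For anyone who already has the downloaded file and would rather modify it directly, here is a minimal Python sketch of the clipping described in the question. The function names (clip_exons_to_cds, parse_coords, clip_line) are made up for illustration and are not part of any UCSC tool. It assumes whitespace-separated columns in the order shown in the question, and it recomputes exonCount to count only the exons that overlap the CDS (the example output in the question keeps the original 3, so adjust if you need that).

```python
def parse_coords(field):
    """Turn a UCSC-style list such as '1,2,3,' into a list of ints."""
    return [int(x) for x in field.split(",") if x]


def clip_exons_to_cds(cds_start, cds_end, exon_starts, exon_ends):
    """Return (start, end) pairs for the coding part of each exon.

    Exons that lie entirely in a UTR are dropped; exons that straddle
    cdsStart or cdsEnd are trimmed to the coding region.
    """
    coding = []
    for start, end in zip(exon_starts, exon_ends):
        s = max(start, cds_start)
        e = min(end, cds_end)
        if s < e:  # keep only exons with some coding sequence left
            coding.append((s, e))
    return coding


def clip_line(line):
    """Rewrite one data line so exonStarts/exonEnds cover coding sequence only.

    Assumption: columns are name, cdsStart, cdsEnd, exonCount,
    exonStarts, exonEnds, separated by whitespace.
    """
    name, cds_start, cds_end, _count, starts, ends = line.split()
    coding = clip_exons_to_cds(
        int(cds_start), int(cds_end), parse_coords(starts), parse_coords(ends)
    )
    new_starts = "".join(f"{s}," for s, _ in coding)
    new_ends = "".join(f"{e}," for _, e in coding)
    return "\t".join(
        [name, cds_start, cds_end, str(len(coding)), new_starts, new_ends]
    )
```

Calling clip_line on the NM_017436 line from the question gives a tab-separated line with the fields NM_017436, 43088895, 43089957, 1, "43088895," and "43089957,". To process a whole file, apply clip_line to every line that does not start with "#". Transcripts with cdsStart equal to cdsEnd come out with zero exons, so you may want to filter those out.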