Hi all. I'm having trouble producing a file that contains only the coding parts of exons, with the UTRs removed. I've obtained a file from the UCSC Table Browser that looks something like this:
#name cdsStart cdsEnd exonCount exonStarts exonEnds
NM_017436 43088895 43089957 3 43088126,43091496,43116802, 43090003,43091637,43116876,
NM_001173466 53701272 53715249 15 53701239,53701628,53701835,53702065,53702218,53702508,53702743,53702940,53703384,53708081,53708877,53709118,53709510,53714348,53715126, 53701497,53701713,53701917,53702133,53702312,53702599,53702804,53703065,53703505,53708225,53708924,53709210,53709566,53714476,53715412,
I have the cdsStart and cdsEnd, but what I want to do is incorporate those starts and ends into exonStarts and exonEnds so that I can use this file for further analysis. For example, this is what I would want my output to look like:
#name cdsStart cdsEnd exonCount exonStarts exonEnds
NM_017436 43088895 43089957 3 *43088895*, *43089957*,
For this example, cdsStart and cdsEnd both fall within the first exon, so I only want that exon, trimmed to the CDS boundaries, to appear in my file. Is there an easy way to carry this out from the Table Browser, or do I need to modify the file myself? If so, any suggestions on how to do that?
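In case it helps make the goal concrete, here is a rough Python sketch of the clipping I have in mind. It is only an illustration under a few assumptions: the fields are whitespace-separated in the order shown above, coordinates are treated as half-open intervals, exonCount is left unchanged (as in my desired output), and the function names `clip_exons_to_cds`, `parse_coords` and `clip_line` are hypothetical names I made up, not part of any UCSC tool.

```python
def clip_exons_to_cds(cds_start, cds_end, exon_starts, exon_ends):
    """Return (starts, ends) of the exon pieces that lie inside the CDS."""
    starts, ends = [], []
    for start, end in zip(exon_starts, exon_ends):
        # Keep only exons that overlap the interval [cds_start, cds_end).
        if start < cds_end and end > cds_start:
            # Trim the exon so it does not extend into the UTRs.
            starts.append(max(start, cds_start))
            ends.append(min(end, cds_end))
    return starts, ends


def parse_coords(field):
    """Turn '1,2,3,' (with its trailing comma) into [1, 2, 3]."""
    return [int(x) for x in field.split(",") if x]


def clip_line(line):
    """Rewrite one data line so exonStarts/exonEnds cover coding sequence only."""
    name, cds_start, cds_end, exon_count, exon_starts, exon_ends = line.split()
    starts, ends = clip_exons_to_cds(
        int(cds_start), int(cds_end),
        parse_coords(exon_starts), parse_coords(exon_ends),
    )
    # Re-create the comma-terminated lists used in the input file.
    starts_field = "".join(f"{x}," for x in starts)
    ends_field = "".join(f"{x}," for x in ends)
    # exonCount is copied through unchanged (an assumption, see above).
    return "\t".join([name, cds_start, cds_end, exon_count, starts_field, ends_field])


example = ("NM_017436 43088895 43089957 3 "
           "43088126,43091496,43116802, 43090003,43091637,43116876,")
print(clip_line(example))
```

For the NM_017436 line this keeps only the first exon, trimmed to 43088895 and 43089957, which matches the output I described. Header lines starting with `#` would need to be skipped before calling `clip_line`, and a transcript with no exon overlapping the CDS would end up with empty exonStarts and exonEnds fields.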