I want to remove the regions where exons overlap UTRs (5UTR and 3UTR), so that only the non-UTR part of each exon is kept.
A trial version of the input dataset:
Chr Strand  Exon_ID Exon_start  Exon_end    5UTR_start  5UTR_end    3UTR_start  3UTR_end
1   1   AT1G01010.1.exon1   3631    3913    3631    3759    0   0
1   1   AT1G01010.1.exon2   3996    4276    0   0   0   0
1   1   AT1G01010.1.exon6   5439    5899    0   0   5631    5899
1   -1  AT1G01020.1.exon1   8571    9130    8667    9130    0   0
1   -1  AT1G01060.7.exon8   33662   34327   0   0   33662   33991
1   -1  AT1G01030.1.exon2   11649   12940   12941   13173   11649   11863
For instance, for exon AT1G01010.1.exon1, the exon start and the 5UTR start are at the same position (3631). The 5UTR ends at 3759 while the exon ends at 3913, so the two overlap from 3631 to 3759 (3759 - 3631 = 128). To keep only the exonic region, I want to change the exon start to 3760 and leave the exon end unchanged. For exon AT1G01020.1.exon1, the exon starts at 8571 and the 5UTR starts at 8667, and both end at the same position (9130), so they overlap from 8667 to 9130 (9130 - 8667 = 463). Here I want to change the exon end to 8666 and leave the exon start unchanged.
Final output should be like this:
Chr Strand  Exon_ID Exon_start  Exon_end    5UTR_start  5UTR_end    3UTR_start  3UTR_end
1   1   AT1G01010.1.exon1   3760    3913    3631    3759    0   0
1   1   AT1G01010.1.exon2   3996    4276    0   0   0   0
1   1   AT1G01010.1.exon6   5439    5630    0   0   5631    5899
1   -1  AT1G01020.1.exon1   8571    8666    8667    9130    0   0
1   -1  AT1G01060.7.exon8   33992   34327   0   0   33662   33991
1   -1  AT1G01030.1.exon2   11864   12940   12941   13173   11649   11863
I have tried a few awk commands but wasn't able to get the desired output. Any help will be highly appreciated.
UTRs are exonic. If you remove UTRs from the exonic regions, the regions you are left with are not exons. If you are doing this for a standard differential expression (DE) analysis of RNA-seq data, removing the UTRs would be very bad practice: UTRs make up on average 30% of a transcript, and often more than 50%, so by removing them you are discarding 30-50% of your data.
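That said, if the trimmed coordinates are still needed for some other purpose, the rule described in the question can be sketched in Python roughly as follows. This is only a sketch under a few assumptions that are not stated in the post: coordinates are 1-based and inclusive, a UTR given as `0 0` means "no UTR", columns are whitespace-separated in the order shown in the header, and a UTR only ever overlaps one end of an exon. The function names `trim_exon` and `trim_table` are made up for this example.

```python
def trim_exon(exon_start, exon_end, utrs):
    """Return (start, end) of the exon after cutting away overlapping UTRs.

    Assumes 1-based inclusive coordinates; a UTR of (0, 0) means "absent".
    """
    start, end = exon_start, exon_end
    for utr_start, utr_end in utrs:
        if utr_start == 0 and utr_end == 0:
            continue  # no UTR annotated for this exon
        if utr_end < start or utr_start > end:
            continue  # UTR lies completely outside the exon
        if utr_start <= start:
            # UTR covers the left end of the exon: move the start past it.
            # If the UTR covers the whole exon, this gives start > end
            # (an empty interval), which the caller may want to drop.
            start = utr_end + 1
        elif utr_end >= end:
            # UTR covers the right end of the exon: move the end before it.
            end = utr_start - 1
        # A UTR strictly inside the exon is left alone in this sketch.
    return start, end


def trim_table(lines):
    """Apply trim_exon to every data row; header and blank lines pass through."""
    out = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "Chr":
            out.append(line.rstrip("\n"))
            continue
        exon_start, exon_end, u5s, u5e, u3s, u3e = map(int, fields[3:9])
        start, end = trim_exon(exon_start, exon_end, [(u5s, u5e), (u3s, u3e)])
        fields[3], fields[4] = str(start), str(end)
        out.append("\t".join(fields))
    return out


# Three of the rows from the question, used as a small demo.
rows = [
    "Chr Strand Exon_ID Exon_start Exon_end 5UTR_start 5UTR_end 3UTR_start 3UTR_end",
    "1 1 AT1G01010.1.exon1 3631 3913 3631 3759 0 0",
    "1 -1 AT1G01020.1.exon1 8571 9130 8667 9130 0 0",
    "1 -1 AT1G01030.1.exon2 11649 12940 12941 13173 11649 11863",
]
for row in trim_table(rows):
    print(row)
```

To run it on a real file, the lines could be read with `open(path)` and passed to `trim_table`. Note that the strand column is not used: the trimming depends only on which end of the exon the UTR touches, which matches the expected output in the question.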