Hello everyone!
I have to do something and I'm a bit lost. I have two tab-delimited text files that contain exon coordinates.
The first file contains the start coordinates (for example):
NM_032291   chr1    +   66999638    67091529    67098752    67101626
NM_001308203    chr1    +   66999251    66999928    67091529    67098752    67105459    67108492
and the second file contains the end coordinates:
NM_032291   chr1    +   67000051    67091593    67098777    67101698
NM_001308203    chr1    +   66999355    67000051    67091593    67098777    67105516    67108547
I'd like to get multiple BED files, one per exon (for example, for the first exon):
 chr1    66999638     67000051     NM_032291   length    +
 chr1    66999251     66999355     NM_001308203    length    +
Each gene contains a different number of exons, so the number of columns is unknown. I believe there is a very simple way to do this; I've tried awk, but without success.
Thanks!
Thanks! About your question: that's my data. But I have one more problem: I only gave an example, and the number of exons is different for each gene.
You mean you have the same IDs multiple times in the first column of the same file?
I edited my post, so you can see it now. And no, I have one ID per line, followed by all the exon coordinates.
Then the solution I provided should work fine. Did you get any error?
Maybe I didn't understand your answer. What does $10 mean? Isn't it simply column 10?
Yes, but after combining the two files into one. I added an explanation.
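In case it helps with the varying number of exons: below is a minimal Python sketch of the same pairing idea that does not rely on fixed column numbers. It reads the two files line by line, matches each start with the end in the same position, and groups the resulting BED lines by exon number. Several things here are assumptions rather than anything stated in the thread: both files list the same transcripts in the same order, "length" means end minus start, exon numbers are simply the column positions (so they follow file order, not transcription order, on the minus strand), and the function name exon_bed_lines is made up for illustration.

```python
import io


def exon_bed_lines(starts_handle, ends_handle):
    """Pair start/end coordinates line by line.

    Returns a dict mapping exon number (1-based column position)
    to a list of tab-separated BED lines:
    chrom, start, end, name, length, strand.
    """
    per_exon = {}
    for start_line, end_line in zip(starts_handle, ends_handle):
        s = start_line.split()  # split() accepts tabs or runs of spaces
        e = end_line.split()
        if not s and not e:
            continue  # skip lines that are blank in both files
        # Assumption: both files hold the same transcripts in the same order,
        # so the first three fields and the field count must agree.
        if len(s) < 4 or s[:3] != e[:3] or len(s) != len(e):
            raise ValueError(f"lines do not match: {start_line!r} / {end_line!r}")
        name, chrom, strand = s[0], s[1], s[2]
        for exon_no, (start, end) in enumerate(zip(s[3:], e[3:]), start=1):
            length = int(end) - int(start)  # assumption: length = end - start
            fields = [chrom, start, end, name, str(length), strand]
            per_exon.setdefault(exon_no, []).append("\t".join(fields))
    return per_exon


# The example rows from the question, held in memory instead of real files.
starts = io.StringIO(
    "NM_032291\tchr1\t+\t66999638\t67091529\t67098752\t67101626\n"
    "NM_001308203\tchr1\t+\t66999251\t66999928\t67091529\t67098752\t67105459\t67108492\n"
)
ends = io.StringIO(
    "NM_032291\tchr1\t+\t67000051\t67091593\t67098777\t67101698\n"
    "NM_001308203\tchr1\t+\t66999355\t67000051\t67091593\t67098777\t67105516\t67108547\n"
)

per_exon = exon_bed_lines(starts, ends)
for line in per_exon[1]:
    print(line)
```

With real data you would pass two open file handles instead of the in-memory strings, then write each list in per_exon to its own file (for example exon_1.bed, exon_2.bed, and so on; those names are only a suggestion). Transcripts with fewer exons simply do not appear in the higher-numbered lists.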