Dear all,
I have a list of sites (example below) and a GFF3 file. I need to find which features each site falls in; my ideal output is shown at the bottom. Could anyone offer suggestions or a solution? I would much appreciate your help. Thanks a lot!
Site file:
Site No. Chr Start End
1 1 3000 3100
2 1 3200 3280
GFF3 file:
Chr1 MSU_osa1r7 gene 2903 10817 . + . ID=LOC_Os01g01010;Name=LOC_Os01g01010
Chr1 MSU_osa1r7 mRNA 2903 10817 . + . ID=LOC_Os01g01010.1;Name=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 2903 3268 . + . ID=LOC_Os01g01010.1:exon_1
Chr1 MSU_osa1r7 intron 3269 3300 . + . ID=LOC_Os01g01010.1:intron_1
Ideal output:
Site No. Chr Start End Pos
1 1 3000 3100 ID=LOC_Os01g01010.1:exon_1
2 1 3200 3280 ID=LOC_Os01g01010.1:exon_1;ID=LOC_Os01g01010.1:intron_1
Does your site file contain any chromosome information?
Hi Pgibas, yes, I forgot to add it.
A bit of coding in R with the GenomicRanges package should make this doable.
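If R is not an option, the same overlap lookup can be sketched in plain Python. This is not the GenomicRanges approach, just a minimal standard-library illustration, and it rests on a few assumptions that are not stated in the question: the "Chr" prefix is the only difference between chromosome names in the two files, only exon and intron rows should be reported (the ideal output leaves out the gene and mRNA rows even though they also overlap), coordinates are 1-based and inclusive in both files, and fields are whitespace-separated. The function names (norm_chrom, load_features, annotate_sites) are made up for this example.

```python
from collections import defaultdict

GFF = """\
Chr1 MSU_osa1r7 gene 2903 10817 . + . ID=LOC_Os01g01010;Name=LOC_Os01g01010
Chr1 MSU_osa1r7 mRNA 2903 10817 . + . ID=LOC_Os01g01010.1;Name=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 2903 3268 . + . ID=LOC_Os01g01010.1:exon_1
Chr1 MSU_osa1r7 intron 3269 3300 . + . ID=LOC_Os01g01010.1:intron_1
"""

SITES = """\
Site No. Chr Start End
1 1 3000 3100
2 1 3200 3280
"""


def norm_chrom(name):
    """Map 'Chr1' and '1' to the same key (assumed naming difference)."""
    name = name.strip()
    return name[3:] if name.lower().startswith("chr") else name


def load_features(gff_lines, keep_types=("exon", "intron")):
    """Return {chromosome: [(start, end, 'ID=...'), ...]} for the kept types."""
    features = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF3 comments/directives
        fields = line.split(None, 8)  # 9 GFF3 columns; attributes stay whole
        if len(fields) < 9:
            continue
        chrom, _src, ftype, start, end, _score, _strand, _phase, attrs = fields
        if ftype not in keep_types:
            continue  # assumption: gene/mRNA rows are not wanted in the output
        attrs = attrs.strip()
        # Keep only the ID=... attribute; fall back to the full attribute string.
        feat_id = next((a for a in attrs.split(";") if a.startswith("ID=")), attrs)
        features[norm_chrom(chrom)].append((int(start), int(end), feat_id))
    return features


def annotate_sites(site_lines, features):
    """Return (site_no, chrom, start, end, pos) rows, one per site."""
    rows = []
    for line in site_lines:
        fields = line.split()
        if len(fields) != 4 or not fields[0].isdigit():
            continue  # skip the header line and anything malformed
        site_no, chrom, start, end = fields
        start, end = int(start), int(end)
        # Two closed intervals overlap when each starts before the other ends.
        hits = [fid for fs, fe, fid in features.get(norm_chrom(chrom), [])
                if fs <= end and start <= fe]
        rows.append((site_no, chrom, start, end, ";".join(hits) or "."))
    return rows


features = load_features(GFF.splitlines())
rows = annotate_sites(SITES.splitlines(), features)

print("Site No.", "Chr", "Start", "End", "Pos")
for row in rows:
    print(*row)
```

On the example data this reports exon_1 for site 1 and exon_1 plus intron_1 for site 2. The scan is linear per site, which is fine for a few thousand sites; for whole-genome site lists an interval tree or the GenomicRanges route would likely scale better.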