Query on a pre-processing step involving GFF3
5.8 years ago
prasundutta87 ▴ 660

Hello,

So, I had to update the chromosome names in a GFF3 file. For ease, I removed the lines that started with #. This gave me a clean tab-separated GFF3 file that I could use in my R script for updating the chromosome names. I removed those lines because some of them looked like this:

##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=89462
##sequence-region NW_020229131.1 1 29085

I am aware that lines that start with ## are directives and can be ignored by parsers. So, by removing them, I technically did not break the GFF3 format.

If there are any loopholes, or if I did anything wrong, kindly correct me.

GFF3 annotation • 858 views

Removing comments should not be a problem, but updating the chromosome names affects other things. For example, if you have a corresponding genome .fasta file, the sequence headers in that file must use the same names.
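A quick way to check that point is to compare the sequence IDs in column 1 of the GFF3 against the FASTA headers. The following is only a minimal sketch in Python: the function names and the small in-memory example are made up for illustration, and it assumes the FASTA ID is the first whitespace-delimited token after `>`.

```python
def gff3_seqids(lines):
    """Collect the column-1 sequence IDs from GFF3 feature lines."""
    seqids = set()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # anything after this directive is embedded sequence, not features
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        seqids.add(line.split("\t", 1)[0])
    return seqids


def fasta_ids(lines):
    """Collect sequence IDs (first token after '>') from FASTA header lines."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and len(line.strip()) > 1
    }


# Hypothetical example data; in practice you would pass open file handles.
gff = [
    "##gff-version 3",
    "chr1\t.\tgene\t1\t100\t.\t+\t.\tID=g1",
    "NW_020229131.1\t.\tgene\t1\t50\t.\t-\t.\tID=g2",
]
fasta = [">chr1 some description", "ACGT", ">chr2", "GGCC"]

# IDs used in the annotation that have no matching sequence in the genome
missing = gff3_seqids(gff) - fasta_ids(fasta)
print(sorted(missing))
```

An empty result would mean every annotated sequence ID has a matching FASTA record.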


Thanks, Sej. I have the opposite issue: I updated the chromosome names in my GFF3 file to match my reference genome.
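For anyone doing the same renaming, it can also be done without stripping the `#` lines at all. This matters a little because the GFF3 specification expects `##gff-version 3` as the first line, and `##sequence-region` directives carry sequence IDs that should be renamed along with column 1. Below is a rough Python sketch, not the R script used above: the function name `rename_gff3_seqids` and the mapping to `scaffold_1` are hypothetical, and it does not handle an embedded `##FASTA` section.

```python
def rename_gff3_seqids(lines, mapping):
    """Yield GFF3 lines with sequence IDs renamed according to `mapping`.

    IDs that are not in `mapping` are left unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##sequence-region"):
            # Directive layout: ##sequence-region <seqid> <start> <end>
            parts = line.split()
            if len(parts) >= 2:
                parts[1] = mapping.get(parts[1], parts[1])
            yield " ".join(parts)
        elif line.startswith("#") or not line.strip():
            # Other directives, comments, and blank lines pass through untouched
            yield line
        else:
            cols = line.split("\t")
            cols[0] = mapping.get(cols[0], cols[0])
            yield "\t".join(cols)


# Hypothetical old-name -> new-name mapping
mapping = {"NW_020229131.1": "scaffold_1"}

example = [
    "##gff-version 3",
    "##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=89462",
    "##sequence-region NW_020229131.1 1 29085",
    "NW_020229131.1\tGnomon\tgene\t100\t900\t.\t+\t.\tID=gene0",
]

renamed = list(rename_gff3_seqids(example, mapping))
for out_line in renamed:
    print(out_line)
```

With a real file, the same generator could be fed an open file handle and its output written line by line to a new GFF3 file.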
