Problem using sed to change IDs in a GFF file
4.1 years ago

Hi,

I want to use sed to change an ID in a GFF file. I just want to change "ILINNGPN_01481", which appears only once in the whole file, to "tfpI". I am using the simplest command:

sed 's/ILINNGPN_01481/tfpI/g' bovis_PROKKA.gff > bovis_PROKKA_edited.gff

The ID change worked, but the CDS coordinates also seem to have changed for no reason.

Looking for ILINNGPN_01482 in the original file (before):

NODE_16_length_55139_cov_21.7549        Prodigal:2.6    CDS     25370   26155   .       -       0       ID=ILINNGPN_01482;inference=ab initio prediction:Prodigal:2.6;locus_tag=ILINNGPN_01482;product=hypothetical protein

Looking for tfpI in the edited file (after):

NODE_16_length_55139_cov_21.7549        Prodigal:2.6    CDS     24581   25048   .       +       0       ID=tfpI;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:GCF_002014975.1_ASM201497v1_genomic.gbff:WP_078275419.1;locus_tag=tfpI;product=pilin

Any clue what's going on here?

Thanks!

sed gff • 1.1k views

What is the output of:

cat -n bovis_PROKKA.gff | sed 's/ILINNGPN_01481/tfpI/g' | grep -n tfpI
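
A related check, sketched on a made-up three-line file rather than the real bovis_PROKKA.gff: list the line numbers that mention the old ID, then count how many lines actually differ after the substitution. The file contents, the `ctg1` contig name and the `work`, `hit_lines` and `changed` variables are all assumptions for illustration.

```shell
#!/bin/sh
# Toy stand-in for bovis_PROKKA.gff; these records are invented for illustration.
work=$(mktemp -d)
printf '##gff-version 3\n' > "$work/in.gff"
printf 'ctg1\tProdigal:2.6\tCDS\t100\t400\t.\t+\t0\tID=ILINNGPN_01481;locus_tag=ILINNGPN_01481\n' >> "$work/in.gff"
printf 'ctg1\tProdigal:2.6\tCDS\t500\t900\t.\t-\t0\tID=ILINNGPN_01482;locus_tag=ILINNGPN_01482\n' >> "$work/in.gff"

# Same substitution as in the question.
sed 's/ILINNGPN_01481/tfpI/g' "$work/in.gff" > "$work/out.gff"

# Line numbers in the original file that mention the old ID.
hit_lines=$(grep -n 'ILINNGPN_01481' "$work/in.gff" | cut -d: -f1)

# diff prints one "<" line per input line that was changed; count them.
changed=$(diff "$work/in.gff" "$work/out.gff" | grep -c '^<')

echo "lines mentioning old ID: $hit_lines"
echo "lines changed by sed:    $changed"
```

If the number of changed lines matches the number of lines that mention the old ID, sed touched nothing else in the file.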

Thanks Pierre for that line!

4.1 years ago
rbagnall ★ 1.8k

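Your sed command replaces ILINNGPN_01481.

But you are looking at ILINNGPN_01482 in the "before" file, so you are comparing two different features. The coordinates themselves were not changed.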
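
A minimal sketch of comparing the same record before and after the edit. The ILINNGPN_01481 "before" line is not quoted in the thread, so it is reconstructed here from the "after" line on the assumption that sed only touched the ID. The attribute column is shortened, and the `tmp`, `before_coords` and `after_coords` names are hypothetical.

```shell
#!/bin/sh
# Two-record excerpt; the 01481 line is reconstructed (assumption), the 01482 line is from the question.
tmp=$(mktemp -d)
printf 'NODE_16_length_55139_cov_21.7549\tProdigal:2.6\tCDS\t24581\t25048\t.\t+\t0\tID=ILINNGPN_01481;locus_tag=ILINNGPN_01481;product=pilin\n' > "$tmp/before.gff"
printf 'NODE_16_length_55139_cov_21.7549\tProdigal:2.6\tCDS\t25370\t26155\t.\t-\t0\tID=ILINNGPN_01482;locus_tag=ILINNGPN_01482;product=hypothetical protein\n' >> "$tmp/before.gff"

# Same substitution as in the question.
sed 's/ILINNGPN_01481/tfpI/g' "$tmp/before.gff" > "$tmp/after.gff"

# Search for the OLD ID in the original file and the NEW ID in the edited file,
# then keep only columns 1-8 (everything except the attributes).
before_coords=$(grep 'ILINNGPN_01481' "$tmp/before.gff" | cut -f1-8)
after_coords=$(grep 'ID=tfpI;' "$tmp/after.gff" | cut -f1-8)

if [ "$before_coords" = "$after_coords" ]; then
    echo "columns 1-8 unchanged"
else
    echo "columns 1-8 differ"
fi
```

Searching for ILINNGPN_01482 in the original file instead would return the neighbouring CDS at 25370-26155, which is where the apparent coordinate change came from.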


Thanks rbagnall! Right after I posted this I realized that. That's what was happening.
