Rename FASTA headers by regular expression
16 months ago
炎峰 • 0

I want to keep the chromosome number and the clade name, and remove everything else, including all the content after the single space. That is, I want to go from

>Chr01:10181894..10189044_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Ale:Ty1-RT ID=Chr01:10181894..10189044_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Ale:Ty1-RT;gene=RT;clade=Ale;evalue=2.7e-66;coverage=26.1;probability=0.98

>Chr02:12852689..12856792_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Ivana:Ty1-RT ID=Chr02:12852689..12856792_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Ivana:Ty1-RT;gene=RT;clade=Ivana;evalue=2.3e-107;coverage=83.3;probability=0.99

>Chr05:11339854..11345668_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Angela:Ty1-RT ID=Chr05:11339854..11345668_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Angela:Ty1-RT;gene=RT;clade=Angela;evalue=6.1e-41;coverage=31.8;probability=0.93

to

>Chr01_Ale 

>Chr02_Ivana

>Chr05_Angela
Regular expression • 490 views
16 months ago
seidel 11k

This is something you could have fun deciphering yourself by playing with regex. But if you're giving the fun away, this works on your example:

perl -pe 's/>(\w+):.*\/.*\/.*\/(.*):.*/>$1_$2/' input.fa > output.fa
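For anyone who would rather do this in a script, here is a minimal Python sketch of the same idea. It assumes every header looks like the three examples in the question: the chromosome name comes before the first colon, and the clade is the last "/"-separated piece of the first space-delimited token, followed by a colon. The names `HEADER_RE`, `rename_header` and `rename_fasta` are made up for illustration.

```python
import re

# Assumed header layout (taken from the examples in the question):
#   >ChrNN:start..end_INT#.../Clade:Ty1-RT ID=...
# Group 1 = chromosome name, group 2 = clade. \S* keeps the match inside
# the first token, so nothing after the space is ever inspected.
HEADER_RE = re.compile(r"^>(\w+):\S*/([^/:\s]+):")


def rename_header(header):
    """Return '>Chr_Clade' for a matching header, else the header unchanged."""
    m = HEADER_RE.match(header)
    return f">{m.group(1)}_{m.group(2)}" if m else header


def rename_fasta(lines):
    """Yield FASTA lines with headers renamed; sequence lines pass through."""
    for line in lines:
        if line.startswith(">"):
            yield rename_header(line.rstrip("\n")) + "\n"
        else:
            yield line


# Hypothetical usage, with the same file names as the perl one-liner:
# with open("input.fa") as src, open("output.fa", "w") as dst:
#     dst.writelines(rename_fasta(src))
```

Headers that do not fit the assumed layout are left as they are rather than mangled. Note that with either approach, two elements of the same clade on the same chromosome end up with identical names, so check the output for duplicate IDs if downstream tools need them to be unique.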