Rename FASTA headers with a regular expression
2.8 years ago
炎峰 • 0

I want to rename my FASTA headers so that only the chromosome name and the clade are kept; everything after the single space should be removed. For example, I want to change

>Chr01:10181894..10189044_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Ale:Ty1-RT ID=Chr01:10181894..10189044_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Ale:Ty1-RT;gene=RT;clade=Ale;evalue=2.7e-66;coverage=26.1;probability=0.98

>Chr02:12852689..12856792_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Ivana:Ty1-RT ID=Chr02:12852689..12856792_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Ivana:Ty1-RT;gene=RT;clade=Ivana;evalue=2.3e-107;coverage=83.3;probability=0.99

>Chr05:11339854..11345668_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Angela:Ty1-RT ID=Chr05:11339854..11345668_INT#LTR/Copia|Class_I/LTR/Ty1_copia/Angela:Ty1-RT;gene=RT;clade=Angela;evalue=6.1e-41;coverage=31.8;probability=0.93

to

>Chr01_Ale

>Chr02_Ivana

>Chr05_Angela
2.8 years ago
seidel 11k

This is something you could have fun working out yourself by playing with regex. But if you'd rather skip the fun, this works on your example:

perl -pe 's/>(\w+):.*\/.*\/.*\/(.*):.*/>$1_$2/' input.fa > output.fa
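
For anyone who wants to see what the substitution does step by step, here is a rough Python sketch of the same idea. It is not the one-liner above: the pattern is a stricter variant that only looks at the first whitespace-delimited token of the header, and the names `HEADER_RE`, `rename_header` and `rename_fasta` are made up for illustration. It assumes every header looks like the three examples in the question (chromosome, then `:`, then a path whose last `/`-separated piece starts with the clade followed by `:`).

```python
import re

# Group 1: chromosome name (word characters before the first ':').
# Group 2: clade, i.e. the text between the last '/' of the first
#          whitespace-delimited token and the ':' that follows it.
# \S* cannot cross the space, so the "ID=..." part is never examined.
HEADER_RE = re.compile(r"^>(\w+):\S*/([^/:\s]+):")


def rename_header(line: str) -> str:
    """Return '>Chrom_Clade' for a matching header line.

    Lines that do not match (sequence lines, or headers in another
    format) are returned unchanged.
    """
    match = HEADER_RE.match(line)
    if match is None:
        return line
    chrom, clade = match.groups()
    # Keep the trailing newline if the input line had one.
    newline = "\n" if line.endswith("\n") else ""
    return f">{chrom}_{clade}{newline}"


def rename_fasta(in_path: str, out_path: str) -> None:
    """Copy a FASTA file, rewriting each header line on the way."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(rename_header(line))
```

Calling `rename_fasta("input.fa", "output.fa")` would mirror the command above. One thing to watch for with either approach: if two records share the same chromosome and clade, they end up with identical headers.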