Perl Script To Remove Unnecessary Headers
8.8 years ago

A FASTA file containing:

>ppa020722m pacid=19 locus=ppa020722m.g ID=ppa020722m.v1.0 annot-version=v1.0
CTCCAACAAGCCTTCACCAACATCCAAAATCATTGCTCTGGTTTCCTGCACCACCTCTCTCACTTTCCTGTCTTCAACCC
CAACCCCTCTTCCCTCAAAACCCAACTCGAGTCCACCTTCGCCAACCTCCAAAACCAAG
>ppa020723m pacid=10 locus=ppa020723m.g ID=ppa020723m.v1.0 annot-version=v1.0
CTCCAACAAGCCTTCACCAACATCCAAAATCATTGCTCTGGTTTCCTGCACCACCTCTCTCACTTTCCTGTCTTCAACCC
CAACCCCTCTTCCCTCAAAACCCAACTCGAGTCCACCTTCGCCAACCTCCAAAACCAAG
>ppa020724m pacid=15 locus=ppa020724m.g ID=ppa020724m.v1.0 annot-version=v1.0
CTCCAACAAGCCTTCACCAACATCCAAAATCATTGCTCTGGTTTCCTGCACCACCTCTCTCACTTTCCTGTCTTCAACCC
CAACCCCTCTTCCCTCAAAACCCAACTCGAGTCCACCTTCGCCAACCTCCAAAACCAAG
......

Can anybody provide a Perl script that cleans the headers in this FASTA file, keeping only the sequence ID, so that the output file looks like this:

>ppa020722m
CTCCAACAAGCCTTCACCAACATCCAAAATCATTGCTCTGGTTTCCTGCACCACCTCTCTCACTTTCCTGTCTTCAACCC
CAACCCCTCTTCCCTCAAAACCCAACTCGAGTCCACCTTCGCCAACCTCCAAAACCAAG
>ppa020723m
CTCCAACAAGCCTTCACCAACATCCAAAATCATTGCTCTGGTTTCCTGCACCACCTCTCTCACTTTCCTGTCTTCAACCC
CAACCCCTCTTCCCTCAAAACCCAACTCGAGTCCACCTTCGCCAACCTCCAAAACCAAG
>ppa020724m
CTCCAACAAGCCTTCACCAACATCCAAAATCATTGCTCTGGTTTCCTGCACCACCTCTCTCACTTTCCTGTCTTCAACCC
CAACCCCTCTTCCCTCAAAACCCAACTCGAGTCCACCTTCGCCAACCTCCAAAACCAAG
......

Many thanks.

perl fasta • 3.5k views

What have you tried?


Forget perl, just use cut.

8.8 years ago
perl -ane 'print "$F[0]\n";' yourFile.fa > yourFile.clean.fa
8.8 years ago
Prakki Rama ★ 2.6k

As dpryan79 suggested, you can use cut:

cut -d " " -f 1 input.fasta
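The cut approach is safe for sequence lines because cut prints any line that contains no delimiter unchanged (unless -s is given). Note that it only splits on spaces, so a tab-separated header would not be trimmed. A rough sketch that writes the result to a file and checks that no records were lost; the helper name trim_and_check and the file names are placeholders, not anything from the thread:

```shell
# Hypothetical helper: trim headers with cut, then confirm that the
# number of ">" records is the same before and after.
trim_and_check() {
    in_file="$1"
    out_file="$2"
    cut -d ' ' -f 1 "$in_file" > "$out_file"
    before=$(grep -c '^>' "$in_file")
    after=$(grep -c '^>' "$out_file")
    # Succeeds (exit status 0) only if the record counts match.
    [ "$before" -eq "$after" ]
}
```

For example: trim_and_check input.fasta output.fasta && echo "record count unchanged"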

8.8 years ago

There you go:


An awk alternative, which rewrites only the header lines:

awk '/^>/{$0=$1}1' test.fasta

Or, provided the sequence lines contain no whitespace:

awk '{print $1}' file.fasta > output.fasta
