Question: Help Needed For Formatting Fasta Headers
redspider19800915 wrote (5.2 years ago):
 >HWI-ST667:190:C0TPFACXX:1:1101:2885:1985 1:N:0:GATCAG
 TCGGATAGAGCTCCAAATCTATCT
 >HWI-ST667:190:C0TPFACXX:1:1101:3058:1999 1:N:0:GATCAG
 CAATATCAACTGCTGCAACTCTCT
 >HWI-ST667:190:C0TPFACXX:1:1101:3372:1992 1:N:0:GATCAG
 TCAAAGGTTGAAGAGAATGAAATTTCT
 ......

How can I use a Perl script to change the above FASTA file (just the headers) into the following format? Many thanks! I'm a biologist with little programming background.

 >seq_1
 TCGGATAGAGCTCCAAATCTATCT      
 >seq_2
 CAATATCAACTGCTGCAACTCTCT
 >seq_3
 TCAAAGGTTGAAGAGAATGAAATTTCT
 ......
Irsan (Amsterdam) wrote (5.2 years ago):

On a Linux command line, run:

awk 'BEGIN{OFS="_";seq=1}{if($0 ~ /^>/){print ">seq",seq;seq++}else{print $0}}' yourFile.fasta

Replace yourFile.fasta with the name of your own file.

This gives you:

>seq_1
TCGGATAGAGCTCCAAATCTATCT
>seq_2
CAATATCAACTGCTGCAACTCTCT
>seq_3
TCAAAGGTTGAAGAGAATGAAATTTCT
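
For readers new to awk, the same command can be spelled out over several lines. This is only a sketch of the one-liner above with comments added; the function name rename_fasta_headers and the temporary demo file are made up for illustration and are not part of the original answer.

```shell
# Hypothetical helper (the name is illustrative): reads the files given
# as arguments, or standard input when no file is given.
rename_fasta_headers() {
    awk '
        # Header line (starts with ">"): bump the counter, print the new
        # header, and move on to the next input line.
        /^>/ { n++; print ">seq_" n; next }
        # Any other line is sequence data: print it unchanged.
        { print }
    ' "$@"
}

# Demo input: two records copied from the question, in a temporary file.
demo=$(mktemp)
printf '%s\n' \
    '>HWI-ST667:190:C0TPFACXX:1:1101:2885:1985 1:N:0:GATCAG' \
    'TCGGATAGAGCTCCAAATCTATCT' \
    '>HWI-ST667:190:C0TPFACXX:1:1101:3058:1999 1:N:0:GATCAG' \
    'CAATATCAACTGCTGCAACTCTCT' > "$demo"

rename_fasta_headers "$demo"
rm -f "$demo"
```

To keep the result, redirect it to a new file, for example rename_fasta_headers yourFile.fasta > renamed.fasta (renamed.fasta is a placeholder name). Writing to a different file than the input avoids overwriting the original.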

redspider19800915 replied (5.2 years ago):

Thank you so much! It also worked on an iMac :P
Kenosis wrote (5.2 years ago):

Here are two more options:

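use strict;
use warnings;

my $i = 0;    # running sequence number
while (<>) {
    # On header lines, keep the leading ">" (\K, Perl 5.10 or newer)
    # and replace the rest of the line with seq_N.
    s/^>\K.+/'seq_' . ++$i/e;
    print;
}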

Usage: perl script.pl inFile [>outFile]

The optional >outFile part is a shell redirection that sends the output to a file; without it, the result is printed to the terminal.

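As a one-liner (with seq_ in double quotes, so the shell's single quotes around the program are not closed early):

perl -ne 's/^>\K.+/"seq_" . ++$i/e; print' inFile [>outFile]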

Output on your dataset from both:

>seq_1
TCGGATAGAGCTCCAAATCTATCT
>seq_2
CAATATCAACTGCTGCAACTCTCT
>seq_3
TCAAAGGTTGAAGAGAATGAAATTTCT

Hope this helps!
