Extraction Of Header Of Sequences In Fasta File
9.5 years ago

Hi all, I have a FASTA file from which I want to extract just the sequence headers. Is there any Perl code or something similar to do that? Thanks a lot in advance.

By "header", you mean everything after the ">"? Or just some part of everything after the ">"? Or including the ">"? It's important to be specific since a lot of people misunderstand "header".

I just want everything after the ">". I should say that I am not familiar with Perl, and I want Perl code that I can run. Please help if possible. Thanks a lot. Regards

err, why don't you just post your code then?

9.5 years ago

For perl code, you can visit http://www.bioperl.org/wiki/Main_Page. If you just want to extract the headers, on a Linux/Unix system, a simple grep "^>" myfile.fasta should work.

9.5 years ago

Why so complicated? ;) Only the header lines in a FASTA file contain >, so you can use grep:

grep -e ">" my.fasta


or awk to remove the >:

$ awk 'sub(/^>/, "")'
>aksdjfljfd
aksdjfljfd

Thanks so much, but I am not familiar with Perl code. I need complete code that I can run. If possible, please guide me further. Thanks again.

Thanks. I fixed my problem. Regards

This is not Perl, it's Unix ;)

What if I want to extract the headers together with the sequences that belong to them?

9.5 years ago, by Caddymob

The expression in Perl would be basically the same as the grep above (m/^>/). There are easier one-liner ways to do this, but this is a basic outline of the Perl code that should be pretty readable.

#!/usr/bin/perl
use strict;
use warnings;

# Print every header line (lines starting with ">") from a FASTA file.
open(my $fasta, '<', 'your.fa') or die "Cannot open your.fa: $!";
while (my $line = <$fasta>) {
    chomp $line;
    if ($line =~ m/^>/) {
        print "$line\n";
    }
}
close($fasta);

Thanks so much. Your code works, but how can I write the output to a text file? I am not familiar with Perl.
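One possible way to get the headers into a text file, as a minimal sketch: redirect the command's output with the shell's > operator. The file names demo.fasta and headers.txt and the two example records are made up for illustration; substitute your own FASTA file. The sed step strips the leading ">", which matches the "everything after the >" request earlier in the thread.

```shell
# Hypothetical demo input so the example runs on its own; use your own FASTA file instead.
printf '>seq1 first sequence\nACGT\n>seq2 second sequence\nGGCC\n' > demo.fasta

# Keep only header lines, strip the leading ">", and write the result to headers.txt.
grep '^>' demo.fasta | sed 's/^>//' > headers.txt

# Show what was written.
cat headers.txt
```

The same redirection should work for a Perl script that prints to standard output, for example perl extract_headers.pl > headers.txt, where extract_headers.pl is a hypothetical name for the file you saved the script in.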

