Hi all, I have a FASTA file with many sequences whose description lines are not in NCBI format. However, each description line contains a keyword giving the length of the sequence. I want to extract only the sequences up to, say, 100 bp in length. How can I do this with Perl?
The format is as follows:
>gene1 group5 length=84
Yes, the sequence I posted is not as long as the length stated in its header. I shortened it for brevity (sorry!). I have many sequences of varying length. I am trying to write a program that counts the strings, but I also felt it would be easier to use the length keyword in the FASTA description line to extract sequences within a range of values. Thanks once again. raghul
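The idea of filtering on the length keyword could be sketched roughly as follows. This is only a minimal sketch, assuming every header carries a `length=N` field as in the example above and that the declared value can be trusted. The sub name `filter_fasta_by_length`, the second record (`gene2`), and the short sequences are made up for illustration; they are not from the real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy every FASTA record whose header declares
# length=N with N <= $max_len from $in to $out. Records without a
# length= field are skipped. Returns the number of records kept.
sub filter_fasta_by_length {
    my ($in, $out, $max_len) = @_;
    my $keep = 0;    # true while we are inside a record to keep
    my $kept = 0;    # number of records kept so far
    while (my $line = <$in>) {
        if ($line =~ /^>/) {
            # Header line: decide from the declared length.
            $keep = ($line =~ /\blength=(\d+)/ && $1 <= $max_len) ? 1 : 0;
            $kept++ if $keep;
        }
        # Header and sequence lines are printed only for kept records.
        print {$out} $line if $keep;
    }
    return $kept;
}

# Demo on in-memory data. The sequences are placeholders and are
# deliberately shorter than the declared lengths.
my $demo = join '',
    ">gene1 group5 length=84\n",  "ACGTACGT\n",
    ">gene2 group5 length=250\n", "TTGACCAA\n", "GGCATCGA\n";

open my $in, '<', \$demo or die "Cannot open in-memory input: $!";
my $filtered = '';
open my $out, '>', \$filtered or die "Cannot open in-memory output: $!";

my $count = filter_fasta_by_length($in, $out, 100);
close $in;
close $out;

print $filtered;
print STDERR "kept $count record(s)\n";
```

For a real file, the two in-memory handles would be replaced by handles opened on the input and output file names. Filtering on a range instead of an upper limit would only need a second comparison against a lower bound in the header test.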
An NCBI FASTA description looks like the following. It has the GI number followed by the RefSeq ID, the organism, and so on, which is NCBI's way of describing a sequence. As you can see, mine is not like that.
>gi|159476307|ref|XM_001696201.1| Chlamydomonas reinhardtii strain CC-503 cw92 mt+