Hi, what is meant by the following file types?

  1. genomic .fna.gz
  2. genomic .gbff.gz
  3. genomic .gff.gz
  4. protein .faa.gz - protein sequences
  5. protein .gpff.gz

I tried to find out about the others, but I could not get any results.

Please guide me.

Thanks, Naresh

All of these are text files compressed with the Linux/Unix program gzip. Use gunzip to extract them, or zcat to write the content to standard output without saving an uncompressed file.
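If you would rather peek at such a file from a script than from the shell, here is a minimal Python sketch that does roughly what `zcat file | head` does, using only the standard-library gzip module. The function name `head_gz` and the file name in the usage comment are made up for illustration.

```python
import gzip
from itertools import islice


def head_gz(path, n=5):
    """Return the first n lines of a gzip-compressed text file.

    The file is decompressed on the fly; nothing is written to disk.
    """
    # "rt" opens the compressed stream in text mode
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


# Usage (hypothetical file name):
# for line in head_gz("example_genomic.fna.gz", 3):
#     print(line)
```

Because the file is read lazily, this should stay fast even for large genome files.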

The following extensions are conventions that many people, though not all, follow:

  • fna = FASTA format file containing nucleotide sequence (DNA)
  • gbff = GenBank genome file containing genome sequence and annotation
  • gff = General Feature Format file containing annotated genomic regions (genes, transcripts, etc.)
  • faa = FASTA format file containing amino-acid sequence (protein, peptide)
  • gpff = GenBank protein file containing protein sequence and annotation
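As a rough illustration, the conventions listed above can be written down as a small Python lookup table. The names `EXTENSIONS` and `describe` and the example file names are made up for this sketch, and since these are only conventions, the result is a guess about a file's content, not a guarantee.

```python
# Extension -> usual meaning, following the conventions listed above.
EXTENSIONS = {
    "fna": "FASTA nucleotide sequence (DNA)",
    "gbff": "GenBank genome file: genome sequence and annotation",
    "gff": "General Feature Format: annotated genomic regions",
    "faa": "FASTA amino-acid sequence (protein, peptide)",
    "gpff": "GenBank protein file: protein sequence and annotation",
}


def describe(filename):
    """Guess the content of a file from its extension, ignoring a trailing .gz."""
    name = filename[:-3] if filename.endswith(".gz") else filename
    ext = name.rsplit(".", 1)[-1].lower()
    return EXTENSIONS.get(ext, "unknown")


# Usage (hypothetical file names):
# describe("example_genomic.fna.gz")
# describe("example_protein.gpff.gz")
```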

See for more explanation.

Thank you for your wonderful explanation. Thanks a lot.

