I have a Word file containing a whole genome sequence, around 1709 pages long, in which each gene is separated by a line starting with ">". I need to BLAST the whole genome sequence against a protein sequence from another organism to look for homology. Is there any way to remove information lines such as ">gm_orf648 67_127_d_D 579383 580123 + 741_nt 246_aa" all at once, instead of manually deleting them one by one?
The correct answer to your problem is to create a BLAST database from your file and BLAST the protein against this database. BLAST correctly parses the header lines that start with ">".
The first line in a FASTA file started either with a ">" (greater-than) symbol
A FASTA file is just a text file. I guess Word is configured to open text files on your computer, but I doubt it is really a Word document.
Virtually all bioinformatics software can correctly parse fasta format, and there is no need to remove these lines.
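As a rough sketch of the database approach, assuming the NCBI BLAST+ command-line tools are installed and that your file holds nucleotide sequences: the function name, the file names and the E-value cutoff below are illustrative choices of mine, not something taken from your data. If the file actually contains protein sequences, use -dbtype prot and blastp instead.

```shell
# Hypothetical helper; the function name, file names and cutoff are illustrative.
blast_protein_vs_genome() {
    genome=$1              # multi-FASTA file with the ">" header lines left in place
    query=$2               # protein sequence from the other organism, in FASTA format
    db="${genome%.*}_db"   # database name derived from the genome file name

    # Build a nucleotide BLAST database directly from the FASTA file.
    makeblastdb -in "$genome" -dbtype nucl -out "$db" || return 1

    # tblastn compares a protein query against the translated nucleotide database.
    # -outfmt 6 writes one tab-separated line per hit.
    tblastn -query "$query" -db "$db" -evalue 1e-5 -outfmt 6 \
        -out "${query%.*}_hits.tsv"
}
```

You would call it as blast_protein_vs_genome genome.fasta protein.fasta, after saving the genome as plain text. Each hit line then names the matching gene by its header, which is exactly the information you would lose by deleting the ">" lines.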
For the sake of completeness (even if I shouldn't), here is the answer to your original question:
sed -i.bak '/^>/d' file
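If you want to see what that command does before running it on the real file, here is a small test on a made-up file (demo.fasta and its sequences are invented for illustration):

```shell
# Toy input; the file name and sequences are made up for illustration.
printf '>gene1 example header\nATGGCC\nTTGA\n>gene2 example header\nATGAAATAG\n' > demo.fasta

# Delete every line that starts with ">"; the untouched original is kept as demo.fasta.bak.
sed -i.bak '/^>/d' demo.fasta

# Show what is left, then count the header lines preserved in the backup.
cat demo.fasta
grep -c '^>' demo.fasta.bak
```

Note that once the headers are gone, nothing in the file marks where one gene ends and the next begins, which is another reason to leave them in.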