6.8 years ago
s_bio
Hi All,
I have the following FASTA file:
>NC_001434.1 Hepatitis E virus, complete genome
GCCATGGAGGCCCATCAGTTTATTAAGGCTCCTGGCATCACTACTGCTATTGAGCAGGCTGCTCTAGCAGCGGCCAACTCTGCCCTTGCGA............
>AB189071.1 Hepatitis E virus genomic RNA, nearly complete genome, isolate: JDEER-Hyo03L
GTCGATGCCATGGAGGCCCACCAGTTTATTAAGGCTCCTGGCATTACTACTGCCATTGAGCAGGCTGCTCTGGCTGCGGCCAACTCCGCCTT...................
>AB220979.1 Hepatitis E virus genomic RNA, complete genome, genotype 4, isolate: HE-JA41
GCAGACCACGTATGTGGTCGACGCCATGGAGGCCCACCAGTTCATAAAGGCTCCTGGCGTCACTACTGCTATTGAGCAGGCAGCTCTAGCAGC......................
>AB220974.1 Hepatitis E virus genomic RNA, complete genome, genotype 4, isolate: HE-JA2
GCAGACCACGTATGTGGTCGACGCCATGGAGGCCCATCAGTTCATAAAGGCTCCTGGCGTCACTACTGCTATTGAGCAGGCAGCTCTAGCAGCGG...................
>JQ001749.1 Bat hepevirus isolate BatHEV/BS7/GE/2009, complete genome
GCATCCTTGCCACAGAGTCCATGGTCCGCCGCCATGGACATTTCACAGTGGTCCGCCCCGAAGGGGGCGGGCGCAGCCTTCGAAGCGTACGCTCA............
>NW_006725532.1 Balaenoptera acutorostrata scammoni unplaced genomic scaffold, BalAcu1.0 scaffold126, whole genome shotgun sequence
CGCGACCCAGATGGTCCAGGGAGCCCCTTCCATGACTGCGGGCGCCGGCGTCGCTGGGGGTCGTGTGTAAAGACAAAGGCTTCGTCTGCCTC.......................
>DS547128.1 Laccaria bicolor S238N-H82 LACBIscaffold_38 genomic scaffold, whole genome shotgun sequence
AGGAGTTTGAAACGATTTGACCTTTGGCTGAATTATGCCACAACCGCTACACTCCGGGCCTCCAAATTTTGTGTGGTGGTGAATGTGTATCTCATGAACATGAATATATTTTTA..........
I want to keep only the sequences whose header line contains the keyword "Hepatitis" and delete all the other entries. I would appreciate it if somebody could suggest code to do this.
Thanks...
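
One possible approach, sketched in plain Python: walk through the file line by line, decide at each `>` header whether the record should be kept, and write out the header and its sequence lines only while that decision is "keep". The function names `filter_fasta` and `filter_fasta_file` are made up for this illustration, and the match is assumed to be a case-insensitive substring search on the header line.

```python
from typing import Iterable, Iterator


def filter_fasta(lines: Iterable[str], keyword: str = "Hepatitis") -> Iterator[str]:
    """Yield the lines of every FASTA record whose header contains `keyword`.

    Assumption: a record is a '>' header line followed by one or more
    sequence lines, and the keyword is matched case-insensitively.
    """
    needle = keyword.lower()
    keep = False
    for line in lines:
        if line.startswith(">"):
            # A header starts a new record; decide whether to keep it.
            keep = needle in line.lower()
        if keep:
            yield line


def filter_fasta_file(in_path: str, out_path: str, keyword: str = "Hepatitis") -> None:
    """Copy only the matching records from `in_path` to `out_path`."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_fasta(src, keyword))
```

It could be called as `filter_fasta_file("input.fasta", "hepatitis_only.fasta")`, where both file names are placeholders. Note that with this keyword the "Bat hepevirus" entry (JQ001749.1) would be dropped along with the two genomic scaffolds, because its header does not contain the word "Hepatitis".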