Off topic: Filtering FASTA sequences
6.8 years ago
s_bio ▴ 10

Hi All,

I have the following FASTA file:

 >NC_001434.1 Hepatitis E virus, complete genome
GCCATGGAGGCCCATCAGTTTATTAAGGCTCCTGGCATCACTACTGCTATTGAGCAGGCTGCTCTAGCAGCGGCCAACTCTGCCCTTGCGA............

 >AB189071.1 Hepatitis E virus genomic RNA, nearly complete genome, isolate: JDEER-Hyo03L
GTCGATGCCATGGAGGCCCACCAGTTTATTAAGGCTCCTGGCATTACTACTGCCATTGAGCAGGCTGCTCTGGCTGCGGCCAACTCCGCCTT...................

 >AB220979.1 Hepatitis E virus genomic RNA, complete genome, genotype 4, isolate: HE-JA41
GCAGACCACGTATGTGGTCGACGCCATGGAGGCCCACCAGTTCATAAAGGCTCCTGGCGTCACTACTGCTATTGAGCAGGCAGCTCTAGCAGC......................

 >AB220974.1 Hepatitis E virus genomic RNA, complete genome, genotype 4, isolate: HE-JA2
GCAGACCACGTATGTGGTCGACGCCATGGAGGCCCATCAGTTCATAAAGGCTCCTGGCGTCACTACTGCTATTGAGCAGGCAGCTCTAGCAGCGG...................

 >JQ001749.1 Bat hepevirus isolate BatHEV/BS7/GE/2009, complete genome
GCATCCTTGCCACAGAGTCCATGGTCCGCCGCCATGGACATTTCACAGTGGTCCGCCCCGAAGGGGGCGGGCGCAGCCTTCGAAGCGTACGCTCA............

 >NW_006725532.1 Balaenoptera acutorostrata scammoni unplaced genomic scaffold, BalAcu1.0 scaffold126, whole genome shotgun sequence
CGCGACCCAGATGGTCCAGGGAGCCCCTTCCATGACTGCGGGCGCCGGCGTCGCTGGGGGTCGTGTGTAAAGACAAAGGCTTCGTCTGCCTC.......................

 >DS547128.1 Laccaria bicolor S238N-H82 LACBIscaffold_38 genomic scaffold, whole genome shotgun sequence
AGGAGTTTGAAACGATTTGACCTTTGGCTGAATTATGCCACAACCGCTACACTCCGGGCCTCCAAATTTTGTGTGGTGGTGAATGTGTATCTCATGAACATGAATATATTTTTA..........

I want to keep only the sequences whose header contains the keyword Hepatitis and delete all the remaining entries. I would appreciate it if somebody could suggest code to do this.

Thanks.
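
One possible way to do the filtering asked about above, as a minimal sketch in plain Python with no extra libraries. It assumes every record starts with a header line beginning with ">" (leading spaces, as in the pasted example, are tolerated) and that the keyword match should be case-insensitive; both are assumptions, not something stated in the question. The function names `filter_fasta` and `filter_fasta_file` and the file names used below are hypothetical.

```python
from typing import Iterable, Iterator


def filter_fasta(lines: Iterable[str], keyword: str) -> Iterator[str]:
    """Yield the lines of every FASTA record whose header contains `keyword`."""
    needle = keyword.lower()
    keep = False  # whether the record currently being read should be kept
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines between records
        if text.startswith(">"):
            # A new record starts here; decide once per record.
            keep = needle in text.lower()
        if keep:
            yield text + "\n"


def filter_fasta_file(in_path: str, out_path: str, keyword: str) -> int:
    """Write matching records from in_path to out_path; return how many were kept."""
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in filter_fasta(src, keyword):
            if line.startswith(">"):
                kept += 1
            dst.write(line)
    return kept
```

For example, `filter_fasta_file("input.fasta", "hepatitis_only.fasta", "Hepatitis")` would write only the matching records to a new file and leave the original untouched. On the entries shown in the post, the four "Hepatitis E virus" records would be kept, while the "Bat hepevirus", Balaenoptera and Laccaria records would be dropped because their headers do not contain the word "Hepatitis".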

genome sequence • 1.1k views