Off topic: Extract FASTA sequences using gene names
5.2 years ago
savscosta • 0

Hello. I want to extract from a FASTA file only those sequences whose gene name (the [gene=...] field in the header) is listed in a name.txt file.

Example: Input.fasta

>lcl|CP001829.1_gene_1 [gene=dnaA] [locus_tag=CpC231_0001] [location=1..1812] [gbkey=Gene]
TTGTCGGAGGCTCCATCGACATGGAACGAGCGGTGGCAAGAAGTTACTAATGAGCTGCTGTCACAGTCTC
>lcl|CP001829.1_gene_2 [locus_tag=CpC231_0001a] [location=complement(1821..1967)] [gbkey=Gene]
GTGTCGAGTATCACTGAATTACAAGTTTGTAATTACACAGCGTGTATAACTCTGTGGACTACTTTTAAAA
>lcl|CP001829.1_gene_3 [gene=dnaN] [locus_tag=CpC231_0002] [location=2396..3583] [gbkey=Gene]
CCACGTGAATCTTGAACCGGCCACGTGAATCTTGAACCGGCCACGTGAATCTTGAACCGG
>lcl|CP001829.1_gene_4 [gene=recF] [locus_tag=CpC231_0003] [location=3650..4864] [gbkey=Gene]
GTGTACATTCGCGAGCTATCGCTCCGAGATTTTCGTTCGTGGGCAGACTGCCACGTGAATCTTGAACCGG
>lcl|CP001829.1_gene_5 [locus_tag=CpC231_0004] [location=4854..5426] [gbkey=Gene]
ATGAGCAATAAACCTGCTGATGCTGGATCAGAAGATCCCGTAGCAGAGGCATTTGCTGCTATTCGTGCGG
AAGCCCAGCGGCGCACAGGGCGCATCCCCGATCTCTCCGTCCAAGCTCCGCGTTCTGGTTTAAAGCTTAA
>lcl|CP001829.1_gene_6 [gene=gyrB] [locus_tag=CpC231_0005] [location=5566..7611] [gbkey=Gene]
GTGGCAACCGCTGAACATGAATATGGCGCCTCATCCATTACGATCCTTGAGGGTCTAGAGGCTGTACGTA

name.txt

recF
gyrB

Desired output:

>lcl|CP001829.1_gene_6 [gene=gyrB] [locus_tag=CpC231_0005] [location=5566..7611] [gbkey=Gene]
GTGGCAACCGCTGAACATGAATATGGCGCCTCATCCATTACGATCCTTGAGGGTCTAGAGGCTGTACGTA
>lcl|CP001829.1_gene_4 [gene=recF] [locus_tag=CpC231_0003] [location=3650..4864] [gbkey=Gene]
GTGTACATTCGCGAGCTATCGCTCCGAGATTTTCGTTCGTGGGCAGACTGCCACGTGAATCTTGAACCGG

Thanks for your help!
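One possible approach, sketched in plain Python with no third-party libraries: read the gene names into a set, then stream through the FASTA file and keep a record only when its header carries a `[gene=NAME]` tag whose `NAME` is in that set. The function names (`read_gene_names`, `filter_fasta`, `extract`) are made up for this illustration, and the sketch assumes that every header of interest uses the `[gene=...]` format shown in the example and that name.txt holds one gene name per line.

```python
import re


def read_gene_names(lines):
    """Return the set of non-empty gene names, one per input line."""
    return {line.strip() for line in lines if line.strip()}


def filter_fasta(fasta_lines, wanted):
    """Yield the lines of every record whose header has [gene=NAME] with NAME in wanted.

    Assumption: headers look like the example, e.g.
    >lcl|CP001829.1_gene_4 [gene=recF] [locus_tag=...]
    Records without a [gene=...] tag are skipped.
    """
    gene_tag = re.compile(r"\[gene=([^\]]+)\]")
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # A new record starts here: decide whether to keep it.
            match = gene_tag.search(line)
            keep = bool(match) and match.group(1) in wanted
        if keep:
            yield line


def extract(fasta_path, names_path, out_path):
    """Write the matching records from fasta_path to out_path."""
    with open(names_path) as names_file:
        wanted = read_gene_names(names_file)
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        out.writelines(filter_fasta(fasta, wanted))
```

With the files from the question this could be called as `extract("Input.fasta", "name.txt", "output.fasta")`, where `output.fasta` is a placeholder name. Records are written in the order they appear in the FASTA file, so recF would come before gyrB rather than in the order shown in the desired output. Matching is exact and case-sensitive, and multi-line sequences such as gene_5 are kept whole because every line up to the next header belongs to the current record.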

genome • 800 views