This Entrez command gives me output with the gene sequences of a gene/enzyme from various organisms:
esearch -db gene -query "glutaminase-asparaginase [Gene/Protein Name] AND (bacteria [orgn] OR fungi [orgn] OR archaea [orgn]) AND alive [prop]" | efetch -format docsum | xtract -pattern GenomicInfoType -element ChrAccVer -element ChrStart -element ChrStop |xargs -n 3 sh -c 'efetch -db nuccore -id "$0" -seq_start "$1" -seq_stop "$2" -format fasta'
The output is similar to:
>NC_030957.1:c4121890-4120582 Colletotrichum higginsianum
TGAGAGCTTCTTACTTGTCGACGCTGTTGTTGCCAGCTCTGGTAGCCCATGGTTTCGCCTCCCCAGTCGG
>NC_016603.1:c898826-897759 Acinetobacter pittii
TGTTGACTAAAACTGTTAAATCTTTAGGTTTAGCGATGGGCTTATTAG
>NC_002947.4:c2800289-2799201 Pseudomonas putida
TGAATGCCGCACTGAAAACCTTCGCCCCAAGCGCACTCGCCCTGCTGCTGATCCTGCCATCCAGCGCCTC
But I need to do this for several genes that I have in the first column of a table, like this:
GeneNameA OtherColumn OtherColumn
GeneNameB OtherColumn OtherColumn
I am looking for a Perl script that reads the first column, passes each gene name into this slot of the Entrez query: "X" [Gene/Protein Name], and creates a separate multi-FASTA file containing the sequences for each gene.
My programming skills are still limited and I am stuck on this part. I would be grateful for your help!
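One possible approach, sketched in plain shell rather than Perl because the pipeline above is already a shell command: loop over the first column and run the same pipeline once per gene name. This is only a sketch under some assumptions that are not in the post: the table is whitespace-delimited, gene names contain no spaces, and the helper names (build_query, out_name, fetch_gene) and the "<gene>.fasta" naming scheme are made up for illustration.

```shell
#!/bin/sh
# Sketch: one multi-FASTA file per gene name found in column 1 of a table.
# Assumptions (not from the original post): whitespace-delimited table,
# gene names without spaces, output files named "<gene>.fasta".

# Build the Entrez query string for one gene name (same query as in the post).
build_query() {
    printf '%s [Gene/Protein Name] AND (bacteria [orgn] OR fungi [orgn] OR archaea [orgn]) AND alive [prop]' "$1"
}

# Hypothetical naming scheme: replace characters unsafe in file names with "_".
out_name() {
    printf '%s.fasta' "$(printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_')"
}

# Run the pipeline from the post for one gene name; FASTA goes to stdout.
# esearch gets /dev/null as stdin so that it should not consume the
# remaining gene names when called inside the read loop below.
fetch_gene() {
    esearch -db gene -query "$(build_query "$1")" </dev/null |
        efetch -format docsum |
        xtract -pattern GenomicInfoType -element ChrAccVer -element ChrStart -element ChrStop |
        xargs -n 3 sh -c 'efetch -db nuccore -id "$0" -seq_start "$1" -seq_stop "$2" -format fasta'
}

# Main loop: runs only when a table file is given as the first argument.
if [ "$#" -ge 1 ]; then
    # Take column 1 of every non-empty line and drop duplicate names.
    awk 'NF { print $1 }' "$1" | sort -u |
    while IFS= read -r gene; do
        fetch_gene "$gene" > "$(out_name "$gene")"
    done
fi
```

Saved under a name such as fetch_genes.sh (hypothetical), it would be run as `sh fetch_genes.sh table.txt`, leaving one FASTA file per gene name in the current directory. If the table is tab-delimited and gene names can contain spaces, `awk -F'\t'` would be the likely adjustment.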