Extract sequences from a BLAST database based on the protein name
2.2 years ago
Giffredo ▴ 10

Hi,

I would like to create a new sub-database from the nr BLAST DB containing all the sequences related to biogenic amines. So, I need a script to extract sequences from the nr BLAST database based on partial protein names.

Example: for the word "phosph", I would like to get FASTA output like this:

    >VFG037176(gb|YP_001844723) (plc) phospholipase C [Phospholipase C (VF0470)] [Acinetobacter baumannii ACICU]
    MNRREFLLNSTKTMFGTAALASFPLSIQKALAIDAKVESGTIQDVKHIV...
    >VFG037177(gb|YP_001846906) (plc) phospholipase C [Phospholipase C (VF0470)] [Acinetobacter baumannii ACICU]
    MITRRKFLNYSLNMGFGAAALAAFPSSIQKALAIPANNKTGTIQDVEHV...
    >VFG037203(gb|YP_001847849) (plcD) phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthase [Phospholipase D (VF0469)] [Acinetobacter baumannii ACICU]
    MAQSFHSKQLQTHQLANGFLIKASIVVCSSFAVALTGCSTLPKHSPEPI...
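
To make the request concrete, here is a rough sketch of the kind of filter I have in mind. It assumes the database has already been dumped to a FASTA file (for example with `blastdbcmd -db nr -entry all`), and it simply keeps every record whose header contains the keyword, ignoring case. The function names (`read_fasta`, `filter_by_name`, `format_fasta`) and the toy records in the demo are made up for illustration and are not part of any BLAST tool.

```python
import io
from typing import Iterable, Iterator, Tuple


def read_fasta(handle: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_name(handle: Iterable[str], keyword: str) -> Iterator[Tuple[str, str]]:
    """Keep records whose header contains `keyword` (case-insensitive)."""
    needle = keyword.lower()
    for header, seq in read_fasta(handle):
        if needle in header.lower():
            yield header, seq


def format_fasta(header: str, seq: str, width: int = 60) -> str:
    """Render one record as FASTA text, wrapping the sequence at `width`."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical toy input; a real run would open the dumped FASTA file.
    toy = io.StringIO(
        ">a1 phospholipase C [Example organism]\nMNRREFLL\n"
        ">a2 histidine decarboxylase [Example organism]\nMKTTAAGG\n"
    )
    for header, seq in filter_by_name(toy, "phosph"):
        print(format_fasta(header, seq), end="")
```

Since the file is read line by line and only one record is held in memory at a time, this should in principle scale to a large dump, although a plain substring match on "phosph" will also pick up any unrelated protein whose description happens to contain that string.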
BLAST • 427 views