Question: Metagenomic Degenerate Primer Design - How to download multiple gene sequences from GenBank?
Asked 5.0 years ago by Tim (United Kingdom):

Hello, everyone. My question probably has a very simple answer and may already have been answered here, but I would really appreciate any help.

I am trying to design a new set of degenerate primers to amplify a gene (pstS) from bacterial metagenomes. Although I have never worked with metagenomic samples before, I understand the steps involved:

  1. Find and download nucleotide sequences of my gene from different bacterial species (GenBank);
  2. Perform a multiple alignment of these nucleotide sequences and/or their protein translations (CLUSTAL, MUSCLE, T-COFFEE etc.);
  3. Identify conserved regions;
  4. Select primer sequences, either manually or using a specialized program (CODEHOP, Primaclade, HYDEN).
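To make steps 3 and 4 concrete, here is a minimal Python sketch of one way to scan an existing nucleotide alignment for primer-length windows with low degeneracy. It is only an illustration, not what CODEHOP, Primaclade or HYDEN actually do: the function names, the 20 bp window and the degeneracy cap of 64 are all assumptions, and the input is assumed to be equal-length aligned sequences read from the output of step 2.

```python
from typing import List, Tuple

# IUPAC nucleotide ambiguity codes, keyed by the set of bases seen in a column.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_consensus(aligned: List[str]) -> str:
    """One IUPAC symbol per column; columns with gaps or non-ACGT symbols become '-'."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    consensus = []
    for column in zip(*aligned):
        bases = frozenset(base.upper() for base in column)
        consensus.append(IUPAC.get(bases, "-"))
    return "".join(consensus)


def degeneracy(primer: str) -> int:
    """Number of distinct plain sequences encoded by a degenerate primer."""
    sizes = {code: len(bases) for bases, code in IUPAC.items()}
    total = 1
    for symbol in primer:
        total *= sizes[symbol]
    return total


def candidate_windows(aligned: List[str], length: int = 20,
                      max_degeneracy: int = 64) -> List[Tuple[int, str, int]]:
    """Return (start, primer, degeneracy) for usable windows (hypothetical thresholds)."""
    consensus = degenerate_consensus(aligned)
    hits = []
    for start in range(len(consensus) - length + 1):
        window = consensus[start:start + length]
        if "-" in window:
            continue  # skip windows that overlap a gap or ambiguous input
        fold = degeneracy(window)
        if fold <= max_degeneracy:
            hits.append((start, window, fold))
    return hits
```

Calling candidate_windows(sequences) on the aligned pstS sequences would list candidate sites; melting temperature, 3' end stability and amplicon size would still need to be checked separately.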

So my question is simple: how can I batch download gene sequences from GenBank? If I use Entrez Nucleotide, it returns every sequence containing pstS, including whole genome sequences, plasmids and so on, and I have no idea how to filter those out. I am happy to use BioPerl/Biopython or any other way of collecting data from GenBank programmatically, but I worry that there is a simpler method that I am missing.

Thank you in advance. I am really struggling with this simple step, and that makes me uncomfortable.
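One possible way to keep whole genomes and plasmids out of the results is to add field restrictions to the Entrez query, in particular a sequence-length range. The sketch below uses the NCBI E-utilities (ESearch and EFetch) through the Python standard library. The function names, the 500 to 3000 bp length window and the retmax value are assumptions chosen for illustration, and the length bounds should be adjusted to the expected size of pstS. Note that each matching record is downloaded whole, so short records carrying more than one gene would still need trimming.

```python
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_query(gene, organism="Bacteria", min_len=500, max_len=3000):
    """Compose an Entrez Nucleotide term limited by gene name, taxon and record length.

    The length bounds are hypothetical defaults, not values from the thread.
    """
    return (f"{gene}[Gene Name] AND {organism}[Organism] "
            f"AND {min_len}:{max_len}[Sequence Length]")


def search_ids(term, retmax=200):
    """Run ESearch and return the matching nucleotide record IDs."""
    params = urllib.parse.urlencode(
        {"db": "nucleotide", "term": term, "retmax": retmax, "retmode": "json"})
    with urllib.request.urlopen(f"{EUTILS}/esearch.fcgi?{params}") as response:
        return json.load(response)["esearchresult"]["idlist"]


def fetch_fasta(ids):
    """Run EFetch for the given IDs and return the records as FASTA text."""
    params = urllib.parse.urlencode(
        {"db": "nucleotide", "id": ",".join(ids),
         "rettype": "fasta", "retmode": "text"})
    with urllib.request.urlopen(f"{EUTILS}/efetch.fcgi?{params}") as response:
        return response.read().decode("utf-8")


def download(gene, path):
    """Search for a gene and write all matching records to a FASTA file."""
    ids = search_ids(build_query(gene))
    with open(path, "w") as handle:
        handle.write(fetch_fasta(ids))
```

Calling download("pstS", "pstS.fasta") would write the matching records to a local file. The same query string can also be pasted directly into the Entrez Nucleotide search box to preview what it returns.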

Answered 5.0 years ago by umer.zeeshan.ijaz (Glasgow, UK):

If you have a locally installed nt or nr database, you can use blastdbcmd from the BLAST+ suite to extract the sequences. For example, the following link shows how to extract 16S rRNA sequences:

http://userweb.eng.gla.ac.uk/umer.ijaz/bioinformatics/oneliners.html?#BLASTDBCMD
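As a rough illustration of the blastdbcmd route (this is not the one-liner from the linked page), the Python sketch below wraps a batch extraction. It assumes BLAST+ is installed and on the PATH, that a local database named nt exists, and that ids.txt is a hypothetical file with one accession per line; the function names are made up for the example.

```python
import subprocess


def blastdbcmd_args(db, id_file, out_fasta):
    """Build the blastdbcmd argument list for a batch FASTA extraction."""
    return [
        "blastdbcmd",
        "-db", db,                # name of the local BLAST database
        "-entry_batch", id_file,  # file with one sequence identifier per line
        "-outfmt", "%f",          # %f = output each entry in FASTA format
        "-out", out_fasta,
    ]


def extract_sequences(db, id_file, out_fasta):
    """Run blastdbcmd and raise CalledProcessError if it exits with an error."""
    subprocess.run(blastdbcmd_args(db, id_file, out_fasta), check=True)
```

For example, extract_sequences("nt", "ids.txt", "pstS_hits.fasta") would write the listed entries to pstS_hits.fasta.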

To download sequences from UniProt (Swiss-Prot, TrEMBL), you can use the extract_fasta_swissprot.py script from

http://userweb.eng.gla.ac.uk/umer.ijaz/bioinformatics/annotation.html?#UNIPROT

Also read section 9 of the Biopython Tutorial and Cookbook:

http://biopython.org/DIST/docs/tutorial/Tutorial.html#sec118

Best Wishes,

Umer

 

 


Thank you, Umer, I will try your suggestions.

Reply written 5.0 years ago by Tim