Find all sequences of a specific genus amplified by a pair of primers
3.3 years ago
Shred ▴ 460

Is there a way to find every sequence in a database that would be amplified by a pair of custom primers? I've designed two primers to amplify a variable region of the genome of a Staphylococcus, and I want to see how many reference sequences would be amplified by this primer pair.

Is there an online approach, or do I need to download every genome of the family Staphylococcaceae and run something locally?

alignment pcr primer sequence • 1.1k views

Hi Shred,

Maybe you can try the tools introduced by this paper. You can find the scripts in the supplemental files.

3.3 years ago
Anima Mundi ★ 2.9k

Hello,

a rapid way to do that is to:

1) reverse-complement (RC) your right primer

2) build your subject database as a FASTA file (if needed)

3) go to NCBI nucleotide BLAST

4) paste your left primer, type a series of Ns right next to it, then paste your RC right primer

5) load your database file (or select an appropriate one; you can also restrict the search to a specific taxon, e.g. Staphylococcus (taxid:1279) or Staphylococcaceae (taxid:90964))

6) adjust BLAST parameters as needed (to allow low-score alignments to be displayed)

7) run BLAST
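Steps 1 and 4 can be scripted. Here is a minimal Python sketch of how the query might be assembled; the primer sequences, the function names and the spacer length of 20 Ns are placeholders of mine, not values from this thread, so adjust them to your expected amplicon.

```python
# Complement table for A/C/G/T plus IUPAC ambiguity codes (upper and lower case).
_COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVNacgtryswkmbdhvn",
    "TGCAYRSWMKVHDBNtgcayrswmkvhdbn",
)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_blast_query(left_primer, right_primer, n_spacer=20):
    """Join left primer, a run of Ns, and the RC of the right primer.

    n_spacer is an arbitrary illustrative default, not a recommended value.
    """
    return left_primer + "N" * n_spacer + reverse_complement(right_primer)


if __name__ == "__main__":
    # Hypothetical primers, for illustration only.
    left = "AGAGTTTGATCCTGGCTCAG"
    right = "GGTTACCTTGTTACGACTT"
    print(">primer_pair_query")
    print(build_blast_query(left, right))
```

The printed FASTA record can be pasted into the BLAST query box as described in step 4.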

Of course, you could also run BLAST locally with a custom database and custom options, building the query as suggested above.
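For the local route, a rough sketch of the two BLAST+ command lines, built as argument lists in Python. The file and database names are hypothetical, and the parameter choices (`-task blastn-short`, a permissive `-evalue`, `-dust no`) are only my assumption of what "allowing low-score alignments" could look like; tune them for your data. The script only prints the commands; pass each list to `subprocess.run` to actually execute it, provided BLAST+ is installed.

```python
import shlex


def makeblastdb_cmd(fasta_path, db_name):
    """Command to build a nucleotide BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "nucl", "-out", db_name]


def blastn_cmd(query_path, db_name, out_path):
    """Command to search a short primer-based query against the database.

    The options below are illustrative assumptions, not tested settings.
    """
    return [
        "blastn",
        "-task", "blastn-short",   # preset intended for short queries
        "-query", query_path,
        "-db", db_name,
        "-evalue", "1000",         # permissive, so weak hits are reported
        "-dust", "no",             # disable low-complexity filtering
        "-outfmt", "6",            # tabular output
        "-out", out_path,
    ]


if __name__ == "__main__":
    # Hypothetical file names.
    print(shlex.join(makeblastdb_cmd("staph_genomes.fasta", "staph_db")))
    print(shlex.join(blastn_cmd("primer_query.fasta", "staph_db", "hits.tsv")))
```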

PS: Staphylococcaceae is a family, not a genus ;).


Thanks. What if I already have an amplicon and I want to build a DB of all the amplicons that can be obtained? I've thought of doing it this way:

1) Build a reference database of the region from every known genome of the family
2) BLAST the amplicon against this DB
3) Use the sstart and send coordinates to trim the sequences in the DB


Could this approach be valid?
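To make the trimming step concrete, here is a minimal Python sketch under these assumptions: the BLAST hits are in the default tabular layout (`-outfmt 6`, where sstart and send are columns 9 and 10), and the subject sequences are already loaded into a dict keyed by sequence ID. The function names and the toy data are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_hits(subjects, blast_lines):
    """Cut the aligned region out of each subject sequence.

    subjects: dict mapping subject ID to its full sequence.
    blast_lines: iterable of tab-separated lines in default outfmt 6.
    Returns a list of (sseqid, sstart, send, region) tuples.
    """
    regions = []
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        sseqid, sstart, send = fields[1], int(fields[8]), int(fields[9])
        seq = subjects[sseqid]
        if sstart <= send:
            # Plus-strand hit; BLAST coordinates are 1-based and inclusive.
            region = seq[sstart - 1:send]
        else:
            # Minus-strand hit: coordinates are reported in reverse order.
            region = reverse_complement(seq[send - 1:sstart])
        regions.append((sseqid, sstart, send, region))
    return regions


if __name__ == "__main__":
    # Toy data, for illustration only.
    subjects = {"genome1": "AAACCCGGGTTT"}
    hits = ["amp\tgenome1\t100.0\t6\t0\t0\t1\t6\t4\t9\t1e-5\t12.0"]
    for sseqid, sstart, send, region in trim_hits(subjects, hits):
        print(f">{sseqid}:{sstart}-{send}")
        print(region)
```

Note that this only keeps the part of each subject that aligned, so a hit covering part of the amplicon gives a shorter fragment than the full amplified region.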


Assuming you performed your PCR on cDNA, checking for amplicons from the genome is merely a way to check for potential sources of genomic contamination. Doing this with your whole amplicon would just return the genomic locations, if any, in which your full amplicon (or part of it) is present. For instance, say you amplified a portion of a human myosin from cDNA encompassing more than one exon: if you built genomic databases of human and chimpanzee and queried them via BLASTn, you would probably end up with alignments to myosin processed pseudogenes.

My feeling is that you perhaps need to clarify what your scientific question is in order to optimize your BLAST strategy. Also, checking for primer specificity is something you would normally do before starting wet-lab experiments.