Question: How to curate my BLAST results
6.3 years ago by
siddharth.avadhanam30 wrote:


I've been trying to design a primer from the 16S gene sequence of Leptospira, to detect any species of Leptospira in the isolate (as the 16S gene sequence is highly conserved), and I've been BLASTing the primers I designed to test them. The problem with the BLAST results is that, of the 100 hits displayed, a good number are repeats. For example, a hit against the whole genome shotgun sequence of the Fiocruz strain is duplicated some 20 times (I'm guessing these were submitted by different workers). Is there any way to avoid this? In particular, it would be great if I could get at least one hit per species, and not more than five. I have no clue how to go about doing this... any ideas?


blast alignment genome • 1.8k views
6.3 years ago by
Yannick Wurm (Queen Mary University London) wrote:

A well-assembled genome sequence should contain each large sequence only once. So either your genome is badly assembled, or each genome contains multiple 16S copies, or your primer sequence lies in a repeat (e.g. a transposon).

You should be able to write a small script to do what you want. 
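As a rough idea of what such a script could look like, here is a minimal Python sketch. It assumes the BLAST hits were saved as tab-separated text with the columns listed in `COLUMNS` (for example by asking BLAST+ for a custom tabular format that includes `sscinames`), and that the species can be approximated by the first two words of the scientific name. The column layout and the function names `species_of` and `cap_hits_per_species` are made up for illustration, so adjust them to your own output.

```python
from collections import defaultdict

# Assumed (hypothetical) column layout of the tab-separated BLAST output,
# e.g. requested with: -outfmt "6 qseqid sseqid pident length evalue bitscore sscinames"
COLUMNS = ["qseqid", "sseqid", "pident", "length", "evalue", "bitscore", "sscinames"]


def species_of(scientific_name):
    """Approximate the species as 'Genus species' (first two words of the name)."""
    return " ".join(scientific_name.split()[:2])


def cap_hits_per_species(lines, max_per_species=5):
    """Keep the best-scoring hits, at most max_per_species per (query, species)."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        rows.append(dict(zip(COLUMNS, line.split("\t"))))

    # Best hits first; the sort is stable, so ties keep their input order.
    rows.sort(key=lambda r: float(r["bitscore"]), reverse=True)

    kept = []
    seen_pairs = set()          # (query, subject) pairs already kept
    counts = defaultdict(int)   # hits kept so far per (query, species)
    for row in rows:
        pair = (row["qseqid"], row["sseqid"])
        if pair in seen_pairs:
            continue  # same subject sequence reported more than once
        key = (row["qseqid"], species_of(row["sscinames"]))
        if counts[key] >= max_per_species:
            continue  # this species already has enough hits for this primer
        seen_pairs.add(pair)
        counts[key] += 1
        kept.append(row)
    return kept
```

To use it, pass an open file, e.g. `cap_hits_per_species(open("hits.tsv"))`, where `hits.tsv` is a placeholder file name. Note that this only trims what BLAST returned: a species that is absent from the hit list cannot be recovered, so you may need to request more than 100 hits before filtering.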

I think, on average, a bacterial genome contains about five 16S copies. Here's a recent paper on the topic.

I'm sorry, maybe I wasn't being clear. I'm using single 16S gene sequences from the NCBI website, not a genome. I compiled them into a single FASTA file and generated a multiple alignment with Clustal. After that, I ran a conserved region through Primer-BLAST, identified primers, and finally BLASTed those primers. The problem I mentioned above is with the BLAST results for these primers.

