Question: Designing primers by mapping short NGS reads to reference genes
5.3 years ago by
bioinfo 790 wrote:

We would like to run PCRs for 4 genes (e.g. blaX, tetX) that were identified in metagenomic sequencing data. To design primers and set up the PCRs, we need the sequences of those genes as they occur in the sample (from the metagenome). To find regions suitable for primer design, we need not only a consensus sequence but also the positions where different reads carry different bases, because multiple variants of each gene are present in the sample.

I now have the protein (aa) sequences of the 4 reference genes (~300 aa each) and over 6000 short metagenomic reads (100 bp, nt) that matched those genes. I used blastx to find the matches:

blastall -p blastx -i reads.6756.fasta -d ref.fasta -a 16 -F F -e 0.000001 -m 8 -o BLAST.out -K 1 -b 1
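
For reference, the -m 8 tabular output written by the command above has 12 tab-separated columns (query, subject, % identity, alignment length, mismatches, gap opens, q.start, q.end, s.start, s.end, e-value, bit score). A minimal Python sketch for grouping the matched reads by reference gene might look like the following. The function name `reads_by_gene` is hypothetical, and the strand rule (q.start > q.end means the hit is on the reverse strand of the read) is an assumption about how blastx reports reverse-frame hits; check it against a few lines of BLAST.out before relying on it.

```python
import csv
from collections import defaultdict

# Column order of the 12-field BLAST tabular format (-m 8).
FIELDS = ["query", "subject", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def reads_by_gene(lines):
    """Group hits by reference (subject) name.

    `lines` is any iterable of text lines, e.g. an open BLAST.out file.
    Returns {subject: [(read_id, strand, sstart, send), ...]} where
    sstart/send are amino-acid coordinates on the reference protein.
    """
    hits = defaultdict(list)
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < len(FIELDS) or row[0].startswith("#"):
            continue  # skip blank, truncated, or comment lines
        rec = dict(zip(FIELDS, row))
        qstart, qend = int(rec["qstart"]), int(rec["qend"])
        # Assumption: reverse-frame blastx hits have qstart > qend.
        strand = "+" if qstart <= qend else "-"
        hits[rec["subject"]].append(
            (rec["query"], strand, int(rec["sstart"]), int(rec["send"]))
        )
    return dict(hits)
```

With `open("BLAST.out")` passed as `lines`, the result gives one list of read IDs per gene. Reads marked "-" would need to be reverse-complemented before they are combined into a nucleotide consensus in gene orientation.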

Then I tried to convert the blastx output to SAM or GFF format (Error Running Blast2Sam.Pl) so that I can get the alignments, consensus sequences and SNPs, and visualize them. Any suggestions?

blast sam samtools alignment gff • 1.5k views
modified 5.3 years ago • written 5.3 years ago by bioinfo 790

The output of blastx is a pairwise protein alignment; how could it be converted to SAM?

written 5.3 years ago by Pierre Lindenbaum 131k
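
To make the mismatch concrete: blastx reports subject coordinates in amino acids, while SAM positions are 1-based nucleotide coordinates on a nucleotide reference. The sketch below shows only the coordinate arithmetic, under the assumption that the nucleotide reference is exactly the coding sequence of the protein (starting at the first codon, no introns). The function name `aa_span_to_nt` is hypothetical. Even with converted coordinates, the per-base alignment that SAM needs is still missing from a protein-level hit.

```python
def aa_span_to_nt(aa_start, aa_end):
    """Convert a 1-based inclusive amino-acid span to the 1-based
    inclusive nucleotide span on the corresponding coding sequence.

    Assumes codon 1 of the CDS encodes amino acid 1.
    """
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    nt_start = 3 * (aa_start - 1) + 1  # first base of the first codon
    nt_end = 3 * aa_end                # last base of the last codon
    return nt_start, nt_end
```

For example, a hit covering amino acids 10 to 42 spans 33 codons, i.e. 99 nucleotides of the coding sequence.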

I guess converting the blastx output (BLAST.out) to SAM is what I need. I tried two different scripts posted above (link), but neither worked. I remember you posted a Blast2Sam script in C some time ago, "written by your student". I will try that one. Where exactly did you post it? Does it convert only blastn output to SAM, or any BLAST output?

modified 5.3 years ago • written 5.3 years ago by bioinfo 790

but again, those tools expect a DNA input, not a protein.

written 5.3 years ago by Pierre Lindenbaum 131k

OK. Then option 1: I get the nucleotide sequences of my reference proteins from NCBI, run blastn against them, and then convert the blast output to SAM format.

Option 2: as I already have the 6000 short metagenomic reads that matched the reference proteins last time, I can map those reads directly to the newly obtained reference genes (nt) from NCBI using bowtie or a similar aligner, and create the SAM file that way.

modified 11 months ago by RamRS 30k • written 5.3 years ago by bioinfo 790
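
Once either option has produced a SAM file against the nucleotide references, the positions where reads disagree can be read off a per-column base count. The following is a minimal pure-Python sketch of that idea, not a replacement for an established pileup or variant-calling tool. The function names and the thresholds `min_depth` and `min_minor_frac` are hypothetical illustration values; base qualities, mapping quality, and insertions are ignored.

```python
import re
from collections import Counter, defaultdict

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def pileup(sam_lines):
    """Return {ref_name: {0-based ref position: Counter of bases}}."""
    cols = defaultdict(lambda: defaultdict(Counter))
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        f = line.rstrip("\n").split("\t")
        flag, ref, pos, cigar, seq = int(f[1]), f[2], int(f[3]), f[5], f[9]
        if flag & 4 or cigar == "*" or seq == "*":
            continue  # unmapped read or no alignment/sequence stored
        ref_pos, read_pos = pos - 1, 0  # SAM POS is 1-based
        for n, op in CIGAR_RE.findall(cigar):
            n = int(n)
            if op in "M=X":  # consumes both reference and read
                for i in range(n):
                    cols[ref][ref_pos + i][seq[read_pos + i].upper()] += 1
                ref_pos += n
                read_pos += n
            elif op in "IS":  # consumes read only
                read_pos += n
            elif op in "DN":  # consumes reference only
                ref_pos += n
            # H and P consume neither
    return cols


def variable_positions(columns, min_depth=4, min_minor_frac=0.2):
    """List (position, base counts) where a second base is frequent."""
    out = []
    for pos in sorted(columns):
        counts = columns[pos]
        depth = sum(counts.values())
        if depth < min_depth:
            continue
        ranked = counts.most_common(2)
        if len(ranked) > 1 and ranked[1][1] / depth >= min_minor_frac:
            out.append((pos, dict(counts)))
    return out
```

Calling `variable_positions(pileup(open("reads.sam"))["blaX"])` (file and gene names are placeholders) would list the columns to avoid, or to cover with degenerate bases, when placing primers.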