Designing primers by mapping short NGS reads to reference genes
0
0
Entering edit mode
7.2 years ago
bioinfo ▴ 830

We would like to do some PCRs for 4 genes (e.g. blaX, tetX etc.) that were identified from the metagenomic sequencing data. In order to design primers and set up the PCRs, we need the sequences (from metagenome) of that gene present in the sample. To be able to find regions suitable for primer design, it is important for us, not only to get consensus sequence, but also to get information about positions where different bases are found in different reads due to presence of multiple variants of the respective gene in the sample.

Now I have protein (aa) sequences of the 4 reference genes (~300 aa each) and short 100 bp (nt) over 6000 reads from metagenomes that matched those genes. I used blastx to do that.

blastall -p blastx -i reads.6756.fasta -d ref.fasta -a 16 -F F -e 0.000001 -m 8 -o BLAST.out -K 1 -b 1

Then I tried to convert the blastx output to sam or gff format (Error Running Blast2Sam.PlError Running Blast2Sam.Pl) so I can get the alignments, the consensus sequences, SNPs and visualize it. Any suggestions?

blast sam samtools gff alignment • 2.0k views
ADD COMMENT
0
Entering edit mode

The outpout of blastx is a proteic pairwise alignment, how could it be converted to SAM ?

ADD REPLY
0
Entering edit mode

I guess blastx output (BLAST.out) to SAM that what I need. I tried with two different scripts posted above (link) but didn't work. I remember you posted Blast2Sam script in C some time ago "written by your student". I will try with that. Where exactly did you post that one? Was that to convert  blastn output to Sam or any blast output to Sam?

ADD REPLY
0
Entering edit mode

https://github.com/guyduche/Blast2Bam but again, those tools expect a DNA input, not a protein.

ADD REPLY
0
Entering edit mode

Ok. Then option 1. I have to get the Nucleotide sequences of my Reference proteins from NCBI then run the blastn followed by conversion from blast output to Sam format.

Option 2. as I already have the 6000 short reads that matched to the reference protein last time, I can directly map those short 6000 metagenomic matched reads using bowtie or similar aligner to newly obtained reference genes (nt) from NCBI and create the Sam file.

ADD REPLY

Login before adding your answer.

Traffic: 1215 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6