2.9 years ago
Saran
Hello,
I have PCR-amplified a specific region that will align to either amplicon #1 or amplicon #2, depending on the type of virus infection. What is the best way to run my fastq files against the two amplicons and get the percentage of reads that aligned to #1 versus #2?
Thank You, Sara
What is the size of these amplicons? Are you only looking for full length alignments? What sequencing tech (long/short reads) are these from? Is there any sequence similarity between amplicons?
I am working with Illumina paired-end adapter sequences spiked in on a 2x150 run. The two reference amplicons are 173 and 179 bp and share some sequence, so I suppose I would need to be stringent about the number of mismatches allowed.
1: aaaaagtataaatataggaccaggcagagcattttatacaacaggagaaataataggagatataagacaagcacattgtaaccttagtagagcaaaatggaatgacactttaaataagatagttataaaattaagagaacaatttgggaataaaacaatagtctttaagcact
2: aaaaagtatccgtatccagaggggaccagggagagcatttgttacaataggaaaaataggaaatatgagacaagcacattgtaacattagtagagcaaaatggaatgccactttaaaacagatagctagcaaattaagagaacaatttggaaataataaaacaataatctttaagcaat
We want to know whether one virus wins over the other during infection, based on the differences between these two sequences. Essentially, we need the percentage of reads that align best to #1 and the percentage that align best to #2.
I think you can try to create a single long read from each pair of PE reads (these should overlap since your amplicons are only ~170 bp) by using a tool like bbmerge.sh or FLASH, and then align the merged reads to the two amplicons independently. You will need to play with the stringency parameters to test the alignments. Try out bbmap.sh and adjust the minid= option when you align.
Thank You for the advice!
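For anyone who wants to sanity-check the aligner output, the "percentage that align best to #1 versus #2" step could be sketched in plain Python roughly as below. This is only an illustration, not what BBMap does internally: it assumes the pairs have already been merged into single reads, it uses a simple edit distance in place of a real aligner, and the names classify, percentages, read_fastq_sequences and the min_identity cutoff (loosely analogous in spirit to minid=) are hypothetical names made up for this example. The 0.95 default is an arbitrary starting point, not a recommendation.

```python
from collections import Counter

# The two reference amplicons posted in this thread.
AMPLICONS = {
    "amplicon_1": "aaaaagtataaatataggaccaggcagagcattttatacaacaggagaaataataggagatataagacaagcacattgtaaccttagtagagcaaaatggaatgacactttaaataagatagttataaaattaagagaacaatttgggaataaaacaatagtctttaagcact".upper(),
    "amplicon_2": "aaaaagtatccgtatccagaggggaccagggagagcatttgttacaataggaaaaataggaaatatgagacaagcacattgtaacattagtagagcaaaatggaatgccactttaaaacagatagctagcaaattaagagaacaatttggaaataataaaacaataatctttaagcaat".upper(),
}


def edit_distance(a, b):
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # match or substitution
        prev = curr
    return prev[-1]


def identity(read, ref):
    """Crude identity: 1 - edit distance / longer of the two lengths."""
    return 1.0 - edit_distance(read, ref) / max(len(read), len(ref))


def classify(read, min_identity=0.95):
    """Return the best-matching amplicon name, or 'unassigned'.

    min_identity is a hypothetical cutoff chosen for this sketch.
    Reads below the cutoff, or tied between amplicons, are left unassigned.
    """
    read = read.upper()
    scores = {name: identity(read, ref) for name, ref in AMPLICONS.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    best = ranked[0]
    tied = len(ranked) > 1 and scores[ranked[1]] == scores[best]
    if scores[best] < min_identity or tied:
        return "unassigned"
    return best


def percentages(reads, min_identity=0.95):
    """Percentage of reads assigned to each category."""
    counts = Counter(classify(r, min_identity) for r in reads)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()} if total else {}


def read_fastq_sequences(path):
    """Yield the sequence line of each 4-line record in an uncompressed FASTQ."""
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                yield line.strip()
```

Usage would be something like percentages(read_fastq_sequences("merged.fastq")), where merged.fastq is a placeholder name for the merged-read output. The pure-Python edit distance is slow on a full run, so this is better suited to checking a subsample than to replacing the aligner.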