Question: Find all perfect matches of short sequences in the human reference genome
Mick wrote:

Hi guys,

I'm trying to get something done that sounds super simple, but I'm stuck and I don't know how to go about it.

I have a large Excel file with primer sequences that someone designed for the human genome, and I need to find out which part of the genome they cover. All I have is the sequence of the forward and reverse primer. So ideally I'd like to align the forward and reverse primers to the genome. I only want perfect matches, because that is how the primers were designed.

There should be a Linux tool to do this, right? Maybe BLAST or bowtie? If anyone could point me in the right direction, I would much appreciate it.

Thanks :)

modified 20 days ago • written 21 days ago by Mick

BBMap has the following modes that might be of use (both are off, f, by default):

perfectmode=f           Allow only perfect mappings when set to true (very fast).
semiperfectmode=f       Allow only perfect and semiperfect (perfect except for N's in the reference) mappings.
written 21 days ago by jean.elbers

Bowtie is perfectly fine. I would set all mismatch and gap penalty parameters very high (e.g. 1000), set seed mismatches to zero, etc., to ensure only perfect matches are returned.

written 21 days ago by ATpoint

Thank you for the quick reply, I'll let you know how it works :)

written 21 days ago by Mick

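For bowtie I think one could use -v 0 -k 1 -m 1 --best --strata. Note that -v and -n are mutually exclusive, so -v 0 alone is what requests zero mismatches, and that -m 1 suppresses any sequence with more than one reportable hit; use -a instead if you want every perfect match. For bowtie2 I used in the past (but this was for NGS perfect matches, not short primers) --end-to-end -N 0 --mp 10000 --np 10000 --rdg 10000 --rfg 10000. Try things out a bit :)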

modified 20 days ago • written 21 days ago by ATpoint

Depending on how many sequences there are, just use the UCSC in silico PCR tool: http://genome.ucsc.edu/cgi-bin/hgPcr?db=hg38.
You can also use the BLAT web interface for a more flexible search: http://genome.ucsc.edu/cgi-bin/hgBlat?command=start
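
To illustrate the idea behind an in silico PCR search, here is a minimal Python sketch. It is not how the UCSC tool is implemented; it assumes a single reference sequence already loaded into memory as a string, and the function names and the `max_product` default are made up for this example. It looks for the forward primer verbatim on one strand and the reverse complement of the reverse primer downstream of it, then repeats with the roles swapped to cover the other strand.

```python
# Complement table for unambiguous DNA bases (IUPAC codes are not handled).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(text, pattern):
    """Return the 0-based start of every exact match, overlaps included."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def in_silico_pcr(reference, forward, reverse, max_product=4000):
    """Return sorted (start, end) products, 0-based with end exclusive.

    A product is reported when one primer matches exactly and the reverse
    complement of the other matches exactly at or after it, with the whole
    span no longer than max_product (a hypothetical default).
    """
    reference = reference.upper()
    products = set()
    # Try forward on the plus strand first, then the swapped orientation.
    for left, right in ((forward, reverse), (reverse, forward)):
        left = left.upper()
        right_rc = reverse_complement(right.upper())
        right_hits = find_all(reference, right_rc)
        for start in find_all(reference, left):
            for right_start in right_hits:
                end = right_start + len(right_rc)
                if right_start >= start and end - start <= max_product:
                    products.add((start, end))
    return sorted(products)
```

For a whole genome one would call `in_silico_pcr` once per chromosome; for more than a handful of primer pairs an indexed aligner will be far faster than this linear scan.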

modified 21 days ago • written 21 days ago by genomax
Mick wrote:

Thank you guys so much for all the replies. I tried bowtie and used these parameters:

bowtie hg19 -v 0 -a -X 800 -r1 "sequences1.txt" -r2 "sequences2.txt"
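
With `-r`, bowtie reads raw sequences, one per line, and with `-1`/`-2` the n-th line of each file forms a pair. A minimal Python sketch for producing two such files follows. It assumes the Excel sheet was first exported to CSV; the column names `forward` and `reverse` and the function name are hypothetical and need adapting to the real spreadsheet.

```python
import csv


def write_raw_primer_files(csv_path, fwd_out, rev_out,
                           fwd_col="forward", rev_col="reverse"):
    """Write forward and reverse primers to two raw, line-paired files.

    Column names are assumptions about the CSV export. Rows missing either
    primer are skipped so both files stay in step; returns the pairs written.
    """
    written = 0
    with open(csv_path, newline="") as src, \
            open(fwd_out, "w") as fwd, open(rev_out, "w") as rev:
        for row in csv.DictReader(src):
            # Drop all whitespace inside the cell and normalise to upper case.
            f_seq = "".join((row.get(fwd_col) or "").split()).upper()
            r_seq = "".join((row.get(rev_col) or "").split()).upper()
            if not f_seq or not r_seq:
                continue
            fwd.write(f_seq + "\n")
            rev.write(r_seq + "\n")
            written += 1
    return written
```

Calling `write_raw_primer_files("primers.csv", "sequences1.txt", "sequences2.txt")` would give files matching the names in the command above. Because incomplete rows are skipped, line numbers in the output do not necessarily correspond to spreadsheet rows.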

It worked really well for the most part. It only missed a couple of primers, for which it found no alignment. I manually searched for the location of these primers with BLAST. They were perfect matches to the genome, so I don't really know why bowtie missed them. Anyhow, thank you guys so much for the help. :)
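
One possible explanation, not confirmed in this thread: in paired mode bowtie only reports a pair when both mates align on the same chromosome, on opposite strands facing each other, and within the `-X` limit (800 here). A primer pair whose product is longer than that would come back unaligned even though each primer matches perfectly on its own. Assuming single-primer hits are already available (for example from a separate unpaired run), a minimal Python sketch of that check could look like this; the `Hit` type and the function names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    chrom: str
    strand: str  # "+" or "-"
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive


def is_concordant(a, b, max_insert=800):
    """True if two hits face each other within max_insert on one chromosome."""
    if a.chrom != b.chrom or a.strand == b.strand:
        return False
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    if plus.start > minus.start:
        return False  # the mates point away from each other
    # The fragment spans from the plus-strand start to the minus-strand end.
    return minus.end - plus.start <= max_insert


def diagnose_pair(fwd_hits, rev_hits, max_insert=800):
    """Classify a primer pair from the single-primer hits of each primer."""
    if not fwd_hits or not rev_hits:
        return "at least one primer has no perfect match"
    for f in fwd_hits:
        for r in rev_hits:
            if is_concordant(f, r, max_insert):
                return "concordant pair found"
    return "both primers match, but no pair is concordant"
```

If the last message comes up for the missed primers, raising `-X` or aligning the two files separately in unpaired mode would be worth trying. Comparing against a different assembly in BLAST than the hg19 index used here could also produce a discrepancy.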

written 20 days ago by Mick