bowtie options for exact match with reference short sequence
8.8 years ago
manekineko ▴ 150

Hi, I am using bowtie with short sequences of 20 nt. Is there a way to map and report only hits that match exactly and over the full length, i.e. identical sequences? For example, if the reference sequence is 20 nt, I want to report only exact matches of 20 nt, and not exact matches of 19 or 18 nt.

bowtie exact option • 3.5k views

Just to make sure: bowtie or bowtie2?


bowtie

I just want to detect the identical sequences - no mismatches and same length.

8.8 years ago

I wrote a program called Dedupe (part of BBTools) that might accomplish what you're trying to do. It finds all exactly matching sequences. The default mode is to remove all but one copy of exactly matching sequences, but you can also tell it to remove all nonunique sequences, which allows you to do this:

dedupe.sh in=reads.fq ac=f out=a.fq
dedupe.sh in=ref.fq ac=f out=b.fq

Now a.fq and b.fq contain only 1 copy of each sequence.

dedupe.sh in=a.fq,b.fq ac=f out=unique.fq uniqueonly

unique.fq contains only sequences that are not shared between a.fq and b.fq.

dedupe.sh in=a.fq,unique.fq ac=f out=shared.fq uniqueonly

shared.fq will now contain all the sequences that are shared between a.fq and b.fq, which is hopefully what you are looking for.
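As an independent cross-check on shared.fq, the same question (which sequences occur in both files, with identical bases and identical length) can be sketched with standard Unix tools. This is not part of BBTools; the function name shared_seqs is made up for illustration, and the sketch assumes plain 4-line FASTQ records, case-sensitive comparison, and no reverse-complement matching. It prints bare sequences rather than full FASTQ records.

```shell
# shared_seqs READS.fq REF.fq   (hypothetical helper, not a BBTools command)
# Prints every sequence that occurs in both FASTQ files, one per line.
# Assumes 4-line FASTQ records; lines are compared whole, so a 19 nt
# sequence never matches a 20 nt one.
shared_seqs() {
    reads_tmp=$(mktemp) || return 1
    ref_tmp=$(mktemp) || return 1
    # Line 2 of each 4-line record is the sequence; sort -u keeps one copy of each.
    awk 'NR % 4 == 2' "$1" | LC_ALL=C sort -u > "$reads_tmp"
    awk 'NR % 4 == 2' "$2" | LC_ALL=C sort -u > "$ref_tmp"
    # comm -12 prints only the lines present in both sorted inputs.
    LC_ALL=C comm -12 "$reads_tmp" "$ref_tmp"
    rm -f "$reads_tmp" "$ref_tmp"
}
```

Calling it as shared_seqs reads.fq ref.fq > shared.txt should give a list that can be compared against the sequences in shared.fq.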

For some reason SourceForge is down right now, but you can download it here: https://drive.google.com/open?id=0B3llHR93L14wTzZSMjZvdjdmYzg


How can I tell your tool to extract only the exactly matching reads? I have asked this question in C: Count exact match reads in SAM file
