Exact Matching With Bowtie, Blat And Blast+
8.1 years ago

I am running bowtie with the following parameters, to look for up to, say, 10 exact matches of a 36-base nucleotide string to a GRCh37/hg19 index, _e.g._:

$ bowtie -S hg19 -v 0 -k 10 -f sequence.fa > hits.sam

As a sanity check, I sample the 36 base sequence from the same assembly of hg19 (using the same FASTA files used to create the bowtie index) in order to verify that I receive all matches, using UCSC BLAT and NCBI BLAST searches as confirmation.
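An independent check of this kind can also be done without any aligner, by scanning the sequence directly for the query and its reverse complement. The following is only a minimal sketch, not part of the original workflow: the function names `reverse_complement` and `find_exact` are hypothetical, and it assumes each chromosome has already been read from the FASTA file into a plain string (one chromosome at a time, to keep memory use manageable).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def find_exact(reference, query):
    """List every exact occurrence of query in reference, on both strands.

    Returns sorted (start, end, strand) tuples with 0-based, half-open
    coordinates on the forward strand. Overlapping occurrences are reported.
    Soft-masked (lowercase) reference bases are treated like uppercase.
    """
    ref = reference.upper()
    forward = query.upper()
    patterns = {"+": forward}
    reverse = reverse_complement(forward)
    # A palindromic query matches itself on the reverse strand;
    # report such sites once, as forward-strand hits.
    if reverse != forward:
        patterns["-"] = reverse

    hits = []
    for strand, pattern in patterns.items():
        pos = ref.find(pattern)
        while pos != -1:
            hits.append((pos, pos + len(pattern), strand))
            # Advance by one base so overlapping matches are not skipped.
            pos = ref.find(pattern, pos + 1)
    return sorted(hits)
```

Comparing the number of hits from such a scan with the number of lines bowtie reports is one way to tell whether `-k 10` is truncating the result set.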

Some questions:

  1. The docs say that bowtie accepts read lengths with an upper bound of 1000 bp. In practice, what is the lower bound of query sequence lengths that it will accept and reliably align?

  2. Is the -oneOff parameter to the BLAT command-line tool used to limit mismatches to 0?

  3. Is there a way to translate accession code hits from an NCBI BLAST search to genomic coordinates (chromosome, start, stop)?

  4. Is there a parameter to limit BLAST+ command-line tool searches for hits that are the same length as the query sequence, or otherwise limit results to exact matches to the query sequence?

blast bowtie blat blast+ • 2.9k views

Hmm... You know my post on the evaluation of finding all hits. I actually wrote that post in particular for you, but you do not trust me. Interesting.


Actually, I probably missed part of your post which addresses some of these questions. I apologize for my oversight and will take another look.


Hey, I re-read your post, and it does address some aspects of my question. Thanks for reminding me about it.

8.1 years ago
Dan ▴ 520

  1. According to http://wwwdev.ebi.ac.uk/fg/hts_mappers/, the minimum read length is 4.

  2. According to the BLAT FAQ (http://genomewiki.ucsc.edu/index.php/Blat-FAQ), you could try -oneOff=0.

  3. Not that I know of. "Accession code hits": I find that phrase a bit confusing.

  4. Not that I know of. Maybe check here: http://genome.ucsc.edu/goldenPath/help/blatSpec.html ?
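On the BLAT side, as far as I recall from the usage text, -oneOff only controls whether a mismatch is tolerated in the seed tile, so on its own it may not restrict the output to exact matches. A post-hoc filter on the PSL output is one way around that. This is a minimal sketch, assuming standard 21-column, tab-separated PSL from a nucleotide search; the function name `exact_psl_hits` is hypothetical.

```python
def exact_psl_hits(psl_text):
    """Keep PSL rows where the whole query aligns with no mismatch or gap.

    Returns (tName, tStart, tEnd, strand) tuples. PSL coordinates are
    0-based and half-open, so tEnd - tStart equals the query length here.
    """
    hits = []
    for line in psl_text.splitlines():
        fields = line.split("\t")
        # Skip the optional PSL header and any malformed or blank lines.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches = int(fields[0])
        mismatches = int(fields[1])
        rep_matches = int(fields[2])   # matches that fall in repeats
        q_inserts = int(fields[4])     # qNumInsert: gaps in the query
        t_inserts = int(fields[6])     # tNumInsert: gaps in the target
        q_size = int(fields[10])
        full_length = matches + rep_matches == q_size
        if full_length and mismatches == 0 and q_inserts == 0 and t_inserts == 0:
            hits.append((fields[13], int(fields[15]), int(fields[16]), fields[8]))
    return hits
```

Requiring `matches + repMatches == qSize` also rejects hits that cover only part of the query or that run across N bases, since those are counted in a separate column.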

8.1 years ago

Is there a parameter to limit BLAST+ command-line tool searches for hits that are the same length as the query sequence, or otherwise limit results to exact matches to the query sequence?

As far as I know, this has to be done post hoc. Your first search is glocal, as you know (the whole read is aligned end to end against a local stretch of the genome), and that is not BLAST's modus operandi.
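One way to do that post-hoc step is to filter the tabular output. The sketch below assumes the search was run with `-outfmt 6` and its default twelve columns (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and that the query length is known, for example 36. The function name `full_length_exact_hits` is hypothetical. If the BLAST database was built from the chromosome FASTA files themselves, which is an assumption here, then sseqid, sstart and send already are the chromosome and 1-based genomic coordinates.

```python
def full_length_exact_hits(tabular_text, query_length):
    """Keep BLAST tabular rows that match the whole query exactly.

    Returns (sseqid, start, end, strand) tuples with 1-based, inclusive
    coordinates, start <= end. BLAST reports minus-strand hits with
    sstart > send, so the strand is derived from that ordering.
    """
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        sseqid = fields[1]
        length, mismatch, gapopen = int(fields[3]), int(fields[4]), int(fields[5])
        qstart, qend, sstart, send = (int(x) for x in fields[6:10])
        covers_query = qstart == 1 and qend == query_length
        if covers_query and length == query_length and mismatch == 0 and gapopen == 0:
            strand = "+" if sstart <= send else "-"
            hits.append((sseqid, min(sstart, send), max(sstart, send), strand))
    return hits
```

Checking the integer columns rather than the percent identity avoids depending on how pident is rounded in the output.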
