Question: Faster BLAST alternative
1
4.2 years ago by
igor8.3k
United States
igor8.3k wrote:

Are there any faster alternatives to BLAST (specifically nucleotide BLAST)? I like that I can search through all GenBank+EMBL+DDBJ+PDB+RefSeq sequences (the nt collection), but I feel like there must be a faster way. If I wanted to identify thousands or millions of sequences, it would be somewhat inefficient.

sequencing blast alignment • 6.6k views
modified 3.4 years ago • written 4.2 years ago by igor8.3k
4
3.4 years ago by
igor8.3k
United States
igor8.3k wrote:

I guess what I was really asking for is a metagenomic classifier. There are a few of those out there:

Sure, these are not exactly the same as BLAST, but if you need to classify a lot of reads quickly, these tools will do that.

written 3.4 years ago by igor8.3k

I see this is old, but why answer with something that wasn't asked? The question is about BLAST alternatives, and you answer with metagenomic classification tools?

written 7 weeks ago by matteo.brilli.bip0

The answer was marked as "accepted", so it seems to be sufficiently relevant.

written 7 weeks ago by igor8.3k
1
4.2 years ago by
5heikki8.5k
Finland
5heikki8.5k wrote:

If you have millions of query sequences, it's not a bad idea to cluster them and BLAST only the representative sequences. Furthermore, with millions of query sequences and no compute cluster at hand, it might be a good idea to select a smaller reference database such as UniRef90, but this depends on your research questions. I think DIAMOND is one of the most recent BLAST alternatives. As far as I recall, the authors review some other alternatives in the article (I don't have access from home). If you only want nucleotide-nucleotide searches, another option would be BLAT.
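
To illustrate the "cluster first, search only the representatives" idea, here is a minimal Python sketch. It is not a real clustering method: it only collapses exact duplicate sequences (dereplication), which is a much cruder stand-in for a similarity-based clustering tool. The function name `dereplicate` and the toy reads are made up for illustration.

```python
from collections import defaultdict


def dereplicate(records):
    """Group (name, sequence) pairs by exact, case-insensitive sequence.

    Returns a dict mapping each representative name (the first name seen
    for a given sequence) to the list of all names sharing that sequence.
    """
    rep_for_seq = {}             # sequence -> representative name
    members = defaultdict(list)  # representative name -> member names
    for name, seq in records:
        rep = rep_for_seq.setdefault(seq.upper(), name)
        members[rep].append(name)
    return dict(members)


# Toy input (hypothetical reads, not real data).
reads = [("r1", "ACGT"), ("r2", "acgt"), ("r3", "TTGA"), ("r4", "ACGT")]
clusters = dereplicate(reads)

# Only the keys (representatives) would need to be searched; hits can be
# propagated back to the other members afterwards.
for rep, names in clusters.items():
    print(rep, len(names), names)
```

With real data the representatives would be written back out as FASTA and used as the query set, so the search cost scales with the number of distinct sequences and not with the number of reads.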

modified 4.2 years ago • written 4.2 years ago by 5heikki8.5k
1

Unfortunately, DIAMOND is for protein (not nucleotide) alignment.

Blat is a good suggestion. Not sure how easy it would be to summarize the results.

written 4.2 years ago by igor8.3k
1

Well, DIAMOND is blastx-like, so nucleotide-vs-protein, which is almost always better than nucleotide-nucleotide if you want to detect putative homologs. You haven't really told us anything about your research questions or the type of your query sequences (length, source, etc.), so it's hard to say. Also, BLAT has an output option that is similar to tabular BLAST output, which is the way to go IMO.
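
On summarizing the results: tabular output is easy to reduce to one best hit per query. The sketch below assumes the standard 12-column tabular layout (as produced by BLAST's `-outfmt 6` or BLAT's `-out=blast8`) and picks the hit with the highest bit score for each query. The function names and the example rows are invented for illustration.

```python
import csv
import io
from collections import Counter

# Standard 12-column tabular layout (assumed; custom column sets differ).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle):
    """Return {query: (subject, bitscore)}, keeping the top-scoring hit."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        score = float(hit["bitscore"])
        query = hit["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (hit["sseqid"], score)
    return best


def subject_counts(best):
    """Count how many queries have each subject as their best hit."""
    return Counter(subject for subject, _ in best.values())


# Made-up rows standing in for a real results file.
example = io.StringIO(
    "read1\tsubjA\t98.0\t100\t2\t0\t1\t100\t5\t104\t1e-40\t180\n"
    "read1\tsubjB\t90.0\t100\t10\t0\t1\t100\t9\t108\t1e-20\t90\n"
    "read2\tsubjA\t97.0\t90\t3\t0\t1\t90\t20\t109\t1e-35\t150\n"
    "read3\tsubjC\t85.0\t60\t9\t0\t1\t60\t1\t60\t1e-10\t60\n"
)
hits = best_hits(example)
counts = subject_counts(hits)
for subject, n in counts.most_common():
    print(subject, n)
```

For a real file, `open("results.tsv")` (a hypothetical path) would replace the `io.StringIO` object. Mapping subject IDs to organism names is a separate step that is not covered here.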

modified 4.2 years ago • written 4.2 years ago by 5heikki8.5k

Sorry if I was being too vague. I am trying to identify contaminants in raw sequencing data. For example, the reads should be human, but only 50% align to human. What are the other reads? I can check some likely contaminants, but I'd like to check against all known sequences.

written 4.2 years ago by igor8.3k

If I were you, I would take a small subsample of the reads that do not map to human and BLAST them against nt to see what is going on.
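
A rough Python sketch of the subsampling step, assuming the unmapped reads are already in a plain 4-line-per-record FASTQ file (no wrapped sequence lines). It uses reservoir sampling so the whole file never has to be held in memory, and converts the sample to FASTA for use as a query. All function names are made up for this example.

```python
import io
import random


def read_fastq(handle):
    """Yield 4-line FASTQ records as tuples of newline-stripped lines."""
    while True:
        record = [handle.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:
            return  # end of file (an empty header line also stops reading)
        yield tuple(record)


def reservoir_sample(records, k, seed=0):
    """Return up to k records chosen uniformly at random in one pass."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = rec
    return sample


def to_fasta(records):
    """Convert FASTQ records to FASTA text, dropping the quality lines."""
    return "".join(f">{header[1:]}\n{seq}\n" for header, seq, _, _ in records)


# Synthetic reads standing in for a real file of unmapped reads.
demo = io.StringIO("".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(10)))
subset = reservoir_sample(read_fastq(demo), k=3)
print(to_fasta(subset))
```

A few hundred to a few thousand reads is usually enough to see the dominant sources; the value of `k` here is arbitrary.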

written 4.2 years ago by 5heikki8.5k

See this thread: http://seqanswers.com/forums/showthread.php?t=60696

Hopefully you are not the same person as the originator of that thread.

written 4.2 years ago by genomax71k