Question

Recommendation to blast short <1kb sequences contained in a fasta file against a fastq file (obtained from Oxford Nanopore sequencing)?

0

5.0 years ago

shawn.3.slater • 0

Hi everyone,

I'm a novice in the world of bioinformatics and I was wondering if anyone could help me blast a short series of sequences contained in a fasta file against a fastq file obtained through Oxford Nanopore MinION sequencing? My goal is to see if the fastq file contains the gene(s) I'm looking for.

Thanks!

sequence alignment sequencing • 1.8k views

ADD COMMENT • link updated 5.0 years ago by flogin ▴ 280 • written 5.0 years ago by shawn.3.slater • 0

0

You should ideally use minimap2 (https://github.com/lh3/minimap2). How many sequences are in the respective files? How long are your fasta sequences?
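A minimal sketch of what a minimap2 run could look like here, assuming the gene sequences are used as the target and the Nanopore reads as the query. The function name and file names are hypothetical; `-x map-ont` is minimap2's preset for Oxford Nanopore reads, and the default output is PAF, where column 6 holds the target (gene) name.

```shell
# Hypothetical wrapper: map Nanopore reads ($2) against gene sequences ($1).
# Writes PAF to stdout; only reads that align produce a line.
map_reads_to_genes() {
  minimap2 -x map-ont "$1" "$2"
}

# Example usage (file names are assumptions):
#   map_reads_to_genes sequences.fasta nanopore.fastq > hits.paf
#   cut -f6 hits.paf | sort | uniq -c    # number of aligned reads per gene
```

With targets as short as 100 to 300 bp the default settings may miss some hits, so it is worth comparing the result against a BLAST search.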

ADD REPLY • link 5.0 years ago by GenoMax 141k

0

As your title suggests, you could do this with BLAST, but as GenoMax points out, minimap2 might be better, depending on the data.

ADD REPLY • link 5.0 years ago by Joe 21k

0

There are 35 sequences in the fasta file (all between 100bp and 300bp). The fastq file(s) I would like to align those sequences against contain raw Oxford Nanopore fastq sequence at around 1Mb. Thanks for the minimap2 recommendation.

ADD REPLY • link 5.0 years ago by shawn.3.slater • 0

0

Do you need to do this from reads? It may be sensible to assemble your nanopore data first, just on the off chance that one of your hits falls at the end of a read and is only partly covered. It would also cut the output down to just one or two hits, rather than several tens or hundreds of reads.

ADD REPLY • link 5.0 years ago by Joe 21k

score 0 · Answer 1 · 2019-04-03

I'm not sure whether any of these programs (blast, DIAMOND, Bowtie2...) accept a fastq file to build a database.

If I'm not mistaken, you can convert your Nanopore fastq output to fasta format and use that for your blast analysis.

You can then run the analysis like this:
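One way to do the conversion, as a minimal sketch: it assumes each fastq record takes exactly four lines (header, sequence, `+`, qualities), which is the usual layout for Nanopore output. The function name and file names are hypothetical.

```shell
# Convert a 4-line-per-record FASTQ file to FASTA.
# Line 1 of each record (the @header) becomes the >header; line 2 is the sequence.
# Lines 3 and 4 (the "+" separator and the quality string) are dropped.
fastq_to_fasta() {
  awk 'NR % 4 == 1 { print ">" substr($0, 2) }
       NR % 4 == 2 { print }' "$1"
}

# Example usage (file names are assumptions):
#   fastq_to_fasta nanopore.fastq > nanopore.fasta
```

If seqtk is installed, `seqtk seq -a nanopore.fastq > nanopore.fasta` does the same job and also copes with multi-line records.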

$ makeblastdb -in nanopore.fasta -dbtype nucl
$ blastn -db nanopore.fasta -query sequences.fasta -out sequences.fasta.blastn -outfmt 6 -evalue 0.00001 -word_size 7

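The -outfmt 6 option writes a tab-separated table that you can open in a spreadsheet, with one row per hit and columns such as percent identity, alignment length, mismatches, gap openings, the start and end of the aligned region in query and subject, e-value and bit score. The -word_size 7 option makes the search more sensitive.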
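To see which of your genes actually got hits, the table can be filtered on the command line. A minimal sketch, assuming the default -outfmt 6 column order (column 1 = query id, column 3 = percent identity, column 4 = alignment length). The function name and the thresholds in the usage line are hypothetical; raw Nanopore reads carry sequencing errors, so the identity cutoff should not be set too high.

```shell
# List the query sequences that have at least one hit passing the cutoffs.
# $1 = BLAST tabular file (-outfmt 6), $2 = min % identity, $3 = min alignment length
queries_with_hits() {
  awk -F'\t' -v min_id="$2" -v min_len="$3" \
      '$3 >= min_id && $4 >= min_len { print $1 }' "$1" | sort -u
}

# Example usage (thresholds are assumptions, adjust to your data):
#   queries_with_hits sequences.fasta.blastn 80 50
```

Any query id from sequences.fasta that is missing from this list had no hit that met both cutoffs.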

I hope that it helps you.