Question: Blast many read sequences
0
6 months ago, zoegward wrote:

Hi, I have a FASTQ file which I think contains sequences from different organisms. Is there a way I can BLAST all of the sequences in the file to find out what these organisms are?

Thanks in advance.

modified 6 months ago by Vijay Lakhujani • written 6 months ago by zoegward

I don't fully understand what you are trying to do because the question is not explained clearly, but here are the steps you can follow.

Convert .fastq to .fasta (the 1~4 step address requires GNU sed):

sed -n '1~4s/^@/>/p;2~4p' file.fq > file.fa

Run the BLAST command-line application (BLAST+) on the FASTA file

See the BLAST Command Line Applications User Manual.
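
To make the two steps above concrete, here is a minimal sketch. The awk function is a portable alternative to the sed one-liner and assumes strict 4-line FASTQ records. The blastn call is only an illustration: the remote nt database, the e-value cutoff, and the file names are assumptions, not something the original poster specified.

```shell
#!/usr/bin/env bash
# Convert FASTQ to FASTA with awk (an alternative where GNU sed is unavailable).
# Assumes strict 4-line FASTQ records (no wrapped sequence lines).
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # header: swap the leading @ for >
         NR % 4 == 2 { print }' "$1"
}

# Hypothetical BLAST+ search; database, cutoffs, and output name are assumptions.
# -outfmt 6 writes one tab-separated line per hit.
blast_reads() {
    blastn -query "$1" -db nt -remote \
           -outfmt 6 -max_target_seqs 5 -evalue 1e-5 -out "$2"
}

# Usage:
#   fastq_to_fasta file.fq > file.fa
#   blast_reads file.fa hits.tsv
```

Remote searches are slow and rate-limited, so this is only realistic for a small number of sequences.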

modified 6 months ago by Vijay Lakhujani • written 6 months ago by MSM5580

I have added/removed tags to keep the post relevant

written 6 months ago by Vijay Lakhujani
3
6 months ago, Vijay Lakhujani (India) wrote:

BLASTing raw reads sounds like a bad idea, given how short they are and how many there are. Maybe you can randomly sample a few thousand reads (with seqkit?) and try it on those. However, I would suggest using fastq-screen to map the reads to the genomes of the organisms that you suspect are present in your raw data.
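
As a rough illustration of the subsampling idea, the sketch below draws a random subset of reads using only coreutils. It assumes 4-line FASTQ records with no tab characters in the headers. The seqkit and fastq_screen commands in the comments are the tools mentioned above; the read count, file names, and configuration file are assumptions.

```shell
# Draw a random subset of N reads from a 4-line-per-record FASTQ file.
# paste joins each record onto one tab-separated line so shuf samples whole records.
subsample_fastq() {
    local fastq=$1 n=$2
    paste - - - - < "$fastq" | shuf -n "$n" | tr '\t' '\n'
}

# Usage:
#   subsample_fastq reads.fq 2000 > subset.fq
#
# With seqkit installed, the equivalent would be roughly:
#   seqkit sample -n 2000 reads.fq > subset.fq
#
# fastq_screen could then be run with a user-prepared configuration, e.g.:
#   fastq_screen --conf fastq_screen.conf subset.fq
```

The subset is different on each run because no random seed is fixed here.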

written 6 months ago by Vijay Lakhujani

+1 for "fastq-screen"

written 6 months ago by mbk0asis390
3
6 months ago, shenwei356 (China) wrote:

You need taxonomic profiling software, such as Kraken or Kaiju. BLAST is the slowest option for this kind of task.
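
For a sense of what this looks like in practice, here is a sketch assuming Kraken2 and its default six-column report. The database path, thread count, and file names are assumptions. The second function only summarises a report that already exists and does not need Kraken2 installed.

```shell
# Hypothetical Kraken2 run; database path and output names are assumptions.
classify_reads() {
    kraken2 --db "$1" --threads 4 \
            --report kraken_report.txt --output kraken_output.txt "$2"
}

# List the most abundant species from a Kraken2-style report.
# Report columns (tab-separated): percent, clade reads, direct reads,
# rank code, taxid, name (indented by taxonomic depth).
top_species() {
    awk -F'\t' '$4 == "S" {
        gsub(/^ +/, "", $1); gsub(/^ +/, "", $6)   # strip padding and indentation
        print $1 "\t" $6
    }' "$1" | sort -t "$(printf '\t')" -k1,1nr | head -n "${2:-10}"
}

# Usage:
#   classify_reads /path/to/kraken_db reads.fq
#   top_species kraken_report.txt 5
```

Rows with rank code S are species-level assignments, so the output is a quick answer to "which organisms are in this file", ranked by the percentage of reads.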

written 6 months ago by shenwei356

+1 for that. That'll surely help the OP.

written 6 months ago by Vijay Lakhujani