Question: Blast many read sequences
0
10 months ago by
zoegward60
zoegward60 wrote:

Hi, I have a FASTQ file which I think contains sequences from different organisms. Is there a way I can BLAST all of the sequences in the file to find out what these organisms are?

Thanks in advance.

modified 10 months ago by Vijay Lakhujani4.0k • written 10 months ago by zoegward60

I don't fully understand what you are trying to do because the question isn't explained clearly, but here are the steps you can follow.

Convert .fastq to .fasta

sed -n '1~4s/^@/>/p;2~4p' file.fq > file.fa

Run the BLAST command-line applications (BLAST+) on the resulting FASTA file; see the BLAST+ manual for details.
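
The two steps above could be sketched as follows. This is only a sketch: the awk conversion is a portable alternative to the `1~4` addressing, which is specific to GNU sed, and it assumes strict four-line FASTQ records. The names `fastq_to_fasta`, `file.fq`, `file.fa`, and `hits.tsv` are placeholders, and the commented blastn call assumes BLAST+ is installed and a local copy of the `nt` database is available.

```shell
# Convert four-line FASTQ records to FASTA (assumes no wrapped sequence lines).
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # header: swap leading @ for >
         NR % 4 == 2 { print }                     # sequence line
        ' "$1"
}

# Example usage (placeholder file names):
#   fastq_to_fasta file.fq > file.fa
#   blastn -query file.fa -db nt -outfmt 6 -max_target_seqs 5 -out hits.tsv
```

Selecting lines by position rather than by a leading `@` matters because quality strings can also start with `@`.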

modified 10 months ago by Vijay Lakhujani4.0k • written 10 months ago by MSM5590

I have added/removed tags to keep the post relevant

written 10 months ago by Vijay Lakhujani4.0k
3
10 months ago by
Vijay Lakhujani4.0k
India
Vijay Lakhujani4.0k wrote:

Blasting raw reads sounds like a bad idea given how short the reads are and how many of them there are. Maybe you can randomly sample a few thousand reads (seqkit?) and try it on those. However, I would suggest using fastq-screen to map the reads against the genomes of the organisms that you suspect are present in your raw data.
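
Drawing a random subset before BLASTing might look like the sketch below. It uses standard tools rather than seqkit: each four-line record is joined onto one line, `shuf` picks records at random, and the lines are split again. It assumes GNU coreutils `shuf`, strict four-line FASTQ records, and no tab characters in the read headers; `sample_fastq` and the file names are placeholders.

```shell
# Randomly pick N reads from a four-line-per-record FASTQ file.
# Usage: sample_fastq reads.fq 5000 > subset.fq
sample_fastq() {
    paste - - - - < "$1" |   # join each 4-line record into one tab-separated line
        shuf -n "$2" |       # choose N records at random (needs GNU shuf)
        tr '\t' '\n'         # restore the original four lines per record
}
```

If seqkit is installed, `seqkit sample -n 5000 reads.fq` should do a similar job.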


+1 for "fastq-screen"

written 10 months ago by mbk0asis410
3
10 months ago by
shenwei3564.5k
China
shenwei3564.5k wrote:

You need taxonomic profiling software, like Kraken or Kaiju. BLAST is the slowest option for this kind of task.
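
As a rough illustration, classifying the reads with Kraken 2 (the successor to the original Kraken) might look like this. It assumes `kraken2` is installed and a database has already been built or downloaded; `classify_reads`, the database path, the thread count, and the output file names are placeholders.

```shell
# Classify reads against a prebuilt Kraken 2 database.
# Usage: classify_reads /path/to/kraken2_db reads.fq sample1
classify_reads() {
    # Stop early with a clear message if kraken2 is not on PATH.
    command -v kraken2 >/dev/null 2>&1 || {
        echo "kraken2 not found on PATH" >&2
        return 127
    }
    kraken2 --db "$1" \
            --threads 4 \
            --report "$3.report.txt" \
            --output "$3.kraken.txt" \
            "$2"
}
```

The report file summarises how many reads were assigned to each taxon, which is the information the original question asks for.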


+1 for that. That'll surely help the OP.

written 10 months ago by Vijay Lakhujani4.0k